Jun 6, 2016
In response to research showing the critical role that teachers play in student learning and the inadequate job that districts have historically done judging teachers' effectiveness, the Bill & Melinda Gates Foundation launched the Intensive Partnerships for Effective Teaching initiative. The initiative involves three school districts (Hillsborough County Public Schools [HCPS] in Florida, Memphis City Schools [MCS] in Tennessee, and Pittsburgh Public Schools [PPS] in Pennsylvania) and four charter management organizations (CMOs) based in California (Aspire Public Schools, Alliance College-Ready Public Schools, Green Dot Public Schools, and Partnerships to Uplift Communities Schools). These sites have worked over a multiyear period to align teacher evaluation, staffing, professional development, compensation, and career-ladder policies to boost teaching effectiveness and increase low-income minority (LIM) students' access to effective teaching. The initiative's goal is dramatic gains in student achievement, graduation rates, and college-going, especially for LIM students. At the core of these changes is each site's adoption of a definition of effective teaching and development of a rigorous measure of effectiveness that combined classroom observation, gains in student achievement, and other factors to rate every teacher. Each site used its vision of effective teaching and the new evaluation metrics to improve its management of its teacher workforce, including hiring, placement, professional development and support, compensation, retention, and career advancement.
Two key components of the reform were the development of teacher-evaluation systems and the adaptation of personnel policies.
Each site developed a vision of effective teaching, shared by site stakeholders and embodied in classroom-observation rubrics, and devised a new measure of effectiveness that included classroom observations, as well as a teacher's effect on student achievement and, in some cases, input from students. Although stakeholders took up to two years to agree on a vision and revise their evaluation systems, every site adopted the core elements, signaling a change in the conversation about effective teaching.
At the outset of the initiative, each site made a concerted effort to involve teachers in the design of the new evaluation system, keep them informed about new procedures, and address their concerns. These efforts achieved mixed results in generating teacher support. For example, large majorities of teachers in all seven sites thought that classroom observations were a valid way to measure effectiveness, and most teachers indicated that they understood the observers' rating criteria, that their observers were qualified, and that they received useful feedback from the observations. There was less support, however, for the student achievement and student feedback components of the overall effectiveness measure.
All sites adapted their staffing, professional development, compensation, and career-ladder policies to reflect their new views of effective teaching; however, many of the revised policies were not implemented until after the evaluation components were in place. The bulk of the intended policies and procedures were in place by the 2013–2014 school year.
In addition to changes in recruitment practices aimed at hiring more-effective teachers, effectiveness became a key consideration in retention and dismissal decisions.
By the 2013–2014 school year, all sites had adopted some form of effectiveness bonus; in most sites, this entailed awarding extra compensation to teachers with the highest effectiveness ratings. Although most teachers in the three districts thought that base pay should be based on seniority, a majority of teachers in all sites thought that additional compensation should be given for outstanding teaching skills and for working in low-performing schools.
Teacher responses were mixed on the fairness of their sites' compensation systems and whether those systems motivated them to improve their teaching, with CMO teachers being more likely than district teachers to respond positively to these reforms.
Sites emphasized using evaluations to inform professional growth, particularly the information gathered through the observations. However, they found it difficult to use the evaluation information to customize professional development and support for teachers. One challenge was that existing professional-development offerings did not always align with the dimensions of practice that the new teacher-observation rubrics captured. Sites' strategies to help teachers improve their effectiveness included centralized professional development targeting common challenges, customized workshops, local coaching and mentoring, and collaborative communities of practice.
Career ladders have not been implemented as quickly or widely as the other policies. Nevertheless, by the spring of 2014, all of the sites had developed some form of career ladder in which effective teachers take on new roles, such as coaching or mentoring, and receive additional salary or stipends for these responsibilities. Most teachers indicated that the selection process for these specialized positions was fair and, in most sites, reported aspiring to higher or specialized positions. Moreover, in most sites where career ladders were present, teachers reported that the opportunity to take a career-ladder position motivated them to improve their instruction and increased the chances that they would remain in teaching.
Using student achievement data from 2007–2008 through 2013–2014, we constructed a value-added measure of teacher effectiveness for each teacher whose students were tested in reading or mathematics. We used this common measure to examine how teacher effectiveness was distributed between LIM and non-LIM students. In the school years prior to the Intensive Partnerships initiative, teachers with more LIM students were rated more effective, on average, than teachers with fewer LIM students. This pattern has remained fairly consistent from 2007–2008 (several years before the intervention began) through 2013–2014, although districts differ in some subjects and years.
During both the preintervention and intervention periods, effective teachers were more likely to be assigned to schools with higher proportions of LIM students than to other schools. Within schools, effective teachers were less likely to be assigned to classes with higher proportions of LIM students than to other classes. That is, the sites have been somewhat successful at placing the most-effective teachers in schools with high percentages of LIM students; however, they have been less successful in placing the most-effective teachers within each school in high-LIM classrooms.
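The idea behind a value-added measure can be illustrated with a minimal sketch. It is not the model used in this study, which the text above does not specify in detail; it assumes the simplest possible version, in which each student's current score is predicted from a prior score and a teacher's value added is the average amount by which that teacher's students beat or miss the prediction. The function names and the toy scores are hypothetical, and a real model would typically also adjust for student characteristics and pool several years of data.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def value_added(records):
    """Return {teacher_id: mean residual} for the given student records.

    records: iterable of (teacher_id, prior_score, current_score).
    The residual is the student's actual current score minus the score
    predicted from the prior score across all students.
    """
    records = list(records)
    priors = [r[1] for r in records]
    currents = [r[2] for r in records]
    intercept, slope = fit_line(priors, currents)

    residuals = defaultdict(list)
    for teacher, prior, current in records:
        predicted = intercept + slope * prior
        residuals[teacher].append(current - predicted)
    return {teacher: mean(vals) for teacher, vals in residuals.items()}


# Made-up scores, purely for illustration (not data from the study).
toy_records = [
    ("T1", 40, 52), ("T1", 60, 71), ("T1", 80, 88),
    ("T2", 40, 44), ("T2", 60, 63), ("T2", 80, 82),
]
scores = value_added(toy_records)
```

In this toy example, the students of teacher `T1` score above what their prior scores predict and the students of `T2` score below, so `T1` receives a positive estimate and `T2` a negative one. Comparing such estimates across teachers with many and few LIM students is the kind of distributional question the analysis above addresses.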
According to statistical comparisons of the schools in the Intensive Partnership sites with similar schools in their state, the initiative does not appear to have had much of a positive effect on student achievement or graduation rates in the three districts from 2010–2011 through 2012–2013, when most of the reform components were being phased in, or in 2013–2014, when most were in place. (Because of a lack of testing data in California for school year 2013–2014, we could not examine effects for the four CMOs.) Our estimates of effects on student performance in the lower grades are mixed but mostly nonsignificant during this whole period (with the exception of MCS, which fared significantly worse after the start of the initiative). The initiative is also associated with negative effects on high school reading achievement. However, in 2013–2014, as more of the sites' policies and practices were in place, impact estimates increased in many sites.
More specifically, in PPS, there are some indications that math and reading achievement in grades 3–8 improved more than in comparable schools in Pennsylvania in 2014, but the difference was statistically significant only for math. High schools in PPS underperformed on reading scores in relation to comparable Pennsylvania high schools. There has, however, been a significant positive effect on graduation rates in most years.
In HCPS, there is evidence that schools serving grades 3–8 made more progress in reading in 2014 than comparable schools in the state, although the estimate is not statistically significant. In grade 3–8 math, HCPS schools performed similarly in 2014 to comparable schools in other Florida districts. The results indicate that high schools in HCPS performed worse than similar schools in Florida did on reading scores and that dropout rates were higher than in other schools.
For MCS, although scores in math and reading in grades 3–8 worsened more than in similar schools in Tennessee in the initial years after the grant was awarded, they have improved since the 2012–2013 school year, as the initiative has been more fully implemented. The testing policies in Tennessee prevented us from estimating the initiative's effect on high school student achievement, but we did find evidence of lower graduation rates than in similar Tennessee high schools following program implementation. There was some difficulty forming an adequate comparison group of Tennessee schools with similar trends in achievement and demographics before the intervention, but these results were consistent using two different comparison methods.
Although the initiative appears to have had positive effects on student achievement in grades 3–8 in math or reading in HCPS and PPS in the 2013–2014 school year, as implementation became more complete, the size of these gains is modest thus far. Overall, the estimated effects appear small compared with the average expected achievement gains during a year of learning or with benchmarks from various school-level interventions (such as Comprehensive School Reform). However, the estimated effects in grade 3–8 reading in HCPS and PPS and in grade 3–8 math in PPS are about the same size as (or, in the case of math in PPS, larger than) the average effects found for charter schools in several large-scale studies and for other districtwide interventions that researchers have studied.
One interesting pattern that bears mentioning is the upward trajectory of almost all student outcomes between school years 2012–2013 and 2013–2014. In all three districts, impact estimates in most subject-grade combinations are greater for 2013–2014 than for 2012–2013. Although only the math impact estimate in grades 3–8 in PPS is positive and statistically significant for 2013–2014, the recent upward trajectory in most of the achievement impact estimates suggests that the reforms might be on the way to having a positive effect after a few transition years with no effect or a negative effect. Delayed positive effects would not be surprising, given the long timeline for implementation of many of the initiative components.
In 2017, we will release a final report that will include two more years of student outcomes. We will also explore the individual policy levers and mechanisms through which they might have had an effect, and we will provide estimates of the start-up costs to implement the initiative and the ongoing costs to operate it. Stay tuned.
The Intensive Partnerships for Effective Teaching