What New York City's Experiment with Schoolwide Performance Bonuses Tells Us About Pay for Performance
As educators and policymakers continue to search for strategies to improve public education, one approach receiving attention is the use of financial incentives tied to performance to compensate educators. Advocates of such incentives argue that they will motivate educators to improve their practices and attract more individuals to the profession, while detractors are concerned that such strategies negatively affect morale and collegiality. In the 2007–2008 school year, the New York City Department of Education (NYCDOE) and the United Federation of Teachers (UFT) implemented the Schoolwide Performance Bonus Program (SPBP). With funding from The Fund for Public Schools and the National Center on Performance Incentives, researchers from the RAND Corporation and Vanderbilt University independently evaluated the implementation and effects of this program.
The researchers conducted this evaluation from February 2009 through March 2011, using both qualitative and quantitative data, and designed their analysis to take advantage of the program's experimental design. They found that, although the program was implemented fairly and smoothly, it did not improve student achievement or overall school performance and did not affect teachers' reported attitudes and behaviors. Given these findings, the researchers went on to examine potential explanations for the lack of effects and to identify implications for pay-for-performance policies in general.
Overview of New York City's Schoolwide Performance Bonus Program
Implemented for the first time in the 2007–2008 school year, this voluntary program provided financial rewards to educators in high-needs elementary, middle, K–8, and high schools. Each school's UFT-represented employees (including teachers, support staff, and counselors) voted on whether to participate, and participating schools were eligible to receive school-level bonus awards of up to $3,000 for each full-time UFT-represented staff member. Performance targets for awards were defined by NYCDOE's school Progress Reports, the district's main accountability tool for measuring student performance (growth on standardized tests and performance relative to other schools) and environment in all schools in the district. The program also required each participating school to establish a four-person compensation committee to determine how to distribute the bonus awards among staff members.
In 2007–2008, 427 high-needs schools were identified. About half were randomly selected to participate (the treatment schools for the study's main analyses), and the other half were not selected (the control schools). Among those selected to participate, 205 schools participated in the first year, 198 in the second, and 196 in the third and final year. In the first year, 62 percent received bonuses, for a total of more than $20 million; in the second, 84 percent of eligible schools earned a bonus, for more than $30 million in awards. In year 3, after the state raised its proficiency thresholds, only 13 percent of the schools earned bonuses, for a total of $4.2 million. The district suspended the program in January 2011.
The Program Did Not Produce the Intended Effects
The program did not improve student achievement at any grade level. The researchers found that the average mathematics and English language arts test scores of students from elementary, middle, and K–8 schools invited to participate in SPBP were lower than those of students from control schools during all three years of the experiment. However, the differences were very small. They were statistically significant only for mathematics in year 3, and none were significant once the researchers controlled for testing effects from multiple years and subjects. Similarly, the researchers found no overall effects on state Regents Exam scores for high school students in the first two years (year 3 data were not available for analysis). The program's effects did not differ among schools of different sizes or according to bonus award distribution plan.
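To see why a single significant difference can disappear once many comparisons are considered together, the sketch below applies a Holm step-down adjustment to a set of p-values. It is an illustration only. The brief does not say which adjustment the researchers used, so the Holm method is an assumption, and the six p-values (two subjects by three years) are hypothetical placeholders, not results from the study.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values may not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# HYPOTHETICAL p-values for illustration only (not from the evaluation):
# math and English language arts, years 1 through 3.
labels = ["math_y1", "math_y2", "math_y3", "ela_y1", "ela_y2", "ela_y3"]
raw = [0.30, 0.22, 0.04, 0.41, 0.35, 0.18]

adjusted = holm_adjust(raw)
for label, p, p_adj in zip(labels, raw, adjusted):
    flag = "significant" if p_adj < 0.05 else "not significant"
    print(f"{label}: raw p = {p:.2f}, adjusted p = {p_adj:.2f} ({flag})")
```

With these placeholder inputs, the one test that falls below 0.05 on its own no longer does after adjustment, which mirrors the pattern the brief describes for mathematics in year 3.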
The program also did not affect school Progress Report scores. Across all years and all categories of scores for the Progress Reports (environment, performance, progress, and additional credit), the researchers found no statistically significant differences between scores of treatment and control schools. The lack of effects held true for elementary, middle, K–8, and high schools.
The program did not affect teachers' reported attitudes, perceptions, and behaviors. The researchers found no differences between the reported practices and opinions of teachers in treatment schools and those of the control group. The survey responses about instructional practices, effort, participation in professional development, mobility, and attitudes from the two groups were very similar, with no statistically significant differences. Furthermore, the vast majority of teachers who received bonuses said that the bonus did not affect their performance.
The majority of compensation committees developed nearly egalitarian award distribution plans, reflecting strong preferences among committee members that staff members share bonuses equally. Even though administrators were significantly more in favor of differentiating bonus awards, the majority of committees developed essentially equal-share distribution plans, giving most staff members an award of about $3,000 on average. Although most plans included some small amount of differentiation for a handful of individuals — typically because individuals worked at the school only part-time or only for part of the school year — compensation committees were much less likely to judge individual performance when allocating bonus shares.
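The arithmetic behind a nearly equal-share plan can be sketched as follows. This is a minimal illustration under stated assumptions: the brief says only that pools were funded at up to $3,000 per full-time UFT-represented staff member and that most committees split them almost evenly, with small adjustments for part-time or part-year staff. The `StaffMember` record, the `fte` field, and the pro-rating rule are hypothetical and do not describe any actual committee's plan.

```python
from dataclasses import dataclass

AWARD_PER_FULL_TIME = 3000  # maximum award per full-time staff member (from the brief)


@dataclass
class StaffMember:
    name: str
    fte: float  # hypothetical field: fraction of a full-time, full-year position


def bonus_pool(staff):
    """Pool size if the school earns the full award: $3,000 per full-time member."""
    full_time_count = sum(1 for s in staff if s.fte >= 1.0)
    return AWARD_PER_FULL_TIME * full_time_count


def near_equal_shares(staff):
    """Split the pool in proportion to fte (an assumed pro-rating rule)."""
    pool = bonus_pool(staff)
    total_weight = sum(s.fte for s in staff)
    if total_weight == 0:
        return {}
    return {s.name: round(pool * s.fte / total_weight, 2) for s in staff}


# Hypothetical school: ten full-time staff members and two half-time aides.
staff = [StaffMember(f"teacher_{i}", 1.0) for i in range(10)]
staff += [StaffMember("aide_1", 0.5), StaffMember("aide_2", 0.5)]

shares = near_equal_shares(staff)
print(bonus_pool(staff))    # 30000
print(shares["teacher_0"])  # 2727.27
print(shares["aide_1"])     # 1363.64
```

Because the pool is sized on full-time staff only, including part-time staff in the split pushes each full-time share slightly below $3,000 in this sketch.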
The implementation had mixed results in creating the conditions that foster success. Although teachers reported being aware of the program and generally supportive of it, more than a third did not understand key elements of the program, including targets, bonus amounts, and how committees decided on distribution plans. The vast majority of teachers suggested that they had not been informed about distribution plans at the start of the year. The majority of teachers and compensation committee members felt bonus criteria relied too heavily on test scores, indicating limited buy-in for program performance measures. Teachers also seemed to overestimate the likelihood that their schools would receive an award. And although the majority of teachers expressed a strong desire for their schools to win the award, many recipients reported that, after taxes, the amount seemed insignificant. Each of these factors could have affected the program's success.
What Explains the Lack of Positive Effects Under the Program?
The student achievement findings suggest that the program did not achieve its goal of improving student performance over its three-year duration. The researchers discuss four possible explanations for this result. First, the newness of the program could have been a factor. However, if newness were the explanation, some positive effects should have emerged by year 3, and none did.
Second, it is possible that several factors important for pay-for-performance programs (e.g., understanding of the program, buy-in for bonus criteria, perceived value of the bonus) did not take root in all participating schools, which might have weakened the motivational effects of the bonus.
Third, the theory underlying school-based pay-for-performance programs may be flawed. Motivation alone might not be sufficient. Even if the bonus here had inspired teachers to improve, they might have lacked the capacity or resources — such as school leadership, expertise, instructional materials, or time — to bring about improvement.
Finally, the lack of observed results could have been due to the low motivational value of the bonus relative to other accountability incentives that applied to all the schools. Many teachers and other staff acknowledged that other accountability pressures, such as receiving high Progress Report grades or achieving Adequate Yearly Progress targets, held as much motivational value as the possibility of receiving a financial bonus, if not more. While the bonus might have been another factor motivating SPBP staff members to work hard or change their practices, they would probably have been similarly motivated without it because of the high level of accountability pressure on all schools and their staffs.
Implications for Pay-for-Performance Policies
Overall, these results yield several implications relevant to the broader set of pay-for-performance policies that have received considerable attention in recent years:
- Conditions must foster strong motivation. This study supports existing research suggesting that there may be a set of key conditions (e.g., a reasonable timeline and a high degree of understanding, expectancy, valence, buy-in, and perceived fairness) needed to bolster the motivational effect of financial incentives. Several of these purported key system components were lacking in SPBP and were identified by some educators as limiting the ability of the program to change their behaviors.
- It is important to identify the factors that truly affect motivation. Motivation is the key to the theory of change in pay-for-performance programs. Even though teachers in this study reported that the bonus was desirable and motivating, they also reported not changing their teaching practices in response to the program. Thus, a desirable award might not be enough to change behavior. This may be particularly true in the context of high-stakes, high-profile accountability in which teachers are already responding to other motivating factors.
- Performance-based incentives may face challenges from the micropolitics of school-level implementation. This study highlighted the underlying political tensions inherent in a bonus system. Although many major program elements were implemented smoothly across participating schools, some schools had difficulty deciding how to distribute bonuses among staff, and some unequal disbursements exacerbated political tensions within schools. Those seeking to enact similar programs should recognize that the very idea of differentiating pay based on performance might challenge deeply ingrained norms of collaboration and egalitarianism.
- Pilot testing and evaluation are essential. From the outset, NYCDOE and teachers' union leaders planned to implement the program on a pilot basis. Implementing the program on a small scale and including random treatment and control groups for three years provided valuable information to inform future decisions about an untested policy innovation. Those considering similar programs should plan for pilot testing and evaluation of the theory and assumptions underlying any new pay-for-performance program.
This research brief describes work done for RAND Education documented in A Big Apple for Educators: New York City's Experiment with Schoolwide Performance Bonuses: Final Evaluation Report, by Julie A. Marsh, Matthew G. Springer, Daniel F. McCaffrey, Kun Yuan, Scott Epstein, Julia Koppich, Nidhi Kalra, Catherine DiMartino, and Art (Xiao) Peng, MG-1114-FPS, 2011, 312 pp., ISBN: 978-0-8330-5251-3.
This research brief was written by Jennifer Li.
This product is part of the RAND Corporation research brief series. RAND research briefs present policy-oriented summaries of individual published, peer-reviewed documents or of a body of published work.
The RAND Corporation is a nonprofit research organization providing objective analysis and effective solutions that address the challenges facing the public and private sectors around the world. RAND’s publications do not necessarily reflect the opinions of its research clients and sponsors.
Copyright © 2011 RAND Corporation