Why Don't We See Any Effects for P4P?
Daniel McCaffrey is a senior statistician at the RAND Corporation, where he holds the PNC Chair in Policy Analysis. He is a fellow of the American Statistical Association and is nationally recognized for his work on value-added modeling for estimating teacher performance. McCaffrey oversees RAND's efforts as part of the Gates Foundation's Measures of Effective Teaching study to develop and validate sophisticated metrics to assess and improve teacher performance. He is currently leading RAND's efforts on two additional studies comparing value-added measures to other measures of teaching, including classroom observations, and is a major partner in the National Center on Performance Incentives, which is conducting random control experiments to test the effects of using value-added to reward teachers with bonuses. McCaffrey received his Ph.D. in statistics from North Carolina State University.
What accounts for P4P's lack of effects on student achievement or teacher practices and attitudes?
There are three broad possible reasons: the bonus did not motivate change; the bonus motivated change, but the changes teachers made did not lead to higher student achievement; or the studies failed to identify effects that occurred.
Which seems to be the source?
We believe the potential to earn the bonus did not motivate teachers to change, for a number of reasons. Fewer than half the teachers in Round Rock, for example, reported having a clear understanding of the program, and teachers at all three sites questioned both the test-based metric for determining awards and the fairness of such a plan. That suggests that buy-in, acceptance, and perceived fairness were weak. Teachers basically reported that they saw the bonus as a "pat on the back" for the hard work they were already doing.
Why else might teachers not see the bonus as a reason to change?
The program was not part of teachers' regular evaluation or compensation system, and the bonuses were paid in October or November of the school year following the one for which performance was evaluated. Also, the awards were for outcomes that might result from teachers' actions rather than direct payments for actions teachers could definitely control.
Yes, all teachers--both those in the treatment groups and those in the control groups--already had strong incentives to raise test scores because of state and local accountability requirements. Given the strong incentives all teachers faced, there may have been no room for teachers to do more to earn a bonus. It could be that the teachers in the treatment groups were taking actions to win the bonus, but control teachers were doing the same things to save their jobs in the face of potential state takeover.
Could there be effects you couldn't see or measure?
It's possible. The program durations may have been too short for effects to take hold. Although we didn't generally see effects increasing over the course of the programs, the true beneficial effect of changing teacher compensation might be through changes in the composition of the teacher workforce over time. It might be hard to identify or create such changes with a two- or three-year, limited-duration pilot program.
So, where do we go from here?
The programs RAND evaluated involved only paying teachers bonuses based on their students' achievement. The programs didn't integrate the performance measurement into teachers' evaluations or provide any specific training or other resources to help teachers meet their performance targets. Rigorous evaluations of more broadly defined compensation reforms could find that such reforms are effective. The revised teacher evaluation systems that states are now implementing as part of, or in response to, Race to the Top could provide a rich test bed to study such interventions.
Are there alternative methods of P4P that might be beneficial?
Research on performance incentives in other contexts finds that directly rewarding people for completing beneficial tasks tends to lead to better outcomes than paying for the outcomes themselves. For example, paying students to read books led to higher achievement scores, whereas paying for higher achievement scores did not. School systems might pilot this approach to performance pay. But if they do, they should include rigorous evaluations; as RAND research has shown, such evaluations are feasible and can be extremely valuable in identifying reforms that do or do not work.