A group of parents across the country are rejecting the idea of high-stakes testing as a means of judging both schools and educators, choosing to have their children opt out of standardized high-stakes tests like the PSSAs that public school children in Pennsylvania are taking right now.
Under the No Child Left Behind Act, these tests became a means for states to assess educational performance. The opt-out movement argues that such high-stakes testing can narrow curriculum, saddle both teachers and students with excessive stress and threaten the jobs of educators.
While the opt-out movement's concerns are valid and have sparked an important conversation, the use of high-stakes tests is a complex issue. A decade of research on testing and accountability conducted at RAND indicates that these tests can have both positive and negative consequences depending on how policymakers, school districts and teachers use and implement them.
Our research has shown that NCLB's heavy focus on mathematics and reading has led to reduced emphasis on some of the many other aspects of learning that public schools are expected to provide. In several states, including Pennsylvania, we have found evidence to support concerns about the narrowing of curriculum: Testing in a small number of subjects has often led to less time devoted to non-tested subjects and to non-tested activities within the tested subjects, though not all the time or in every school.
Teachers may, for example, assign fewer extended essays and more multiple-choice questions in a reading class to better prepare their students for standardized tests. However, these changes are not observed in every school and classroom, and it is rare to see entire subjects eliminated. And while there is also some evidence that gains in test scores may be inflated because students have learned how to take tests or have mastered a narrow range of content, there has been no evidence that learning has been reduced in the subjects tested.
At the same time, there is evidence that NCLB-mandated tests have prompted schools and communities to increase efforts to meet the educational needs of students in traditionally underserved subgroups — such as those from certain racial and ethnic groups, special education students, students from lower-income families and English-language learners — thanks to the requirement that subgroup scores be reported. Numerous teachers in our research have noted that these accountability policies brought increased attention to serving the needs of all students.
NCLB-mandated tests do not necessarily affect individual students, since the results are not required to be linked to grades, promotion decisions or placement into programs. But given the high-stakes nature of these tests for schools and teachers, it is understandable that the stress felt by school staff could trickle down to students. Students should be given the message that the tests are hardly the sole measure of their performance or the performance of their teachers.
Research also shows that even in schools that have failed to meet their targets for many years, staff members typically do not lose their jobs. The recent school closures that have occurred in Pittsburgh and other cities most often result from declining populations and the impact those declines have on resources.
Test results often are used as one factor to make decisions on which schools to close, but they are not the sole factor. Other actions, such as converting a school to a charter school or replacing staff, are quite rare; instead, most low-performing schools take other approaches, such as adopting new curricula or bringing in supports such as coaches for teachers. In fact, this type of assistance for underperforming schools is mandated under NCLB.
All of this may change as tests begin to be included in the evaluations of individual teachers and principals, but the goal of many of the new evaluation systems is to use test scores as only one of several sources of information to help educators make decisions about professional development, curriculum and instruction, rather than to use them in a punitive way.
Testing alone does not improve instruction, but neither does it necessarily lead to the kind of narrow, test-focused instructional environment that many critics fear. It is clear that while no test can adequately measure everything that a student has learned, multiple-choice tests are especially limiting. If we want testing to exert beneficial effects on teaching and learning, we need to advocate for higher-quality tests and for evaluation and accountability systems that use multiple measures and do not rely exclusively on test scores.
Laura Hamilton is a senior behavioral scientist and Gabriella C. Gonzalez is a social scientist at the nonprofit, nonpartisan RAND Corp. Both are based in RAND's Pittsburgh office.
This commentary originally appeared in the Pittsburgh Post-Gazette on April 21, 2013. Commentary gives RAND researchers a platform to convey insights based on their professional expertise and often on their peer-reviewed research and analysis.