(QRISs) have advanced and matured, a number of states and localities have undertaken evaluations to validate the systems. Such efforts stem from the desire to ensure that the system is designed and operating in the ways envisioned when the system was established. Given that a central component in a QRIS is the rating system, a key concern is whether the rating process, including the use of particular measures and the manner in which they are combined and cut scores are applied, produces accurate and understandable ratings that capture meaningful differences in program quality across rating levels.
The aim of this paper is to review the set of studies that seek to validate QRIS rating systems in one of several ways: by examining the relationship between program ratings and objective measures of program quality; by determining whether program ratings increase over time; and by estimating the relationship between program ratings and child developmental outcomes. Specifically, we review 14 validation studies that address one or more of these three questions. Together, these 14 studies cover 12 QRISs in 11 states or substate areas: Colorado, Florida (two counties), Indiana, Maine, Minnesota, Missouri, North Carolina, Oklahoma, Pennsylvania, Tennessee, and Virginia. In reviewing the literature, we are interested in the methods and measures the studies employ, as well as their empirical results.
To date, most validation studies have found that programs with higher ratings had higher environment rating scores (ERSs), but the ERS is often itself one of the rating elements, so the two are not independent. Independent measures of quality have not always shown the expected positive relationship with ratings. The handful of studies that have examined how ratings change over time have generally shown that programs participating in the QRIS did improve their quality or quality ratings. Studies that examine the relationship between QRIS ratings and child development are the most challenging to implement and can be costly to conduct when independent child assessments are performed. Consequently, methods have varied considerably across these studies. Among the four studies with stronger designs, two found the expected relationship between QRIS ratings and child developmental gains. The lack of robust findings across these studies indicates that QRISs, as currently configured, do not necessarily capture differences in program quality that are predictive of gains in key developmental domains.
Based on these findings, the paper discusses the opportunities for future QRIS validation studies, including those conducted as part of the Race to the Top – Early Learning Challenge grants, to advance the methods used and to contribute not only to improving the QRIS in any given state but also to the knowledge base about effective systems more generally.