Measuring Deeper Learning Through Cognitively Demanding Test Items
Results from the Analysis of Six National and International Exams
- On each of six nationally and internationally administered tests, what percentage of items are considered cognitively demanding?
- To what extent do these tests meet the two criteria for a high-quality assessment of students' deeper learning skills?
In 2010, the William and Flora Hewlett Foundation's Education Program established the Deeper Learning Initiative, which focuses on students' development of deeper learning skills (i.e., the mastery of core academic content, critical-thinking, problem-solving, collaboration, communication, and "learn-how-to-learn" skills). Two test consortia are developing the next generation of tests to measure students' attainment of the Common Core State Standards. These tests are expected to assess deeper learning skills to a greater extent than existing large-scale tests. Using Norman Webb's Depth of Knowledge framework and frameworks developed by the Partnership for Assessment of Readiness for College and Careers, a RAND study rated the cognitive demand of mathematics and English language arts items on six nationally and internationally administered exams: Advanced Placement, International Baccalaureate, the National Assessment of Educational Progress, the Programme for International Student Assessment, the Progress in International Reading Literacy Study, and the Trends in International Mathematics and Science Study. On average, these tests were more cognitively demanding than previously studied state achievement tests in both subjects. The test items' level of cognitive demand varied by subject and format. The six tests also varied in their percentages of cognitively demanding items, and only two tests met both criteria proposed by a panel of education researchers for high-quality measures of deeper learning. Moreover, the tests' cognitive demand levels varied with test purpose and the characteristics of the targeted students. The findings establish a benchmark for comparing how well the new generation of tests performs in assessing deeper learning.
Six Nationally and Internationally Administered Tests Had Greater Cognitive Demand Than Previously Studied State Tests
- On average, the six benchmark tests (Advanced Placement, International Baccalaureate, the National Assessment of Educational Progress, the Programme for International Student Assessment, the Progress in International Reading Literacy Study, and the Trends in International Mathematics and Science Study) were more cognitively demanding than previously studied state achievement tests in both mathematics and English language arts.
The Cognitive Demand of Test Items Varied by Subject and Item Format
- In general, the cognitive demand on the six tests was greater for English language arts than for mathematics, and it was greater for open-ended items than for multiple-choice items.
The Six Tests Varied in the Extent to Which They Assessed Students' Deeper Learning Skills
- Only two of the six tests met both criteria proposed by a panel of education researchers for high-quality measures of deeper learning skills.
- The cognitive demand level of the tests varied with test purpose and the characteristics of the targeted students.
- To understand the extent to which the new generation of assessments developed to measure students' attainment of the Common Core State Standards will actually measure deeper learning, their operational forms should be analyzed when they become available in 2015.
- Future analysis of the new generation of assessments should use, as benchmarks for comparison, tests with similar purposes and similar targeted student populations.
Table of Contents
Tests Included in This Study
Cognitive Demand Frameworks and Ratings for the Benchmark Tests
Discussion and Implications
Distributions of NAEP Items, by Grade and Specific Framework Dimension
Distributions of TIMSS Items, by Content and Cognitive Domain
Exemplary Test Items at Each DOK Level
Exemplary Test Items at Each PARCC Level, by Subject and Dimension
Distribution of Modified PARCC Ratings
Results for the Original PARCC Dimensions and Levels