Media Materials
Evaluating the Content and Quality of Next-Generation Assessments
Thomas B. Fordham Institute
The material on this password-protected site is embargoed, and not for use until Thursday, Feb. 11, 2016 at 12:01 AM.
February 11, 2016
Evaluating the Content and Quality of Next-Generation Assessments is a groundbreaking new study that examines secure, previously unreleased items from three multi-state tests (ACT Aspire, PARCC, and Smarter Balanced) and a best-in-class state assessment, Massachusetts’ state exam (MCAS). No one has ever gotten under the hood of these tests and published an objective third-party review of their content, quality, and rigor. Until now.
Over the last two years, the Thomas B. Fordham Institute, along with two rock-star principal investigators and almost forty equally stellar reviewers, used a new methodology designed to answer policymakers’ most pressing questions: Do these tests reflect strong content? Are they rigorous? What are their strengths and areas for improvement?
As our benchmark, we used the Council of Chief State School Officers’ (CCSSO) Criteria for Procuring and Evaluating High-Quality Assessments. We evaluated the summative (end-of-year) assessments in the capstone grades for elementary and middle school (grades 5 and 8). (The Human Resources Research Organization (HumRRO) evaluated high-school assessments. Find their report here.)
Here’s just a sampling of what we found.
- Overall, PARCC and Smarter Balanced assessments had the strongest matches to the CCSSO Criteria.
- ACT Aspire and MCAS both did well on the quality of their items and the depth of knowledge they assessed.
- Still, panelists found that ACT Aspire and MCAS did not adequately assess—or may not assess at all—some of the priority content reflected in the Common Core standards in both ELA/Literacy and mathematics.
The programs’ overall marks on content and depth across math and ELA appear in the full report.
Our reviewers spotted strengths and areas for improvement in all four programs:
- ACT Aspire’s combined set of ELA/Literacy tests (reading, writing, and English) requires close reading and adequately evaluates language skills. Its math items are also generally high quality and clear. On the minus side, ELA/Literacy reading items fall short in requiring students to cite specific textual information in support of a conclusion, generalization, or inference, and in requiring analysis of what they have read.
- MCAS’s ELA/Literacy tests require students to closely read high-quality texts, and both the math and ELA assessments include a good variety of item types. On the minus side, while mathematical practices (such as modeling and making mathematical arguments) are required to solve items, MCAS does not specify how they connect to the content standards.
- PARCC’s ELA/Literacy assessment includes appropriately complex texts, requires a range of cognitive demand, and includes a variety of item types. In math, the test is generally well aligned to the major work of the grade. On the minus side, PARCC would better meet the criteria by increasing its focus on essential content at grade 5.
- Smarter Balanced’s ELA/Literacy tests assess the most important skills called for by the Common Core standards, and its assessments of writing and of research and inquiry are especially strong. In math, the test is also generally well aligned to the major work of the grade. On the minus side, a greater emphasis on academic vocabulary in ELA/Literacy would further strengthen Smarter Balanced relative to the criteria.
In addition, we found big differences in the types of items the programs use on their tests. A figure in the full report sums this up for ELA/Literacy (the report also includes the corresponding math figure).
As mentioned, HumRRO reviewed high-school exams. View the combined results here.
We’re glad to be the bearers of good news for a change. All four tests we evaluated boasted items of high technical quality, and the next-generation assessments that were developed with the Common Core in mind have largely delivered on their promises. Yes, they have improvements to make, but they tend to reflect the content deemed essential in the Common Core standards and demand much from students cognitively. They are, in fact, the kind of tests that many teachers have asked state officials to build for years.
Now they have them.
Additional Resources for Reporters:
- Key Findings and Charts—download a presentation here summarizing the key findings, with references to the corresponding pages in the text.
- What do test items look like? All four tests—PARCC, Smarter Balanced, ACT Aspire, and MCAS—have made some test items public (note that the Fordham panels reviewed unreleased, operational items, and therefore cannot share or comment on specific test items).
- Sample test items: The following links connect to actual problems and answer sheets used by each of the testing companies. Reporters may wish to use some of these items as public examples of questions from each test. Please note that the questions in the links below were released earlier by the assessment companies and were not the actual items the Fordham Institute and its reviewers examined for this report.
- Which test is your state using?
State Use of Next-Generation Assessments, 2016-17 (Source: Education First Consulting, LLC)

| Assessment | States |
| --- | --- |
| PARCC (9 states) | CO, DC, IL, LA, MD, NJ, NM, NY, RI |
| Smarter Balanced (17 states) | CA, CT, DE, HI, ID, IA, MI, MT, NV, NH, NC, ND, OR, SD, VT, WA, WV |
| ACT Aspire (3 states) | AL, AR, SC |
| MCAS (1 state) | MA |
| Used PARCC or Smarter Balanced but dropped it in the last two years (7 states) | AR, ME, MS, MO, OH, WI, WY |
| Never used these assessments (15 states) | AK, AZ, FL, GA, IN, KS, KY, MN, NE, OK, PA, TN, TX, UT, VA |
Download the full report
To speak with a researcher about the report, contact Alyssa Schwenk at 202-223-5452.