Editor’s note: This is the third in a series of blog posts that will take a closer look at the findings and implications of Evaluating the Content and Quality of Next Generation Assessments, Fordham’s new first-of-its-kind report. The first two posts can be read here and here.
The ELA/literacy panels were led by Charles Perfetti (distinguished professor of psychology and director and senior scientist at the University of Pittsburgh’s Learning Research and Development Center) and Lynne Olmos (a seventh-, eighth-, and ninth-grade teacher from the Mossyrock School District in Washington State). The math panels were led by Roger Howe (professor of mathematics at Yale University) and Melisa Howey (a K–6 math coordinator in East Hartford, Connecticut).
Here’s what they had to say about the study.
***
Which of the findings or takeaways do you think will be most useful to states and policy makers?
CP: The big news is that better assessments for reading and language arts are here, and we can expect further improvements. What’s important for states is that, whatever they decide about adoption of the Common Core State Standards, they will have access to better assessments that will be consistent with their goals of improving reading and language arts education.
LO: I hope that all stakeholders view this study as a tool for understanding what needs to happen next to ensure that all students are tested fairly with high-quality assessments. That means the assessments they choose or develop should be aligned closely to high-quality standards, and they should assess the students at appropriate depths of knowledge. We should never waste our valuable time in schools giving tests that are not meaningful and closely tied to the skills we practice in the classroom.
RH: The new tests really do present an improvement, in that some of the questions they ask are more probing than what was previously asked. It is not that the questions are technically hard or tricky, but they ask students to put together several pieces of reasoning to make a conclusion. This is more “real-world” than one-step items.
MH: Many findings and takeaways came out of this process. In particular, the findings about the major work of each grade level and the cognitive demand of the questions were very evident across the different assessments. We were able to see the similarities and differences among the four assessments and the benefits and drawbacks of administering each. Another takeaway was that there is no perfect assessment out there to evaluate our knowledge of the standards. But it was definitely evident that the standards should not be taught in isolation; they should instead be taught so that students can make connections between the standards and gain a deeper understanding through application.
Did anything surprise you as you were reviewing these tests?
CP: The quality of the assessments was generally high, but this did not surprise me. I did find interesting some of the solutions test developers had to the problem of creating items for deeper understanding. And it was interesting to see the variability in the use of technology in the tests. I think we can expect significant, rapid gains in the appropriate use of technology.
LO: Having worked diligently to align my classroom practice to the Common Core State Standards, I was surprised that important shifts presented by these standards seemed to be underrepresented in some assessments. For instance, in an English language arts classroom, we emphasize writing about what we read using evidence from the text. Although this skill was sometimes represented in an assessment, it was underrepresented overall. On the other hand, as a skeptic when it comes to assessment, I was also pleasantly surprised by the overall quality of the tests.
MH: One big surprise was how difficult it was to categorize items as conceptual, procedural, or application-based. It might be helpful to have clearer guidance so that reviewers can agree on the categorization of items in the future.
RH: A pleasant surprise was the thoughtfulness of some of the extended response questions. A not-so-pleasant surprise was the extent of editorial issues, including mathematical ones.
What advice would you give others undertaking a similar study?
CP: [T]he lesson I see is that finding the right mix of expertise for the panels is very important. The panels were amazing in their combination of specific expertise, reflective decisions, and wise judgments. This was crucial to our success, as was all the groundwork in developing study methods and providing resources to the panel.
LO: It is important that future studies clearly delineate between text types and qualities for English language arts assessments. Within our parameters, it was difficult to evaluate this aspect of the assessments. Defining what constitutes informational text and creating criteria for evaluating text quality will provide informative results.
RH: Getting [the methodology] developed delayed the start of the study, and during the study, we found that it was an imperfect fit for the assessments along several dimensions. We had a number of discussions about how to make the methodology function better, and I am sure the next study of this nature will benefit from those discussions.
MH: One of the biggest lessons I learned is that in any effort to evaluate assessments, it is imperative that a structured format be established up front in order to produce the most accurate information. The Fordham Institute did an amazing job of putting together a seamless process for evaluating these assessments, one that gave us time to collaborate and focus on the review without getting distracted from the task at hand.
***
The Fordham Institute wishes to thank all the reviewers who lent their time, talents, and expertise to the project. Read more here.