Testing, accountability, NAEP, and reading

Chester E. Finn, Jr.

11.23.2020

For those of us who still believe that results-based school accountability is an essential part of the education renewal that America sorely needs, not many things are looking great this week.

Before turning to recent developments that alarm me, however, and mindful that it’s self-promotion (as well as sound policy advice, of course!), let me note that the Hoover Education Success Initiative (HESI) has just released my own longish paper on the past, present, and future of school accountability, alongside a trio of excellent research papers on this topic by Tom Dee, by Paul Manna and Arnie Shober, and by David Steiner and Alanna Bjorklund-Young.

In this paper, I look beyond ESSA to the next phase of school accountability—which should be pegged more to “readiness” than to “proficiency,” and should deploy both interventions and the school-choice marketplace as consequences for failing schools. Along the way, it suggests how states can make the most of ESSA, reviews the research evidence on results-based accountability as an effective reform strategy, and recounts the saga of how we got to where we are and today’s several challenges to this form of accountability.

Which brings me to the worrisome parts. I suspect that, in coming weeks, we’ll learn that the National Assessment of Educational Progress (NAEP) won’t be able to perform its ESSA-mandated biennial assessment of reading and math in 2021 due to the difficulty of testing a proper sample of U.S. students, not enough of whom will actually be in school during the testing window. I suspect they’ll try to reschedule for 2022, which would be a three-year interval since the last round.

Under normal circumstances, I’d say fine, three years (or even four) is generally enough to track slow-moving changes in kids’ reading and math prowess. Yet circumstances are far from normal. With all the Covid-induced shutdowns—some caused by actual disease spikes, some (as in New York City) by teacher unions—evidence is mounting fast that millions of kids have already experienced serious learning losses. We urgently need systematic NAEP-style data on achievement and achievement gaps, and we’ll need such data for years to come. But in addition to the data hole that postponing NAEP in 2021 will create, the federal example will encourage more states to suspend their own tests for another year and seek waivers from ESSA’s testing requirements. Education Week is already reporting much state-level pushback against standardized testing. Although Secretary DeVos has so far declined to grant a second year of such waivers, it’ll be harder for her to maintain that position through January 20 if she’s also asking Congress to defer NAEP. And even if she stands firm, President Biden’s education secretary will be pressed hard by the unions and (I assume) by many state and district leaders to waive the ESSA assessments this spring when he or she takes office.

That will leave a two-year data hole, which means no reliable external information on student learning and school performance since spring 2019 and—if nothing else changes—no more to be had until spring 2022. That’ll wreak havoc on value-added calculations and thus on school accountability, not to mention the millions of kids and teachers and local and state leaders who will be stumbling around without solid information on who has and hasn’t learned what. (Smart states might move their canceled spring 2021 tests to fall 2021 so that they can at least calculate achievement and growth during the 2021–22 school year.)

Can you handle more? Then join me in fretting once again about pending changes in the NAEP reading framework. The Assessment’s governing board (NAGB) met last week to review possible revisions in the draft framework that I (and many others) criticized a few months back. No final decision is to be made until March, but the revisions described to the board last week don’t solve the big problems that beset that draft. In some ways, they mask those problems.

Recapping the background: At the heart of every NAEP assessment is a lengthy yet little known “framework” document that sets forth in detail what’s to be assessed and how that’s to be done. Here’s how the National Center for Education Statistics (NCES) describes them:

Frameworks define the subject-specific content and thinking skills needed by students to deal with the complex issues they encounter in and out of the classroom. The NAEP frameworks are devised through a development process that ensures they meet current educational requirements. Assessments must be flexible and mirror changes in educational objectives and curricula. Therefore, the frameworks must be both forward-looking and responsive, balancing current teaching practices with research findings.

A framework is hefty—the current one for reading runs to seventy-three pages—and is always the product of much heavy lifting over several years by multiple committees, contractors, and reviewers prior to adoption by NAGB.

Frameworks need periodic updating as curricular emphases, pedagogical practices, and state standards evolve. But changing a NAEP assessment is harder than moving a cemetery. It takes years, costs lots, and requires endless palaver among people with divergent views of the subject. Remember, it’s a national assessment, yet (in some subjects) results are reported for every state and nearly thirty big districts, as well as both public and private schools. Since the same assessment will be taken by school kids in Oregon and Texas, in Cleveland and Miami-Dade, in Vermont and Wyoming, it’s no easy matter to reach agreement on what to test.

Moreover, changing a framework risks (in NAEP parlance) “breaking the trendline.” Inasmuch as NAEP is America’s most valued source of information about changes over time in student achievement, losing the trendline is tantamount to starting over. Yet if the new framework and the tests based upon it differ in big ways from their predecessors, that’s what usually happens. It’s akin to what happens every time the College Board “re-centers” the SAT or replaces an Advanced Placement framework. They try hard to deploy fancy psychometric techniques to equate the scores and “bridge” the trendline, but that is not always possible, not always credible, and even if it works OK during the first round of testing under the new framework, the equating doesn’t always endure.

For all these reasons, NAGB doesn’t often replace frameworks. But it’s well into a humongous effort to do just that in the most core of all core subjects, namely reading. The current reading framework dates to 2009, and the replacement effort aims to have a new one in place in time to guide the assessment in 2025 and thereafter.

That effort has been through one full draft (a whopping 149 pages), extensive public comment (mixed reviews there), and reconsideration of some of the most contentious elements by the NAGB committee with direct responsibility. The full board meeting the other day devoted ninety minutes to feedback from other members (though committee members and consultants spent much of that time explaining themselves and praising their own handiwork).

Much of the feedback was cautionary, to put it mildly, though it’s far from clear how the committee—and its myriad outside advisors—will accommodate that in the revised version that they’ll ask NAGB to sign off on in the spring. (Further delay would likely mean the new assessment can’t be ready in time for 2025—if indeed that remains a “NAEP year” after the 2021 disruption.)

Here are some of the main points of contention:

There’s a very high probability that the proposed new assessment would indeed break the NAEP reading trendline. The committee tried to finesse this problem at the board meeting by declaring that most of the old test items could be reused, but NCES’s NAEP major domo responded that an enormous number of new test items would have to be added. That’s a recipe for starting anew. Yet this is no time to do that, certainly not with Covid-caused school stoppages upon us and with ESSA just five years old and unlikely to be reauthorized for a number of years. The current reading trendline goes back to 1998, which means it spans the NCLB and ESSA eras. Going forward, it’s crucial that national, state, and TUDA performance in reading stays on an unbroken line, not least because of the devolutions to states that occurred with ESSA. How else will states know how they’re doing? How else will federal officials determine whether ESSA did kids—and which kids?—more good than NCLB?

The developers of the present draft seem to be trying to introduce into NAEP’s reading assessment every concern on the minds of contemporary practitioners. Perhaps most concerning, their approach to “leveling the playing field” by supplying various assists, clues, and cues to test-takers may actually conceal shortcomings in the reading prowess of millions of school kids and mask the failure of the schools they attend to teach them how to read well. And they’re overlooking well-established forms of “bias review” that already excise test items that rely overmuch on terms or knowledge that many students won’t fathom. (The “chat box” of the Zoomed NAGB meeting the other day made much of the possibility that the word “couscous” might appear in a reading passage. Yet under standard protocols we would expect “couscous” to bite the dust long before the reading assessment is administered—along with “semolina” and “quinoa.”)

Exotic vocabulary to the contrary notwithstanding, background knowledge is fundamental to students’ reading comprehension, yet the new framework’s architects are trying to make that reality vanish. Rather than accept decades of research about the importance of background knowledge in students’ reading prowess, the framework is trying to suppress it. This is sorely misguided. One of the solidest findings of reading research—like it or not—is the influence of background knowledge on reading comprehension. Students who know more about what they’re reading learn more from what they are reading. But the framework drafters view differences in students’ knowledge as a biasing influence on NAEP results that needs to be minimized. Thus, for example, they would have questions about reading content on the solar system that don’t give an advantage to students who know something about the solar system, questions on literary works that don’t give an advantage to students who know more of the vocabulary and settings in the literary works, and so forth. This is like trying to eliminate the effect of players’ heights when assessing the ability of basketball players. Reading ability and background knowledge are inextricable everywhere except in the minds of those who drafted this framework.

Budgetary woes. NAEP’s budget is in a parlous condition. NAGB has had to cancel scheduled non-mandated assessments because of insufficient funds. Launching a new assessment based on the draft reading framework will be very expensive. Persisting with this plan is akin to a family that is having trouble making its mortgage payments deciding to put in a new pool.

On top of all that, the new framework’s authors seem to be trying to redefine reading itself. In the real world, reading is a process by which an individual derives meaning from print. The draft framework, however, intends to reconceptualize reading as the process of understanding multimodal content as presented in digital formats. Reading print content about the history of the Civil War and answering questions for which an avatar or friend provides hints becomes reading. Watching a video about the solar system and answering questions that draw on information in that video becomes reading. And so on. Yet NAEP Reading is, by statute, a test of reading achievement, not a test of students’ abilities to integrate information from multimodal content. Relabeling the use of multimodal content as reading doesn’t make it so.

This sort of thing isn’t why Congress mandated the assessment of American students’ reading achievement via NAEP, and we must hope that somehow this gets fixed—or nixed—between now and March. Which probably means retaining—and perhaps lightly tweaking—the current reading framework for some years to come.

Of course, none of this will much matter if the United States gradually gives up on testing and results-based accountability!

Happy Thanksgiving.
