The National Assessment Governing Board is in the middle of an enormous effort to revamp its framework for assessing reading, a central element of the National Assessment of Educational Progress. Frameworks set forth what is to be assessed and how that’s to be done. Changing them is harder than moving a cemetery, requiring years of lead time, costing much money, and entailing endless palaver among people with divergent views of the subject. Unfortunately, in the proposed set of revisions, the bad outweighs the good by a considerable margin.
At the heart of the National Assessment of Educational Progress (NAEP) are lengthy yet little known documents called “frameworks” that, for every subject NAEP touches, set forth what is to be assessed and how that’s to be done. Here’s how the National Center for Education Statistics (NCES) describes them:
Frameworks define the subject-specific content and thinking skills needed by students to deal with the complex issues they encounter in and out of the classroom. The NAEP frameworks are devised through a development process that ensures they meet current educational requirements. Assessments must be flexible and mirror changes in educational objectives and curricula. Therefore, the frameworks must be both forward-looking and responsive, balancing current teaching practices with research findings.
These are bulky documents—the current math framework runs to seventy-five pages, the U.S. history framework to sixty-five—and they’re always the product of much heavy lifting over several years by multiple committees, contractors, and reviewers before final adoption by the National Assessment Governing Board (NAGB).
NAEP is now fifty years old, and its subject frameworks periodically need revision as curricular emphases, pedagogical practices, and state standards evolve. But changing a NAEP assessment is harder than moving a cemetery. It takes years of lead time, costs lots of money, and requires endless palaver among people with divergent views of the subject. Remember, it’s a national assessment, yet (in some subjects) results are reported for every state and nearly thirty big districts, as well as both public and private schools. Since the same assessment will be taken by school kids in Oregon and Texas, in Cleveland and Miami-Dade, in Vermont and Wyoming, it’s no easy matter to reach agreement on what to test.
Moreover, changing a framework risks (in NAEP parlance) “breaking the trend line.” Inasmuch as NAEP is America’s most valued source of information about changes over time in student achievement, losing the trend line is tantamount to starting over. Yet if the new framework and the tests based upon it differ in big ways from their predecessors, that’s what usually happens. It’s akin to what happens every time the College Board “re-centers” the SAT or replaces an AP framework. They try hard to deploy fancy psychometric techniques to equate the scores and “bridge” the trend line, but that is not always possible and not always credible. (Think about it. If you test addition on three consecutive Fridays and subtraction on the next three Fridays, how do you determine whether a kid is better or worse at math in week five than in week two? You really can’t, not when what the tests are testing is so different.)
For all these reasons, NAGB doesn’t often replace its frameworks. But it’s currently in the middle of a humongous effort to do just that in the most core of all core subjects, namely reading. The reading framework dates to 2009, and the replacement effort aims to have a new one in place in time to guide the assessment in 2025 and thereafter. The replacement process commenced last year when NAGB charged newly constituted “Visioning and Development Panels” with recommending changes that would “maximize the value of NAEP to the nation” while also taking advantage of “the affordances [sic] of digital based assessment.”
We’re now at the stage where NAGB’s contractors and panels have presented a full draft of the proposed new reading framework (swollen to 149 pages), and public comment is invited through tomorrow.
Many, many comments have arrived (and more will in the coming hours), including my own gloomy assessment of what’s being proposed, which I, sorrowfully, provide here.
I’ve seen a lot of NAEP frameworks, revisions, and proposed revisions over the past several decades. This is, I believe, the first time I’ve ever seen one that leaves me feeling that the bad outweighs the good by a considerable margin.
To begin with a few key practicalities:
- The changes proposed here are so extravagantly comprehensive that I’m certain their implementation would break the NAEP reading trend line. This is no time to do that, certainly not with Covid-caused school stoppages upon us and with ESSA just five years old and unlikely to be reauthorized for a number of years. (If Congress moves on that at the same speed it moved on NCLB, a new law might be signed in 2028 and, presumably, kick in a year or two later.) The current reading trend line—the one that incorporates accommodations—goes back to 1998, which means it spans the NCLB and ESSA eras. Going forward, it’s crucial that national, state, and TUDA performance in the ultimate core subject of reading stays on an unbroken line, not least because of the devolution of so many decisions to states that occurred with ESSA. How else will states know how they’re doing? How else will federal officials determine whether ESSA did kids—and which kids—more good than NCLB?
- The changes are so extravagantly comprehensive that implementing them would cost tons more money than continuing with the present assessment framework and test design. (Just think of the expanded samples needed to yield valid data for the subgroups of subgroups that are being urged.) NAEP’s budget is stretched so tight already that vital twelfth grade state-level data on reading and math are missing and other key subjects cannot be assessed on a regular basis. When they are, it’s usually just a single grade and national-only. This is no time to burden the budget further! And because many of the changes proposed here will be hotly controversial in Congress and elsewhere, I see little prospect that they’ll lead to a budget increase. (The opposite is a lot more likely!) I’m sobered by the fact that the House Appropriations Committee recently rejected the administration’s request for an additional $28 million for NAEP and think NAGB needs to keep in mind that any changes that balloon the cost of one assessment will likely result in the bobtailing or elimination of others.
- Many of the changes in the proposed reading framework are so extravagantly comprehensive (see, for example, “Shift #8”) that either all of NAEP must be changed to align with them—every subject, every grade level, etc.—or else reading will be analyzed and reported completely differently from the rest of NAEP. That’s deeply confusing for everyone and ultimately just unacceptable.
More cosmically, the developers of the present draft yearn for NAEP to be and do something more than it’s capable of and more than Congress ever assigned it to do. NAEP is more like a thermometer than a CT scan. It does one thing pretty well, which is to record the prowess of school children in large units and groups at handling the knowledge and skills they are expected to learn in key subjects. It does not explain why they’re doing that well or poorly. It’s not an experimental design, so it cannot account for causation and it cannot erase performance differences that exist, whatever the reasons may be. The most it can do is offer various correlations.
Of course children differ in their motivation, in their background knowledge, and in the opportunities they have had. At the micro-level, I see many such differences among my three grandchildren, notwithstanding that they share just about every “sociocultural” characteristic noted by the drafters. NAEP really can’t account for those things. It can, of course, distinguish kids by their reading prowess, and it can divide the student population into the kinds of “subgroups” that are common in federal statistical programs of all kinds and specified in ESSA. (ESSA’s nine subgroups: Economically disadvantaged students, Children with disabilities, English learners, African-American, American Indian/Alaska Native, Asian, Native Hawaiian/Other Pacific Islander, Hispanic or Latino, and White.) Reading is one of the subjects (for grades four and eight only) where NAEP can also “sort” students geographically, i.e., by state and TUDA district. And of course NAEP can look at student populations at various levels of performance (whether on the vertical scale or by achievement level) to see how many of which student populations are in those performance levels.
Within rather severe constraints, NAEP can also seek correlations with school and classroom characteristics, but for this it depends on teacher and principal surveys. It also gathers whatever background information can be furnished by participating student test-takers, but that’s often shaky. And some key correlate data are growing shakier, especially the SES information, as many schools now include all their students, regardless of income, in federal nutrition programs. But NAEP gets into trouble when it gets inquisitive about children’s home circumstances—privacy considerations, touchy parents, ill-informed kids—and it’s also gotten into trouble when it has attempted to “explain” too much that bears on complex societal issues and policy debates as opposed to simply reporting.
All that’s by way of saying that one of the framework developers’ key impulses is a truly worrying overreach for NAEP. I conclude by observing—with some regret—that, on balance, American education and America’s children would be better served by retaining the present reading framework.
As state and district leaders face the challenges posed by Covid-19, safely reopening schools within the current budgets is first, second, and third on their priority list. At the same time, most students will start their schooling in the fall further behind than they have ever been, with the most disadvantaged among them very likely experiencing the greatest learning loss.
How can leaders address logistical issues of transportation, social distancing, hybrid and/or staggered learning schedules, and at-risk students and teachers, while not losing sight of the learning goals of schooling? In The Return, the Johns Hopkins Institute for Education Policy (IEP) and Chiefs for Change articulated powerful strategies to effect system-wide improvement—knowing that many will take time and collective effort to implement. What could be done rapidly to accelerate student learning, whether in face-to-face or remote learning models?
One of us, Johns Hopkins University’s Bob Slavin (designer of the well-known Evidence for ESSA) is well versed in evaluating educational interventions. Bob’s research has led him to the conclusion that tutoring is one of the most powerful interventions of all. In his work, “tutoring” refers to one-to-one or small-group instruction. Tutoring may involve one teacher or one teaching assistant working with one student, or one teacher or teaching assistant working with a very small group of students, usually two to four at a time.
In The Return, IEP and Chiefs for Change wrote about reconfiguring school staffing models to distribute instructional expertise more effectively, by (for instance) enabling at-risk teachers to remain at home, where they provide virtual instruction and/or support for students’ social and emotional well-being.
Tutoring, whether virtual or in school, could leverage such innovative models as part of a reconfigured and quite promising instructional design. Indeed, well-structured tutoring programs can produce gains in reading or math that are equivalent to about five months of learning beyond students’ ordinary progress.
In math, recent meta-analyses from Slavin and his colleagues of the best research on tutoring show very positive outcomes: Tutoring to small groups in elementary math showed an intervention effect size of 0.30—a much stronger positive impact than any other single math intervention, including traditional strategies such as professional development for teachers. The number of rigorous studies in some categories of tutoring was not large, so these findings must be interpreted with caution, but it is important to note that, while all forms of face-to-face tutoring by paid adults had quite positive impacts on achievement, the outcomes were highest for one-to-small group approaches.
Similarly, in English language arts, Slavin and his team conducted a meta-analysis of the most rigorous research on a range of interventions. And once again, tutoring was—by a substantial margin—the most effective overall. In this case, the findings for one-on-one tutoring were startling: The results from forty-six studies of one-to-one tutoring had a mean effect size of +0.41, or about five additional months of learning beyond what students ordinarily learn.
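As a rough back-of-envelope illustration (not the authors’ own calculation), translating a standardized effect size into “months of learning” requires a benchmark for how much students gain in a typical school year. Assuming an annual gain of about 0.75 standard deviations over a nine-month school year—an assumed benchmark for this sketch, since such figures vary by grade and subject—an effect of +0.41 works out to roughly five extra months:

```python
# Back-of-envelope conversion of a standardized effect size to "months of learning."
# The 0.75 SD annual-gain benchmark is an assumption for illustration only;
# actual benchmarks vary considerably by grade level and subject.

def effect_size_to_months(effect_size, annual_gain_sd=0.75, school_year_months=9):
    """Translate an effect size (in SD units) into extra months of learning."""
    return effect_size / annual_gain_sd * school_year_months

extra_months = effect_size_to_months(0.41)
print(round(extra_months, 1))  # ~4.9, i.e., about five additional months
```

The same arithmetic applied to the elementary math effect size of 0.30 yields roughly three and a half months under the same assumed benchmark.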
These gains are not restricted to more socio-economically advantaged students. Nearly all tutoring studies have been done with disadvantaged students.
More importantly, tutoring is a real-world intervention that has passed muster in the public arena. In Maryland, for example, Governor Larry Hogan very recently announced the state’s intention to invest $100 million in tutoring programs as part of the educational response to Covid-19. The funds are “for local school systems that implement tutoring and learning programs designed to help students in need.” The announcement cites the fact that “research [has] shown that the rate of learning gain can be improved with intensive tutoring.” Across the Atlantic, the United Kingdom and the Netherlands have announced similar investments of $1.24 billion and $278 million, respectively, in programs of which tutoring is an integral part.
Given the strong outcomes found in research on tutoring programs, leaders in the United States should follow the UK and the Netherlands in supporting national funding for tutoring programs. One strategy might be to introduce tutoring as a means of providing meaningful jobs to recent college graduates, who are entering the work force during a recession. But given the Covid-19 learning slide and its pernicious impact on America’s most vulnerable students, it is imperative that state and district leaders work towards standing up their own programs, in advance of any possible federal effort.
Slavin recently addressed how such an initiative in tutoring might be inaugurated:
The answer could be to build on organizations that already exist and know how to recruit, train, mentor, and manage large numbers of people. The many state-based AmeriCorps agencies would be a great place to begin, and in fact there has already been discussion in the U.S. Congress about a rapid expansion of AmeriCorps for work in health and education roles to heal the damage of Covid-19. Other national non-profit organizations such as Big Brothers Big Sisters, City Year, and Communities in Schools could each manage recruitment, training, and management of tutors in particular states and regions.
Tutoring also addresses a second critical problem in American education: our habitual practice of remediation. As one of us, David Steiner, recently affirmed, this effort—deeply demoralizing to students—just doesn’t work. Well-intentioned teachers try to “meet students where they are,” but in trying to teach them all that they missed, are unable to bring them anywhere close to grade-level work. Tutors, by contrast, can focus on the absolutely critical skills and knowledge that give students accelerated access to grade-level material. This concentrated effort will be compounded where school districts have adopted high-quality instructional material. Such curricula engage students with rigorous materials that are on, or sometimes even slightly above, the level of traditional grade-level content.
Tutoring also helps with an unfortunate but ubiquitous problem: Teachers who are given high-quality instructional materials are often tempted to water their content down or disregard it altogether. For example, 40 percent of teachers whose schools have adopted Eureka Math—a highly rated math curriculum—water it down whenever they believe it to be too challenging for their students. Tutoring offers a counterweight, in which tutors can reassure teachers and equip students with the scaffolding and entry skills they need to manage their classwork. In this way, tutoring could be effectively integrated with the school’s math and ELA curricula to achieve still greater leverage on student learning.
There are no “silver bullets” at the ready to close achievement gaps and substantially raise the performance of U.S. K–12 students. School systems are complex and imperfect. In addition to having to cope with vastly unequal socioeconomic background conditions, American educators work in a fragmented system in which key elements have not been designed to work together. In the face of these realities, which of course pre-date and will post-date Covid-19, the health crisis adds a potentially devastating blow to the already low learning gains of so many children.
Of the single interventions that could be instituted at relatively modest cost and with quite rapid speed, tutoring stands out: The research base for its effectiveness is unusually consistent and strong, the practice is internationally endorsed, and there are many college graduates who will soon look for meaningful employment. The crisis of Covid-19 should and must find us attentive to the physical and mental health of our students—but let’s not forget that academic learning is still the core responsibility of all educators. Tutoring could help mitigate what would otherwise be yet another devastating outcome of our current crisis.
With Covid-19 cases on the rise and state budgets in crisis, federal lawmakers seem poised to pass another round of stimulus. It appears that K–12 education will receive a decent portion of the emergency aid, likely exceeding the $13.5 billion-plus provided to U.S. schools in the last package. A recent Washington Post article reports that federal lawmakers are mulling the possibility of tying K–12 funding to school reopenings—perhaps unsurprising given the president’s recent comments urging schools to reopen for in-person learning this fall.
No one can be sure whether the feds will come through with additional funds (and if so when), or whether they’d condition funding on reopening. But even if Congress doesn’t require it, state policymakers should—provided it can be done within federal rules—work to ensure that any forthcoming federal relief be used to help schools safely reopen. Here are three reasons why.
1. It’ll cost more to safely operate brick-and-mortar schools. The stringent health and safety measures that schools will need to implement are going to impose new costs. A report from the National Academies of Sciences, Engineering, and Medicine, for example, estimates $1.8 million in costs for a typical 3,200-student district. Among the expenses are hand sanitizer, masks and protective equipment, and additional bus routes. Schools may also need to purchase thermometers or scanners for temperature checks. It’s true that remote learning also poses some unique costs, most notably related to technology, but those expenses are much lower than the costs of safely operating facilities during a health crisis.
2. Reopening schools is critical to meeting student needs. A multitude of voices, including the American Academy of Pediatrics, have noted the serious harms to students, especially those with significant needs, when schools are shut. But absent prodding from state officials or parents, there’s little reason for schools to reopen this year. Some districts will receive the same funding regardless of their decision, and many face pressure from teachers unions to keep schools closed. Targeting supplemental aid to reopening could ensure that schools meet health guidelines, helping to reassure parents and teachers that it’s safe to go back.
3. Teachers and staff deserve additional pay for working onsite. Being married to a healthcare provider, I can appreciate the concerns that school employees have about going back to work. Even with safety measures in place, they’ll face greater risks of contracting the virus or spreading it to loved ones than teachers who work remotely. They’ll also have to carry out their responsibilities in an altered environment, wearing protective equipment and helping students not only learn English and math, but also how to be extra diligent about hygiene. Extra dollars would provide an opportunity to reward teachers who are going above and beyond the call of duty.
As is often the case with funding, several details about distributing funds to support reopenings would need to be ironed out. First, the distribution model would need to ensure that funds actually reach the schools that are open—a concern, for example, in districts that reopen their elementary schools but keep their high schools shut. Second, states may need a process that verifies schools are actually open. Third, the model should take into account student needs, so that high-poverty schools that reopen receive more supplemental aid than wealthier ones.
Reopening schools safely should be a national and state priority. Students, foremost, need schools to be in operation so that they can continue making progress after significant time out of the classroom. Many parents are also counting on schools to reopen so that they can get back to work. Targeting relief funds towards reopening schools would be another step in supporting students and families during these difficult times.
School funding mechanisms are the largest and perhaps most obvious levers for policymakers to pull when attempting to reform how education dollars are distributed. To wit, a new research report from a trio of scholars tells us that there were a whopping sixty-seven major school finance reforms (SFRs) across twenty-seven states between 1990 and 2014. From court-ordered reforms to legislative initiatives to combinations of both, SFRs varied in scope and operation. In response, some states changed their funding formula, others changed how much “weight” they gave to particular student needs, still others looked to new funding sources. But what was the result of all of this change?
The study examines state-level variation in the effect sizes of SFRs on school spending among low- and high-income districts within states, as well as variation in the types of resources purchased. Analysts looked at whether the reforms improved outcomes on average in the poorest districts in the state and whether the reforms were progressive in nature: whether bottom-tercile districts benefited more relative to top-tercile districts. Using district-level household income from the 1990 census, they computed the average level of resources in the bottom and top terciles of the income distribution. To estimate these state-by-district income impacts by tercile, they used the recently developed “ridge augmented synthetic control method.” The simplified (!) goal of this method is to obtain more equivalent matches between groups before the SFRs occurred. Ultimately, the researchers compare the twenty-seven states with SFRs—as their own individual case studies—to a single control group composed of the remaining states without SFRs.
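For intuition only—this is emphatically not the authors’ ridge augmented implementation—the basic synthetic control idea is to choose non-negative weights, summing to one, over “donor” states so that the weighted donor average closely tracks the treated state’s pre-reform spending; post-reform, the gap between the treated state and its synthetic twin estimates the reform’s effect. A minimal sketch with made-up standardized spending data:

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / idx > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.clip(v + theta, 0.0, None)

def synthetic_control_weights(donors_pre, treated_pre, ridge=0.01, steps=5000):
    """Projected-gradient fit of simplex-constrained weights so the weighted
    donor average tracks the treated unit's pre-reform outcomes; a small
    ridge penalty stabilizes the fit."""
    n_donors = donors_pre.shape[1]
    gram = donors_pre.T @ donors_pre + ridge * np.eye(n_donors)
    lr = 1.0 / np.linalg.eigvalsh(gram)[-1]  # safe step size from top eigenvalue
    w = np.full(n_donors, 1.0 / n_donors)    # start from uniform weights
    for _ in range(steps):
        grad = gram @ w - donors_pre.T @ treated_pre
        w = project_to_simplex(w - lr * grad)
    return w

# Toy example: 15 pre-reform years of standardized spending for 8 donor states;
# the "treated" state is constructed as an equal mix of the first two donors,
# so the fitted weights should concentrate on those two.
rng = np.random.default_rng(0)
donors = rng.normal(size=(15, 8))
treated = donors @ np.array([0.5, 0.5, 0, 0, 0, 0, 0, 0.0])

w = synthetic_control_weights(donors, treated)
print(w.round(2))  # weights concentrate on the first two donors
```

The ridge-augmented variant the authors use layers a penalized bias correction on top of this matching step; the sketch above shows only the core matching logic under the stated toy assumptions.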
Treatment begins the year after the reform is legislated or ordered. The researchers use data from the F-33 federal forms submitted by the SFR states that include total revenue; total expenditures; and subcategories of expenditures including instructional staff support services, capital outlays, and teacher salaries for school years 1989–90 to 2013–14. They also utilize full-time equivalent (FTE) student counts as reported in National Center for Education Statistics (NCES) data.
On average, across all SFRs, expenditures increased by 6 percent in low-income districts and 1 percent in high-income districts. Overall, low-income districts increased spending in greater amounts relative to high-income districts after the reforms, meaning that the SFRs had progressive effects on school resource allocation. Across low-income districts, a 10 percent increase in total spending corresponds to a 19 percent increase in capital spending and a 6 percent increase in salary spending.
When they examined heterogeneous effects, however, the researchers found variation in how this plays out. Among low-income districts, ten states increased spending, two decreased spending, and in fourteen states, there was no change in low-income spending. Comparing low- and high-income districts, they find that a 1 percent increase in spending among low-income districts is associated with a 0.46 percent increase in spending among high-income districts. Further, as mentioned, results show that low-income districts boosted their spending on capital expenditures—renovation, repair, or new construction of facilities—rather than on other classroom-related items such as increasing teacher salaries or reducing class sizes.
Importantly, the researchers find suggestive evidence that many states without SFRs also increased spending among low-income districts at the same or even higher levels than states that enacted SFRs. They speculate that spending changes could have occurred through referenda or through states earmarking dollars for education from sources like gaming and sin taxes (see also class-action payouts)—all absent a specific education mandate from a court or legislature.
In sum, SFRs are generally intended to benefit low-income districts, but this did not come to pass in the fourteen states that enacted specific reforms yet saw no spending change. What’s more, some states appear ironically to have made more progress on the progressive goals of SFRs by not adopting them. It’s hard to nail down why. But this much we know: School funding formulas are complex, labyrinthine, and sometimes capricious. Student migration, economic development independent of schools, and widespread recession can all serve to undermine even the best efforts of policymakers to improve education by funding reforms. And unfortunately, with the financial fallout of Covid-19 staring us in the face, figuring out how to fund schools is about to get even more complicated and dire.
SOURCE: Kenneth A. Shores, Christopher A. Candelaria, and Sarah E. Kabourek, “Spending More on the Poor? A Comprehensive Summary of State-Specific Responses to Finance Reforms from 1990–2014,” retrieved from Annenberg Institute at Brown University (May 2020).
On this week’s podcast, Checker Finn and David Griffith discuss the flawed effort to revamp NAEP’s reading framework. On the Research Minute, Amber Northern and David Griffith examine how inequality has affected families’ engagement with online learning during the pandemic.
Amber's Research Minute
Andrew Bacher-Hicks, Joshua Goodman, and Christine Mulhern, “Inequality in Household Adaptation to Schooling Shocks: Covid-Induced Online Learning Engagement in Real Time,” NBER Working Paper #27555 (July 2020).