Warning: this one’s going to get super wonky.

A year ago, I published a five-part (!) series digging into the question of whether school choice programs’ impacts on test scores were predictive of their students’ long-term success—in other words, whether higher test scores among participants in such programs foreshadowed positive outcomes like high school graduation, college enrollment, college completion, and gainful employment.

It’s a crucial question, especially because of the ongoing debate over school accountability. If schools that help their pupils make gains on reading, writing, and math tests also help those same young people succeed in the real world, then it’s fair to make high-stakes decisions about the schools themselves based substantially on test scores (especially gains on those tests). It would make sense, then, to close chronically low-performing charter schools and kick chronically low-performing private schools out of voucher programs. But if test scores and long-term outcomes are completely unrelated, it should give us pause about these accountability measures—especially in the context of school choice, where parents can hold schools accountable by voting with their feet.

Most of us at Fordham have long been on Team Test Scores Matter, meaning we believe—and have seen evidence to suggest—that those scores are indeed predictive of future outcomes. Others in the school choice movement, including scholars Mike McShane, Collin Hitt, and Pat Wolf, have been on Team Test Scores Aren’t Reliable Measures; they don’t see a strong relationship between scores and long-term success.

A year ago, we went several rounds over what the research literature says about this question. I pointed to the ten extant studies that had looked at student achievement, on the one hand, and college enrollment or completion, on the other. All eight of the studies that had examined college enrollment found effects pointing in the same direction for both achievement and enrollment, suggesting a strong correlation between test scores and college-going. Their data included analyses of charter high schools in Boston, Chicago, and Florida. This also held true for one of the two studies examining college completion: New York City’s school voucher program improved both scores and the rate at which graduates earned college degrees. The other study, on Charlotte’s open-enrollment program, was the only one of the ten that didn’t find a correlation between test scores and future outcomes. That program lowered scores, although students participating in it were more likely to graduate from college.

As I wrote at the time, that track record looked pretty good for Team Test Scores Matter. The case was surely not closed, but was plenty encouraging.

Now we have more data, as the past week brought two new analyses examining the relationship between test scores and long-term outcomes. The first is an extension of an MDRC study, one of the nine mentioned above that suggest a positive relationship. It’s a gold-standard evaluation of New York City’s small schools of choice program from the early 2000s. The previous edition reported positive impacts on both student test scores and future outcomes, including high school graduation and college enrollment. The new extension bolsters this, again finding that the program positively impacted scores, and it also examines a different kind of long-term outcome: whether high school graduates were participating in a “productive activity,” i.e., enrolled in postsecondary education, employed, or both.

The answer is yes, when compared with control groups (who did not attend such high schools). So score another one for Team Test Scores Matter!

The second new study was conducted by Mathematica for the Institute of Education Sciences. It examines the impact of a non-nationally representative sample of charter middle schools on students’ college enrollment and completion—a major shift from most of the aforementioned studies, which look at high schools. Like the MDRC study, Mathematica provides an update on its earlier analysis of the impact of these schools on student achievement, as measured by test scores. Analysts found no impact on achievement for these middle schools—and, lo and behold, also no impact on college enrollment or completion. Another match between test scores and long-term outcomes (null impacts in both cases). Go Team Test Scores Matter!

But Team Test Scores Aren’t Reliable Measures also got two bits of good news from the Mathematica study. One: The original analysis found positive impacts on achievement for urban charter schools, and negative impacts on achievement for suburban ones; yet this pattern did not repeat itself with respect to college enrollment or completion. In other words, the positive test scores for urban charters and negative test scores for suburban ones were not predictive of later life outcomes.

(Interestingly, there was a statistically significant difference between the impact on college completion of charter schools serving large proportions of students of color versus the impact of other charter schools in the sample. But not all of those schools were the “urban” ones that had an impact on achievement a decade ago, and vice versa.)

The second piece of positive news for the other team was related to Mathematica’s analysis of whether individual schools in the sample improved test scores and long-term outcomes. Researchers found no statistically significant pattern—meaning that school-level test scores did not predict school-level long-term outcomes. This led Team Test Scores Aren’t Reliable Measures captain Jay Greene to do somersaults, shouting with glee that “changing test scores is not a particularly good indicator of schools that will improve their students’ lives.”

OK, Jay, enjoy your champagne. But do stay sober, for the school-level analysis did find a mildly positive association between test scores and long-term impacts. “Although most estimates of charter schools’ impacts on middle school achievement were positively related to impacts on college outcomes,” reads the report, “none of these regression coefficients (Table B.6) or correlation coefficients (Table B.8) were statistically significant.”

Which really isn’t too surprising. Once the study started slicing and dicing the sample by subgroups, and especially by schools, it reduced sample sizes. This meant that small improvements in outcomes would not be enough to be considered statistically significant. According to my communication with the study’s author, the bumps would need to be 9 or 10 percentage points to yield significance in small samples. Boosting college completion rates by so much would be a tall order for any program, much less a middle school.
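For fellow wonks who want to see why small samples demand such big bumps, here is a rough back-of-the-envelope sketch in Python. It is not the Mathematica team's method. It assumes a simple two-sided z-test for the difference between two proportions with equal-sized groups, and the 30 percent base completion rate and the group sizes are made-up numbers for illustration, not figures from the study. The function name `min_significant_gap` is likewise my own invention.

```python
from math import sqrt
from statistics import NormalDist


def min_significant_gap(base_rate: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate smallest treatment-vs-control gap in a rate (as a fraction)
    that a two-sided z-test would call statistically significant.

    Assumes equal group sizes and roughly the same underlying rate in both
    groups; both are simplifying assumptions for illustration only.
    """
    # Critical z value for a two-sided test at the chosen alpha (about 1.96 at 0.05).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two independent proportions.
    standard_error = sqrt(2 * base_rate * (1 - base_rate) / n_per_group)
    return z_crit * standard_error


# Hypothetical base rate (30 percent) and hypothetical group sizes.
for n in (100, 200, 1000):
    gap = min_significant_gap(0.30, n)
    print(f"{n} students per group: gap must exceed about {gap * 100:.1f} points")
```

Under these assumed numbers, a comparison with a couple hundred students per group needs a gap of roughly 9 percentage points before it registers as significant, and the threshold only drops to a few points once each group is in the thousands. That is the same ballpark as the figure cited above, though the study's actual calculation may well differ.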

***

What to make of this—both the two new studies, and the research literature as a whole? First, regarding the Mathematica study, we shouldn’t be surprised that the relationship between test scores at the middle school level and long-term benefits is harder to find than for high schools, given the time lag. Fade-out is a serious concern. But that doesn’t mean we should ignore what happens to kids prior to high school. Education is of course cumulative, and we need to worry about every step along the way.

Second, the relationship between test scores and long-term outcomes may be stronger for some types of students than for others. Across most of the studies, kids of color and low-income students seem especially to benefit from school choice programs—in terms of both test score gains and college enrollment and completion. This also means that all of us who care about putting more disadvantaged kids on paths to success should continue to see school choice as among our most promising strategies.

Third, and here is where I agree with Jay, we need to be careful about making judgments about schools based only on short-term test score changes. That’s for lots of reasons, but certainly one is that the connection between those scores and long-term outcomes isn’t settled, especially for elementary and middle schools. Still, the vast majority of studies suggest that such a connection exists. So we should neither ignore test scores altogether nor always defer to parents’ judgment.

The responsible policy is the one that the best charter school authorizers embrace: When one of their schools is chronically low-performing, they spend time with its parents, teachers, and students, seeking to determine what’s going wrong, and whether it has the potential to set matters right. If they decide that, in their professional judgment, the terrible test score results are indicative of deep-set problems, they move to close the school—ideally with a plan to open a much better one nearby.

Jay calls that approach “technocratic.” I call it responsible. And since Team Test Scores Matter has most of the current evidence on our side, I would argue that our view wins.

Mike Petrilli is president of the Thomas B. Fordham Institute, research fellow at Stanford University's Hoover Institution, executive editor of Education Next, and a Distinguished Senior Fellow for Education Commission of the States. An award-winning writer, he…
