Last week, in the wake of President Obama’s pledge to reduce the amount of time students spend taking tests, my colleagues Robert Pondiscio and Michael Petrilli weighed in with dueling stances on the current state of testing and accountability in America’s schools. Both made valid points, but neither got it exactly right, so let me add a few points to the conversation.

Like Robert, I don’t see how we can improve our schools if we don’t know how they’re doing, which means we need the data we get from standardized tests. But I also believe that—because we’re obligated to intervene when kids aren’t getting the education they deserve—some tests must inevitably be “high-stakes.” The only real alternative to this is an unregulated market, which experience suggests is a bad idea.

Must this logic condemn our children to eternal test-preparation purgatory? I hope not, but I confess to some degree of doubt. The challenge is creating an accountability system that doesn't inadvertently encourage gaming or bad teaching. Yet some recent policy shifts seem to have moved us further away from that kind of system.

As Mike noted, the problem of over-testing has been exacerbated in recent years by the requirement that all states adopt test-based teacher evaluation systems. This has increased the testing burden by as much as twenty-eight percent in some states. Like Robert, I think the tests themselves are less pernicious than the ubiquitous test preparation. However, I think it’s possible to distinguish between the stakes of the tests and the nature of the incentives they create, which too often drive teachers and principals to produce a short-term bump in scores instead of long-term results.

Consider the following scenarios:

First, imagine a teacher whose performance rating is partly determined by how much her students improve in reading. As Robert is quick to explain to the uninitiated, reading comprehension depends greatly on content knowledge, which is acquired in myriad settings both inside and outside of school. Consequently, it's difficult for this teacher to move the needle much in just one year, and the incentive to devote class time to academically questionable reading strategies is strong.

Now imagine a principal who runs a charter school that has just had its contract renewed. In five years, the school (and the principal) will be judged based in part on students’ growth in reading. Thus, what is most important isn’t this year’s test scores, but how kids are performing in five years’ time (a fact authorizers would also do well to bear in mind). An enlightened principal, realizing that five years of test preparation is unlikely to turn her low-performing third graders into high-performing eighth graders, might respond to this incentive by designing a curriculum that saturates students in content knowledge of all kinds—from the rings of Saturn to the legionnaires of Rome—thereby promoting real learning instead of the fake kind.

As these not-so-imaginary scenarios demonstrate, test-based accountability is most dangerous when timelines are short and when what's being evaluated can’t be controlled by who’s being evaluated. Conversely, it’s least dangerous when timelines are longer and the person or organization being evaluated has more control over what they are being evaluated on, as is the case with charters.

Now, back to the real world. Don’t some principals respond less constructively when confronted with the second scenario I described? And isn’t it true that as the decision to renew a school’s charter approaches, the incentive to ramp up the test preparation grows stronger?

Of course they do, and of course that’s true. I taught at a low-performing charter school while its charter was being considered for renewal, and believe me, the desire to raise scores sooner rather than later was palpable (and ultimately futile). Furthermore, there were just too many tests, most of which didn’t satisfy any requirement I was privy to, and none of which were used to inform instruction. To this day, I don’t know what motivated this approach, but it was probably the product of some vague impulse to become more “data-driven.”

If we get accountability right, a test-preparation culture will ultimately be self-defeating—and self-terminating—which will lead to a gradual reduction in the amount of testing drills and the like. I don’t mean to sugarcoat this process. It obviously won’t be perfect, and the decision to close a school is painful for all involved. Still, the only thing worse than an accountability system that encourages bad teaching is one that allows it to continue unchecked. In other words, if we’re going to make threats, we ought to follow through on them. Otherwise, we’ll get the worst of all worlds.

An important implication of the long view I’m arguing for is that it’s better to hold schools accountable for test scores than teachers, who usually get just one year with kids. This doesn’t necessarily rule out all forms of teacher value added, but it does rule out those that are based on reading scores, which are a big part of the current system. Robert makes a convincing case that we would be better off abandoning reading tests in the higher grades and testing for content instead. Yet accomplishing this would require that we first agree on what content should be tested, not just for English and math, but for science and social studies too. Based on the public reaction to the (utterly unobjectionable) content of the Common Core standards, I’m not optimistic about our ability to get that specific, and I don’t think we should press the “pause” button on accountability until we can manage it.

Ideally, accountability systems should incorporate a number of long-term outcomes, such as college enrollment and course completion, in addition to measures of school climate (such as attendance). After all, there are many skills and attributes that tests don’t capture that matter greatly to a child’s long-term success, and it would be nice if we could recognize and incentivize their development as well. In the case of schools, this is becoming more feasible as state longitudinal data systems mature. In the case of teachers, however, it may never be possible because the appropriate timeline for intervention is too short (and because we’re still a long way from tracking “non-cognitive” skills in real time). We can’t keep bad teachers in the classroom while we wait to see if their former students graduate from college. Once again, it makes more sense to hold schools accountable than teachers.

In the end, the approach to accountability embodied by charter schools seems best because it respects the inherent complexity of teaching and learning. The system should seek to hold the school (or its principal) accountable for long-term results, including student progress on tests. The principal, in turn, should manage her teachers and other staff as she sees fit. And short-term, perverse incentives should fall by the wayside. 



David Griffith is a senior research and policy associate at the Thomas B. Fordham Institute, where he helps manage a variety of projects in Fordham’s research pipeline. A native of Portland, Oregon, David holds a bachelor’s degree in politics and philosophy from Pomona College and a master’s degree in public policy from Georgetown University. Prior to joining Fordham, he worked as a staffer for Congressman Earl Blumenauer…
