Tom Vander Ark is a very smart guy who cares deeply about education, has wide-ranging experience in it (including service as a district superintendent), and knows far more about technology than I do. I like and respect and often agree with him. And perhaps the grand, arguably utopian, scheme he is proposing by which to “end a century of standardized testing” can one day come to pass. In the short run, however, I fear his fanning of the anti-testing flames could deliver a serious setback to results-based accountability for American schools and immerse our education system even deeper into Lake Wobegon than it already is.
Tom wants to replace our current statewide assessment regime with “cumulative validity,” illustrated by “500 data points from six sources collected over eight months about a middle-grade student’s progress on ratios and proportions.” (If you can’t quite puzzle out that passage yourself, join the club. His piece sometimes exceeds my comprehension, too.)
A sophisticated, comprehensive, well-integrated high-tech information system, properly fed from multiple sources and lubricated by artificial intelligence, Tom says, could yield this appealing alternative to today’s week or two in mid-spring when learning comes to a halt in many a school as the standardized testing machinery cranks up. A colleague who read his description terms it “techno-utopianism,” and, viewed as a complete system, it feels that way to me as well. In support of his vision, however, Tom cites several examples of programs, schools, and even whole states that are taking steps in this direction, including existing “diploma networks” such as the International Baccalaureate program, which he terms a “comprehensive outcome framework” that still falls short of a “schoolwide model with strong systems and supports.”
It’s great when schools, districts, and states go as far as they can to gain comprehensive information about student learning, to do so in as unobtrusive a fashion as possible, and to feed it back to instructors and administrators in ways and at times that foster mid-course corrections, not just end-of-year appraisals. A constant feedback loop is far preferable to the three-times-a-year testing that we typically get from “formative assessment” offerings such as MAP and iReady. So let the R&D and experimentation continue, by all means.
But be aware of the risks, which are real, and the obstacles, which are many. (Most obvious is the need to change federal law! If Tom truly thinks Congress is going to revisit ESSA anytime soon, he’s engaging in utopianism or chemical alteration far beyond my capacity.)
I see several overarching hazards in this approach. Start with the possible loss of comparability. Will it still be possible, under Tom’s scheme, to know how Boston’s fourth graders are doing in math when compared with those in Wellesley, or whether the schools of St. Louis are yielding a greater ROI (in terms of student growth, let’s say) than those of Kansas City? Do we not risk reviving the soft bigotry of low expectations if kids in Baltimore get held to a lower standard than their classmates in Bethesda? (Due to the eclipse of the Common Core and the shrinkage of the PARCC and Smarter Balanced coalitions, it’s already very difficult to make valid interstate comparisons except via NAEP.) I’m sure it’s possible, with enough data points and enough fancy AI, to gin up some valid intra-state comparisons—but are parents, taxpayers, and legislators really going to regard a mysterious algorithm as a trustworthy and uncontroversial basis for stating whether the Lincoln School is more effective than the Jefferson School? And will such techie evidence “stand up in court” when the time comes for someone to intervene in the school that keeps failing?
Speaking of those algorithms, even if they’re smart enough to define success from some combination of data points (in a way that’s beyond my present ken), what if students in one place have a different “success profile” than those somewhere else? In one place, students may be better at homework; in another, they may be better at group projects. How would we compare these places?
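To make that comparability worry concrete, here is a minimal sketch (every name, number, and weight below is invented for illustration, not drawn from Tom’s proposal): two schools score the very same body of evidence about a student, but each weights it to match its local “success profile,” so the resulting composites cannot be meaningfully compared.

```python
# Hypothetical illustration: all evidence categories, scores, and weights
# are made up; nothing here comes from an actual assessment system.

# The same student record, as both schools would see it (scores on a 0-1 scale).
evidence = {"homework": 0.90, "group_projects": 0.60, "quizzes": 0.75}

# Each school's locally chosen "success profile": School A prizes homework,
# School B prizes group projects. Both sets of weights sum to 1.
weights_school_a = {"homework": 0.6, "group_projects": 0.1, "quizzes": 0.3}
weights_school_b = {"homework": 0.1, "group_projects": 0.6, "quizzes": 0.3}

def composite_score(scores, weights):
    """Weighted average of the evidence, per the local success profile."""
    return sum(scores[category] * weights[category] for category in weights)

print(round(composite_score(evidence, weights_school_a), 3))  # 0.825
print(round(composite_score(evidence, weights_school_b), 3))  # 0.675
```

Identical evidence, two defensible local definitions of success, two different verdicts. Scale this toy example up to 500 data points and an AI-learned weighting, and the incomparability doesn’t disappear; it just becomes harder for parents, taxpayers, and legislators to see.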
Most of the sources of data going into the new system and the computers that rule it will be based on teacher judgments of one kind or another, without trustworthy external verification. (Exam-based systems such as the International Baccalaureate and statewide end-of-course exams in high school can ameliorate that risk, but eventually they, too, will run afoul of Tom’s animus toward “standardized” testing.) Teacher judgments are indispensable, of course, but they (like “parent satisfaction”) must not be the sole determinants of whether a kid is learning or a school is succeeding. They’re too easily corrupted in too many ways, as we already see in rampant grade inflation, in teachers pressured by their principals not to give failing grades, and in teachers worried about their own careers if they give too many C’s to the children of politically influential (or simply loud and annoying) parents. (And those are the teachers who believe in their hearts that the disadvantaged kids in their classrooms could achieve at far higher levels—which is by no means all teachers!)
Insofar as sources beyond the classroom teacher will gauge students’ learning under Tom’s plan, the information they analyze for that purpose is inevitably stuff that can be found on screens of one sort or another. Isn’t that bound to drive more and more instruction onto screens, resulting in student work that can be evaluated from far away, which isn’t necessarily the kind that’s best and most enriching for kids? (I’m not even going to get into the potential privacy pitfalls other than to note that they lurk.)
External verification of student learning and school effectiveness still seems to me essential for any kind of accountable education system, as are comparability from place to place and over time, and transparency regarding the actual performance of children, classrooms, schools, districts, and indeed whole states. As a recent alumnus of the Maryland State Board of Education—and one who labored for months, along with fellow board members and in the face of much opposition from the “school establishment” and its political allies, to forge a new statewide accountability system that complies with ESSA—I’m mindful of the limits of standardized assessments and the many important questions they cannot answer, the many educational problems that they do not solve, and the many needs of education practitioners that they cannot meet. I’m also mindful of the costs, burdens, and inconveniences that they bring, particularly when the stakes are high and teachers drop much else for weeks to help their pupils prepare. But we’ll be even worse off as a state—and society—if we succumb to the allure of no testing, or if we replace the regimen we’ve got, flawed as it is, with something hazy, technologically complex, and even more vulnerable to human error, wishful thinking, and outright chicanery.
Tom has done well to sketch a future we might one day achieve, and I hope the experimentation continues. But let’s be very cautious before we replace the devil we know with one that may dwell in an even hotter place. Think ahead to the day when it’s revealed that the governing algorithms of our new assessment system are biased against this or that group. (Or perhaps it’s just that a group looks worse when seen through that algorithm.) Not only will the whole system be back in state and federal court—and before investigators, prosecutors, litigators, and legislators—but it will be reviewed by people who haven’t the faintest understanding of how it actually works, and who can’t make head or tail of the possible corrections that Tom and his fellow techies suggest to address the problem, much less imagine the unintended consequences that will follow.