Teachers will always have to figure out how to provide each student the right level of instruction: neither so difficult it's overwhelming, nor so easy it's boring. Fitness studios face a similar challenge: provide a great, challenging experience for thirty-some students that accounts for great variation in fitness levels and goals and enables participants to gauge their own progress on metrics that they trust and understand. The approaches of one studio and one educational model are especially promising.
Almost a decade ago, I wrote that “the greatest challenge facing America’s schools today isn’t the budget crisis, or standardized testing, or ‘teacher quality.’ It’s the enormous variation in the academic level of students coming into any given classroom.”
All these years later, I still believe that’s true, and it feeds into current debates over whether teachers should meet students where they are, or aim for grade-level instruction instead, even for kids who are far behind.
Not that any of this is new. It goes all the way back to the one-room schoolhouse. As long as kids' readiness levels are varied—in other words, forever—teachers will have to figure out how to provide the right level of instruction for each of their students that's not so challenging as to be overwhelming, nor so easy as to be boring.
It wouldn't be so hard if we could afford a private tutor for everyone. But that would be hugely expensive. And it would be fantastic if every student could learn on their own—whether in the old-fashioned way, from books, or the modern way, from videos, online modules, and the like. But while independent learning is essential to a great education, few students have the drive and focus to do all of their learning that way without the support of teachers. Even adults struggle to finish MOOCs because they miss the human interactions and relationships of a classroom. And of course, schools do more than teach academics. They need to encourage and reinforce social and emotional skills, self-discipline, and good character, too.
So in the real world, schools have to make this work with groups of students while individualizing as much as is practical.
In this article, I’ll take a look at the major approaches that schools are taking to solving this problem today—but will do so using the world of fitness to provide insights into their pros and cons. That’s because fitness studios have a similar challenge: how to provide a great experience for twenty-five or thirty students at the same time, one that meets everyone where they are, can cope with great variation in fitness levels and goals, challenge everybody (but not so much as to drive them away), and enable the participants to gauge their own progress on metrics that they trust and understand.
Group fitness works for lots of people because, as with private tutoring, one-on-one personal training is exorbitantly expensive. And while some gym rats can walk into a gym and get a great workout all by themselves, most of us need lots of support to exercise smartly enough not to hurt ourselves and intensely enough to get a real benefit. And many people find the social aspect of group classes fun and motivating.
The analogy isn’t perfect. Providing a great education to our children is much more important than providing a great workout to adults. And the intellectual challenge of teaching diverse skills and content to schoolchildren is infinitely more complex than being a fitness coach. Yet the parallels are still instructive.
Option one: Whole group instruction
This is the most traditional, but least personalized, approach to teaching a group of students: Throw everyone in the same class and aim for the middle.
That’s how it worked at the dawn of the fitness craze in the 1970s and 80s. Jane Fonda and other aerobics teachers offered one level for all participants. It was too slow for elite athletes and too hard for many newcomers.
It is also exactly what many classrooms were like for decades, and too many still are today. Take a group of kids whose only similarity is their age, put them in a room together, and do your best.
This clearly leaves a lot to be desired.
Option two: Ability grouping
Teaching children of vastly different reading or math levels, or clients of vastly different fitness levels, is inherently frustrating. There was, and is, an obvious solution: Group students by their current ability level.
In the fitness world, you see that with different levels of classes, like yoga levels one, two, and three. Newbies get intense instruction in the basics, while experienced and accomplished students can challenge themselves to find the edge of their capacities. But this has its own drawbacks. Studios worry about schedules that become overly complicated and keep clients away or classes that become too small to be financially sustainable.
That sort of ability grouping has also been common in schools forever. In elementary schools, it usually means putting students into reading or math groups for part of the day, based on their current skill levels, or providing acceleration to gifted students in or out of the regular classroom. In middle and high school, it means having different classes for advanced students—honors and AP and the like—and others for kids who are “on-level” or below.
There’s lots to be said for this approach, but one of the biggest concerns in schools is that students in the lower groups may not make enough progress to catch up, or might not ever get out of that low group due to inadequate support. We also worry about segregation, given the harsh realities of achievement gaps and what they imply for kids’ readiness levels on average.
Option three: Differentiating instruction
What if you could have the best of both worlds: keep students, or clients, of different levels together in one group, while also providing a personalized, “differentiated” experience, so everyone gets what they need? Sounds great, sure, but it’s really hard to pull off. Maybe careful planning and the clever use of technology can help.
Enter OrangeTheory. This fitness company has studio franchises all over the world. Every day, a new workout comes out from headquarters, combining the use of treadmills, rowers, and weights—nothing fancy. Instructors take groups of twenty or more students through the class, rotating among stations. It's personalized because every client chooses their own pace on the treadmills and rowers, and the size of the weights on the floor. But everyone also wears a heart rate monitor to make sure their effort is hitting a target. The right amount of intensity is key to good results.
Some new educational models are experimenting with a similar approach. Teach to One: Math, a middle and high school math program designed by New Classrooms, is arguably the best example. It is designed to personalize instruction to help all students make as much progress as possible toward college-ready standards.
Here’s how it works: At the end of each day, students take a brief assessment to gauge how well they have mastered the math they’re working on. Overnight, an algorithm designed by New Classrooms figures out the exact skill each student is ready to learn next, as well as the “modality” that would be the best fit—like whole group instruction, small group instruction, or online learning. In the morning, kids look up at “airport monitors” to find out what and where they will be learning that day, and off they go. Most of the instruction is done with a teacher, in large or small groups, but those groups are constantly changing, bringing students together who are all ready to learn the same skill.
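To make the mechanics concrete, here is a minimal sketch of that kind of nightly logic. Every skill name, threshold, and routing rule below is invented for illustration; this is not New Classrooms's actual algorithm.

```python
# Hypothetical nightly assignment logic, loosely in the spirit of what the
# article describes. Skills, thresholds, and modality rules are all invented.

# Each skill maps to the set of skills a student must master first.
SKILL_GRAPH = {
    "fractions":   set(),
    "ratios":      {"fractions"},
    "proportions": {"ratios"},
}

def next_assignment(student):
    """Pick tomorrow's skill and modality from today's assessment results.

    `student` is a dict like:
        {"mastered": {"fractions"}, "last_score": 0.85}
    """
    # First skill not yet mastered whose prerequisites are all mastered.
    for skill, prereqs in SKILL_GRAPH.items():
        if skill not in student["mastered"] and prereqs <= student["mastered"]:
            break
    else:
        return None  # nothing left in the graph

    # Route by yesterday's score: weaker results get more teacher time.
    score = student["last_score"]
    if score < 0.5:
        modality = "small group"
    elif score < 0.8:
        modality = "whole group"
    else:
        modality = "online module"
    return skill, modality
```

Grouping students who come out of such a routine with the same skill-and-modality pair is then what fills the morning's "airport monitors."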
To bring this vision to fruition, New Classrooms had to dissect state math standards and understand precisely what students needed to know and be able to do before moving on to the next level. It also had to find the best teaching materials—whether for whole group, small group, or online instruction. None of this was easy, or really about “technology,” but about a deep understanding of math and learning. And it’s reasonable to wonder whether it would work for any subject other than math, given how non-linear many other domains of knowledge can be.
In both the fitness and school models, the role of the teacher changes in important ways. Teachers are not expected to play curriculum developer or workout designer; that’s handled centrally. This frees them to focus on helping students in the moment, building relationships, and giving personalized instruction and corrections. Of course, it still takes lots of sound judgment and skill to make it work.
Technology obviously plays a key role in these approaches—the heart rate monitors in the case of OrangeTheory, and the “airport monitors,” algorithms, and online modules for Teach to One. But the technology is not front and center; it plays a supporting role in the background.
What’s most important, both for fitness and for learning, is the effort that students put into it. Intensity is key. And that requires motivation. So both models are obsessed with keeping their students motivated—with regular reports about the progress they’re making, coupled with much encouragement from coaches and teachers.
OrangeTheory and Teach to One aren’t the only fitness studios, or educational models, trying this kind of approach. SoulCycle studios, CrossFit gyms, and others offer group classes with personalized or “scalable” experiences, and there are zillions of apps (Strava, for instance) offering stickers and rewards for meeting your fitness goals. Peloton even offers a virtual version of this approach, with real-time group spinning classes broadcast directly into people’s homes.
And other education innovators are trying to make personalized pacing work within school settings, many of them using some sort of blended learning, combining teacher-led instruction and online resources. One fascinating example is Wildflower Schools, which combines the timeless principles of Montessori education (arguably the first personalized model) with the cutting-edge use of technology.
No single fitness chain or school model will work for everyone—which is another reason choice is so critical.
What’s important is that OrangeTheory isn’t just an app or a discrete exercise product you can buy. It’s a totally different fitness experience that thoughtfully integrates instructors, technology, and equipment around the needs of each gym-goer.
Similarly, Teach to One isn’t a software product or tool for teachers to use in whichever way might make sense. It’s a fundamentally different classroom experience that integrates teachers, technology, and classroom materials into a holistic learning model to support the needs of each student.
The good news is that we’re making progress—starting to figure out workable ways to give groups of students a personalized experience, allowing them all to make progress at their own pace, and using technology in clever, effective ways.
In both fitness and in academics, we’re at the start of a journey. Let’s keep moving ahead.
Half a century has passed since I first fell through the looking glass into the peculiar world of federal education research and development. As an extremely junior domestic-policy aide in the Nixon White House, I helped Pat Moynihan, Jim Allen, George Shultz, and others craft what, in March 1970, became a presidential message to Congress proposing creation of a “National Institute of Education” (NIE). Two years later, it came into existence, and it’s been reinvented and reconstructed twice since then—plus innumerable fine-tunings—into what is now the Education Department’s Institute of Education Sciences (IES). I had the honor to preside for three years—while working with education secretary Bill Bennett—over one of those interim iterations, the Office of Educational Research and Improvement (OERI).
Behind both the first launch and the later rebootings was widespread frustration that, unlike so many other key realms of our national life—health care with NIH, basic science with NSF, the Agricultural Research Service, the Energy Department’s national labs, the Pentagon’s DARPA, and much more—education had no organized, purposeful, and coherent home for basic and applied research. Yet education, even back then, more than a decade before A Nation at Risk, was understood to be in trouble; it simply wasn’t working very well, its outcomes were both weak and uneven, its gaps were wide, and its return on investment was inadequate. The ambitious intervention programs of the Great Society—Title I, Head Start, Upward Bound, more—weren’t yielding the hoped-for results. Yet there was no clarity of a scientific sort as to what might cause this fundamentally important national enterprise to work better.
Simply throwing more resources at the problem wasn’t likely to do the trick. By 1970, we already had James Coleman’s penetrating (and disillusioning) big study, as well as discouraging early evaluations of the big war-on-poverty education initiatives. Though Nixon was accused of proposing research instead of budgeting more money for such programs, Moynihan and others—including Congressman John Brademas (D-IN)—recognized that more needed to be understood about the mechanisms of teaching and learning and the sorts of interventions (if any) that might yield better outcomes. And to get that work done, the education-research train needed an engine and conductor.
All these decades later, that’s still the goal. IES’s mission today is “to provide scientific evidence on which to ground education practice and policy and to share this information in formats that are useful and accessible to educators, parents, policymakers, researchers, and the public.”
In pursuit of that ambitious mandate, IES has become far more adept, sophisticated, and determined to conduct education research in ways that yield trustworthy—and, with luck, actionable—results, at least when the results of a study show that something actually made a difference! Beginning in 2002, the first IES director, Grover J. (“Russ”) Whitehurst, insisted on scientific rigor, preferably via research studies that, like serious appraisals of the efficacy of medical procedures, medications, and devices, follow proper experimental procedures, commonly known as “randomized controlled trials” (RCTs). His successors, for the most part, have continued that emphasis, despite much squawking from the education-research “community,” and mindful that some important issues, such as strengthening school governance and leadership, don’t lend themselves to that form of investigation. Other potentially valuable RCTs turn out to be politically or morally difficult to mount, particularly when they entail denying the “control” kids an appealing form of help, intervention, or innovation that the “treatment” youngsters are receiving.
Despite notable progress on the research side, however, IES today remains a stunted little tree among the tall timber of federal research agencies. Despite hundreds of informative studies and vast troves of essential data (for it also contains the National Center for Education Statistics and the National Assessment of Educational Progress), it has very little money and enjoys neither the visibility nor the stature among education practitioners and policymakers that, for example, NIH has among doctors.
Nor has IES come close to triumphing over the predilections and practices of the thousands of ed-school professors, think-tank denizens, and Beltway bandits who flock to the annual meetings of the American Education Research Association and its many affiliates. Many of those folks engage in much simpler (and sometimes less costly) modes of research, such as before-and-after comparisons and classroom observations. Much of what they do is subjective and impressionistic. (The polite term is “qualitative.”) A couple of private foundations share the IES commitment to rigorous research designs, but by no means has that approach conquered the field. And when funds are as scarce as they are—IES is funded by Congress in a miserly fashion—there’s much frantic jockeying for the available dollars and plenty of efforts to use lobbyists and friends on Capitol Hill (as well as friends deep in the “peer review” process for reviewing grants) to get scarce funds directed toward oneself or one’s institution.
IES also suffers from too many masters with widely differing priorities—and a dearth of strong political backing to help withstand stakeholder pressures. As education historian Ellen Condliffe Lagemann wrote in Educational Researcher back in 1997—and no less true today—“Members of arts and sciences and humanities faculties still tended to be dismissive of educationists, some of whom retaliated…by urging more professionalization….Still arrogant toward practitioners, many educationists continued purposefully to distance themselves from the diurnal problems of teachers and other school personnel, while large numbers of practitioners were still prone to dismiss education research as mere theory that had little chance of bringing new insight to their work.”
Given those profound constraints, it’s been difficult for IES to mount the kinds of large-scale, long-term research that might yield major breakthroughs. One can fairly blame infighting and irresolution within the “field” for this unhappy state of affairs, but at least as important are the agency’s skimpy appropriations and the fact that a huge fraction of its limited moneys routinely get gobbled up by a handful of not-very-useful feeders at this trough that have managed to retain friends in the appropriations process even though their contributions to education research have been paltry.
The total IES budget—about $600 million in Fiscal 2018, alongside $7.8 billion for NSF and close to $40 billion for NIH—is predictably complicated, as this agency has multiple units and obligations. Suffice to say, the amount available for general research in education that year (not including special ed, for example) was about $186 million, a sum that had barely changed over at least half a decade. And after all its continuing commitments to sundry projects and dependent organizations were met, that left barely $55 million to support “new research awards and enhance dissemination activities.”
Considering that American education—just the K-12 part—is a $650 billion enterprise, it’s simply laughable that the most explicit federal investment in finding new ways to make it work better is a sum that forces you to get to five decimal places on your calculator before you can even detect its portion of the total.
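The arithmetic behind that jab holds up, using the round numbers cited above:

```python
# IES's new-research dollars as a share of total K-12 spending,
# using the approximate figures given in the article.
new_research = 55e6    # ~$55 million for new research awards
k12_total    = 650e9   # ~$650 billion K-12 enterprise

share = new_research / k12_total
print(f"{share:.7f}")  # the first nonzero digit sits in the fifth decimal place
```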
To be continued next week...
From AOC to Stuyvesant to the Varsity Blues scandal, high-stakes assessments have returned to their role as punching bags after a brief hiatus. Back in 2015, it was the “opt out” movement, Atlanta, and President Obama’s “testing action plan” that captured the media’s imagination. This time around, the heightened attention carries even greater implications because, even though some of the anti-testing fervor has subsided, philanthropic support for standardized testing looks particularly bleak.
Standardized tests certainly have their flaws, but the truth is that all indicators are flawed. Nevertheless, civil rights groups, among others, value their illuminating powers. Moreover, even if the Almighty himself appeared with a divine set of assessment tools, it’s unclear whether the rancor around testing would be completely quelled. In fact, the incessant noise may be less about assessment quality and more about an aversion to accountability. Notwithstanding the dim prospects for ESSA reauthorization next year, or for the foreseeable future, it’s cold comfort to witness the accumulation of anti-testing kindling.
Enter innovation guru Tom Vander Ark earlier this month with a book of matches. In a provocatively titled piece, “A Proposal to End Standardized Testing,” Vander Ark indulges his palpable distaste for state-mandated tests and joins the growing chorus of those locking arms to rid schools of them. Unlike most people in that ensemble, however, Vander Ark deserves credit for offering potential solutions at a time when everyone recognizes the need for better tests.
Problem solving is particularly important because education advocates and others who work tirelessly to defend the current testing regime under statehouse domes across the country are caught in a Catch-22. On the one hand, education reformers recognize the tension between annual testing and the latest trends (e.g., personalized learning, which can by design focus on off-grade-level content). On the other hand, they are hesitant to broach the topic because doing so might provide an opening for opponents who are all too eager to dismantle the assessment and accountability edifice. It appears we are at an impasse. The rub is that doing nothing could further exacerbate these tensions, and ultimately result in the very collapse of annual testing that reformers are working so strenuously to avoid.
Vander Ark believes that the opportunity costs associated with standardized testing aren’t worth the nominal price tag. He writes, “It’s time to end a century of standardized testing and focus instead on helping young people do work that matters. We no longer need to interrupt learning and test kids to find out what they know.” It’s a questionable assertion, not only because any interruption attributable to state tests themselves is relatively small, but also because of the false choice created between assessments and meaningful learning.
An enthusiastic cheerleader for digital learning, Vander Ark argues that artificial intelligence and other technologies should be put to better use to “take advantage of everything teachers know about their students.” This would include leveraging the current wealth of formative assessment data using the fantastical-sounding “cumulative validity” as part of the next generation of assessments. Vander Ark writes:
An example of cumulative validity is 500 data points from six sources collected over eight months about a middle-grade student’s progress on ratios and proportions. With that much information, you have a pretty good idea of what they know and you don’t need to start from scratch with 50 new questions—but that’s exactly what standardized tests do. (Adaptive assessments can automatically adjust difficulty and short cut the process but they still don’t take advantage of what is known about a learner trajectory.)
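As a caricature of that idea—every source, score, and weight below is invented—pooling scattered formative signals into a single mastery estimate might look like this:

```python
# Toy "cumulative validity": pool many small signals about one skill
# instead of administering a fresh 50-question test. All numbers invented.
scores = [
    # (source, proportion correct on ratios items, weight)
    ("exit ticket",  0.70, 1.0),
    ("homework app", 0.80, 0.5),
    ("unit quiz",    0.90, 2.0),
]

def pooled_mastery(scores):
    """Weighted average of all available signals about the skill."""
    total  = sum(s * w for _, s, w in scores)
    weight = sum(w for _, _, w in scores)
    return total / weight

print(round(pooled_mastery(scores), 2))  # 0.83
```

The open question, of course, is whether such pooled estimates would be comparable across classrooms, schools, and states—which is precisely what standardized tests are built to guarantee.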
He suggests a three-step process for states to adopt to move beyond standardized tests, which essentially involves seeking a waiver from current testing requirements and doing a comparability analysis between the current system and this technologically enhanced one. Vander Ark believes his approach would work for reading, writing, and math—and that any federal policy barriers to this can be easily removed.
In what might be described as an assessment moonshot—though this rocket seems squarely pointed at Neptune—Vander Ark’s optimism is both admirable and doubt-inducing. For starters, he is grossly overestimating the current capabilities of state departments of education, the agencies that would likely be charged with carrying out such a system. And even if SEAs could be magically endowed with boundless capacity and unlimited resources, I can only begin to imagine the other obstacles that would stand in the way of operationalizing something so Byzantine. (I’m not even going to ask—though I’m admittedly curious—what Vander Ark has up his sleeves to overcome Beltway gridlock.)
Vander Ark’s ideas will doubtlessly appeal not only to standardized testing foes, but also the subset of technophiles within the reform community who are ready to pursue the two in the bush. The question is whether his proposal is workable. Suffice it to say, I harbor some doubts. But skepticism aside, I share Vander Ark’s dissatisfaction with the tests we have today. More R & D is badly needed.
My colleague Checker Finn is absolutely right that reformers should proceed with caution. The political and technical challenges to what Vander Ark is proposing cannot be overstated. It would also be prudent to keep in mind the “underlying forces” around testing and accountability that Andy Rotherham astutely points out. Whatever lies ahead, our students—and our country—cannot afford to return to the dark days of no testing. Any brave new world mustn’t compromise on equity and the continued need to shine a bright spotlight on results, especially with the student populations who are too often marginalized by society.
Three years ago, we released a study on school closures in Ohio that found mostly positive results for displaced students, particularly when those students transferred to higher-quality schools. The latest edition of Economics of Education Review includes a study of school closures in Philadelphia that finds a similar pattern. Matthew Steinberg and John MacDonald examine the impact of closures on student achievement and behavioral outcomes—specifically absences and out-of-school suspensions (OSS)—up to three years after closure. The School District of Philadelphia closed more than 10 percent of its lowest-performing and most under-enrolled traditional schools between the 2011–12 and 2012–13 school years. In total, twenty schools closed, with over 3,800 students displaced.
The analysts use student-level data for pupils in grades three through eight attending a traditional school in the 2010–11 through 2015–16 school years. Their difference-in-differences approach compares changes in outcomes for displaced students who were required to change schools and for their receiving-school peers, relative to students attending schools that did not receive displaced students in the year after closure. They are able to demonstrate that pre-closure achievement trends evolved similarly for the three groups, which is key to the validity of their study design. Since displaced students left lower-performing and more disadvantaged schools (on average), the change in outcomes experienced by students in the receiving schools may simply reflect the inclusion of the displaced peers. So the authors take care to separate the groups, estimating the effect of closures on displaced students and on those in the receiving schools, as well as distinguishing between students in non-closed schools that did and did not receive displaced pupils.
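The difference-in-differences logic itself is simple enough to sketch. The group-mean test scores below are made up for illustration and are not the study's estimates:

```python
# Made-up group-mean test scores (standard-deviation units) to show the
# difference-in-differences arithmetic; not the Philadelphia estimates.
means = {
    ("displaced",  "pre"): -0.30, ("displaced",  "post"): -0.28,
    ("comparison", "pre"): -0.10, ("comparison", "post"): -0.08,
}

def did(means, treated, control):
    """Change over time for the treated group, net of the control group's change."""
    treated_change = means[(treated, "post")] - means[(treated, "pre")]
    control_change = means[(control, "post")] - means[(control, "pre")]
    return treated_change - control_change

print(round(did(means, "displaced", "comparison"), 3))  # 0.0: both groups improved equally
```

Netting out the comparison group's change is what lets the authors attribute any remaining difference to the closures rather than to citywide trends, provided the pre-closure trends really were parallel.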
The key finding is that closing schools had, on average, no impact on the academic achievement of displaced students. However, achievement improved significantly for displaced kids who enrolled in higher-performing schools (cue our Ohio study) or in schools with a lower concentration of displaced students following closure. In contrast, pupils attending schools that received displaced students experienced a significant decline in academic achievement; specifically, they saw a decrease of 0.04 standard deviations in math and 0.06 in English language arts by the end of year two. The decline in achievement was greatest for students attending schools with the highest concentration of displaced students (approximately 25 percent).
School closures also affected the behavioral outcomes for both displaced students and receiving-school peers. For displaced students, closures significantly increased the days of school missed due to absences; that impact was even more negative for displaced students attending schools with a higher concentration of other displaced peers. Absences and OSS days also increased as the distance that displaced students travelled to their new schools increased (e.g., those who travelled an additional mile to their new school realized a 5 percent increase in days missed due to OSS). Finally, OSS and absences were greatest among the receiving-school students whose schools had the highest concentrations of displaced students, compared to receiving-school students in schools with lower concentrations.
As you can see, the study is chock full of findings for various groups by various outcomes. But its message is simple: Closing low-performing or under-enrolled schools is necessary but should be done with care. Unfortunately, that’s way easier said than done. Students need to be placed in higher performing schools to see an academic payoff, but there aren’t enough of them. And when you find them, local officials shouldn’t relocate large cohorts of displaced students in any one school since they likely won’t be well accommodated en masse (perhaps due to negative peer effects?). But if you try to spread them out, make sure the kids aren’t travelling long distances. Geez, that’s a tough job!
SOURCE: Matthew P. Steinberg and John M. MacDonald, “The effects of closing urban schools on students’ academic and behavioral outcomes: Evidence from Philadelphia,” Economics of Education Review (April 2019).
The Data Quality Campaign, an organization dedicated to advocating for effective educational data policy and use, recently released its third comprehensive review of school report cards in all fifty states and D.C. This year’s is particularly important because it marks the first time that states are required to report information under ESSA requirements.
Overall, DQC finds that the majority of states have report card systems that are easier to find and use than in years past. Forty-two states have report cards that, because they appear within the top three results of an internet search, are considered easy to access. Report cards are also easier to use. Many families now access school information using mobile devices, so it’s good news that thirty-one states have a mobile-friendly version of their report cards. Thirty-five states also offer downloadable data, which allow families, policymakers, and analysts alike to dig deeper into the numbers.
DQC notes that several states have made significant changes to their systems. The places that boasted the biggest improvements focused on design, included more and better data, and experimented with different processes, such as partnering with external vendors. In general, states use three basic design approaches: 1) the one-stop shop, which organizes report card data in a single resource; 2) the parent-facing front door, a landing page for moms and dads that links to a separate, more wide-ranging data site; and 3) the data hub, which typically takes the form of a dashboard and allows users to explore data in different ways. Many states have also started providing helpful definitions for various technical terms and data elements. When this is done well, the definitions are easy to understand and explain why the data matters.
Despite these bright spots, there is still plenty of room for growth. According to DQC, many states need to work harder to make report cards easy for all citizens to understand. Only fifteen translate information into a language other than English, for example, and text is often written at a postsecondary reading level, which DQC measured by using hemingwayapp.com. Many report cards also lack critical information about student performance. A whopping forty-two states do not include disaggregated achievement data for at least one federally required subgroup, and twenty-one still don’t disaggregate data based on gender—a requirement that’s been in place for twenty years.
Many states also lack important non-academic data on school report cards. Over half exclude discipline data, such as suspensions and expulsions. Twenty-seven forgo postsecondary enrollment numbers, though several states do report that data in places other than school report cards. And twenty-six states lack data on the number of inexperienced teachers, teachers with emergency or provisional credentials, or educators who are teaching outside their field of expertise.
A few states are singled out as making considerable progress since DQC’s last review. Mississippi, for instance, released a brand new design that is more comprehensive and easier to navigate than its previous iteration. Texas’s report cards offer parents a “show me how it works” feature that breaks down each indicator with simple illustrations and text explanations. And Pennsylvania includes data on the variety of pathways that students take after high school, including disaggregated military enlistment rates and the number of students who entered the state’s workforce.
DQC is right that parents and taxpayers need information about how subgroups are performing, and right that data should be communicated simply and clearly. Let’s hope this review helps nudge more states in that direction.
SOURCE: “Show Me the Data: States Have Seized the Opportunity to Build Better Report Cards, but the Work is Not Done,” Data Quality Campaign (April 2019).
On this week’s podcast, Jessica Sutter, a newly elected member of the DC State Board of Education, joins Mike Petrilli and David Griffith to discuss the politics of Washington’s ed reform scene. On the Research Minute, Amber Northern examines how Philadelphia school closures affect academic and behavioral outcomes.
Amber’s Research Minute
Matthew P. Steinberg and John M. MacDonald, “The Effects of Closing Urban Schools on Students’ Academic and Behavioral Outcomes: Evidence from Philadelphia,” Economics of Education Review (April 2019).