Like everyone else, we education reformers would love to have a crystal ball. Yet, in practice, predicting the performance of schools, like almost every other form of prediction, is inherently challenging.
Still, it’s essential that we do our best, particularly when it comes to forecasting the performance of proposed charter schools. After all, despite the growing pile of research that suggests charter schools outperform traditional public schools on average, the U.S. has too many mediocre or downright bad charters and too few truly excellent ones.
That’s why the National Association of Charter School Authorizers (NACSA) has developed various resources that outline “best practices” for its members (and other authorizers) to use when reviewing the plans of would-be schools. But as sensible as these practices are, they are largely the product of accumulated wisdom and experience. And there are certain questions they cannot answer, such as whether some practices make a bigger difference than others, or where those reviewing charter school applications should focus their attention.
Thus, the need for empirical research on how authorizers evaluate proposed charter schools. Yet there has been strikingly little investigation of this vital subject, with the exception of a 2017 Fordham study conducted by Anna Nicotera and David Stuit, Three Signs That a Proposed Charter School Is at Risk of Failing, and NACSA’s subsequent expansion of that analysis. For example, to our knowledge, there is essentially no research on one of the most fundamental questions—namely, whether the applications that authorizers rate more highly tend to produce schools that perform more strongly.
One reason for that deficit is the challenge of bulky, non-comparable data. Actual charter applications are rather cumbersome, and their format varies from one authorizer to the next. But an even bigger challenge is sample size. Nationally, only a handful of entities have authorized enough schools to make a rigorous quantitative analysis possible.
Which brings us to North Carolina. Since the abolition of the statewide cap in 2011, the state’s sole authorizer, the State Board of Education, has presided over the creation of more than one hundred schools and the closure of more than twenty-five low-performing or struggling ones, making it the exclusive overseer of one of the largest charter portfolios in the land.
To make the most of the research opportunity presented by North Carolina’s track record, we partnered with Adam Kho, an assistant professor and rising star at the University of Southern California, who is well known for his work on school turnaround and charter schools, and who, like us, was interested in examining how authorizers might increase the likelihood that new schools get a strong start.
That partnership resulted in the Fordham Institute’s latest report, Do Authorizer Evaluations Predict the Success of New Charter Schools?, which Adam conducted with coauthors Shelby Leigh Smith (USC) and Douglas Lee Lauen (the University of North Carolina). With the help of the North Carolina Department of Public Instruction, the team constructed a unique dataset that includes the ratings external reviewers gave to specific portions of proposed schools’ written applications; the votes members of the state’s Charter School Advisory Board took after reviewing those applications (and interviewing the most promising candidates); and the outcomes of students in newly approved schools.
Because the data are limited to the period after North Carolina lifted its charter cap and before the pandemic struck, Adam and company were able to analyze the evaluations and votes that determined the fate of four cohorts of applications and then follow the approved schools for one to four years after they opened. That amounts to 179 applications, fifty-three approved applicants, and forty-three schools that actually managed to open their doors.
So, what did they find?
First, schools that more reviewers voted to approve were more likely to open their doors on time but no more likely to meet their enrollment targets. In other words, there is some evidence that reviewers were able to identify applicants that had their ducks in a row.
Second, schools that more reviewers voted to approve performed slightly better in math but not in reading. In other words, reviewers’ collective judgment also said something about how well a new school was likely to perform academically.
Third, ratings for specific application domains mostly weren’t predictive of new schools’ success, but the quality of a school’s education and financial plans did predict math performance. Importantly, these domain-specific ratings were based exclusively on evaluations of schools’ written applications (unlike reviewers’ final votes, which also reflected their interviews with applicants and whatever other information was at hand).
Finally, despite the predictive power of reviewers’ votes, simulations show that raising the bar for approval would have had little effect on the success rate of new schools. For example, reducing the share of applications that were approved from 30 percent to 15 percent wouldn’t have discernibly boosted approved schools’ reading or math performance, nor would increasing the number of “yes” votes required for approval.
(Unfortunately, we cannot assess the implications of lowering the bar for approval, since we can’t gauge the effectiveness of schools whose proposals were rejected. Had those schools been in the mix, it’s possible that both reviewers’ votes and their ratings of specific application domains would have been more predictive.)
So, what does all of that imply for authorizing in North Carolina, the seven other states with a single (statewide) authorizer, and the thirty-six states with other combinations of state and local authorizers?
Given the variety of approaches that states have taken to authorizing, as well as their geographic and demographic diversity, caution is warranted. But in our considered opinion, the findings suggest at least three takeaways.
First, authorizers should pay close attention to applicants’ education and financial plans. Per Finding 3, the quality of these plans significantly predicts the resulting schools’ math performance (unlike other elements of the application, such as the perceived quality of a school’s mission statement). Our sense is that’s no coincidence, as instructional prowess and budgetary competence are “must-haves” for a successful school.
Second, authorizers should incorporate multiple data sources and perspectives. Like a strong cover letter, a well-written charter school application is a sign that an applicant deserves serious consideration. But of course, the decision to approve should also reflect those intangibles—largely gleaned from face-to-face interviews—and the age-old adage that two heads are better than one.
Finally, authorizers must continue to hold approved schools accountable for their results. After all, we know that the quality of charter schools, like the quality of individual teachers, varies drastically once they are entrusted with the education of children. So if we can’t reliably weed out low performers before they are approved, the only surefire way to ensure that charters fulfill their mission is to intervene when their performance consistently disappoints (meaning, in this case, that chronically low-performing schools should be drastically overhauled or closed).
To be clear, the latter is not our preferred outcome. But so long as a minority of approved charters underperforms, we see no alternative.
Someday, perhaps, the guidance that empirical research provides to authorizers will make the process for approving new schools more certain and less dependent on human judgment—that crystal ball that so often fails us. Until then, we’ll just have to take it one application at a time.