Regular Gadfly readers know that we usually rely on two metrics when analyzing school performance—Ohio’s performance index and its value-added measures. However, the state assigns A–F ratings along several other measures, including one called the “gap closing” component (a.k.a. annual measurable objectives, or AMOs). This measure deserves scrutiny now that state policy makers have the opportunity to retool the accountability system under the Every Student Succeeds Act of 2015 (ESSA).
First implemented in 2012–13 as a modification to the federal Adequate Yearly Progress (AYP) provisions, AMOs are meant to hold schools accountable for the proficiency of student subgroups (e.g., low-income students or racial/ethnic groups). Specifically, the measure compares the proficiency rates of a school’s subgroups to statewide proficiency targets—the “measurable objectives.” The AMO methodology also gives schools partial (up to full) credit when subgroup proficiency increases from year to year. The idea of AMOs is to maintain pressure on schools to close longstanding gaps between low-achieving subgroups and their peers.
Shortly after ESSA passed, the U.S. Department of Education notified states that they were freed from using AMOs in their accountability systems. That doesn’t mean that low-achieving students will be forgotten: The new federal law requires an “indicator of proficiency” broken out by subgroup and based on “ambitious, State-designed long-term goals.” But the law does not set forth specifications for the indicator’s design, nor does it precisely define “proficiency.” With new leeway on how to implement this measure, state policy makers should now consider what to do with AMOs—an important decision, since AMOs are currently a high-profile A–F graded component and slated to become a significant part of Ohio’s overall rating system.
Should Ohio continue with AMOs, modify them, or discontinue them altogether? If seeking a different path, what are some alternatives?
The case against AMOs
Because AMOs are relics of the bygone No Child Left Behind era, the first step state policy makers should take is to scrap them and start over. Let’s review three flaws of the AMO measure as presently implemented in Ohio.
First, they rely on proficiency rates—the percentage of students reaching an acceptable level of achievement as determined by the state. In high-stakes accountability, there are serious problems with relying too heavily on proficiency rates: For example, researchers in Chicago found evidence that proficiency-based accountability promoted a stronger focus on students near the proficiency threshold, at the expense of very low or very high achievers. Due to problems such as these, no entrant in Fordham’s ESSA Design Competition suggested using simple proficiency rates in school accountability; most recommended a performance index (a weighted measure of achievement) or scaled scores.
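To illustrate what a performance index captures that a simple proficiency rate does not, here is a minimal sketch in Python. The level names and weights are hypothetical placeholders chosen for illustration, not Ohio’s official PI values.

```python
# Sketch of a performance index (PI): a weighted average over achievement
# levels, so movement anywhere in the distribution affects the score.
# NOTE: level names and weights are illustrative, not Ohio's official values.
WEIGHTS = {
    "limited": 0.3,
    "basic": 0.6,
    "proficient": 1.0,
    "accelerated": 1.1,
    "advanced": 1.2,
}

def performance_index(level_counts: dict[str, int]) -> float:
    """Weighted share of students at each level, scaled to a 0-120 index."""
    total = sum(level_counts.values())
    points = sum(WEIGHTS[level] * n for level, n in level_counts.items())
    return 100 * points / total

# Moving a student from "limited" to "basic" raises the PI, but would be
# invisible to a proficiency rate, which only counts crossings of the bar.
print(performance_index({"limited": 10, "basic": 30, "proficient": 40,
                         "accelerated": 10, "advanced": 10}))  # 84.0
```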
Second, using year-to-year changes in proficiency, as AMOs do, is problematic. For example, suppose that 40 percent of low-income students at a school (call it Lincoln Elementary) were proficient in 2013–14, while 45 percent met that bar in 2014–15—a five-point increase. Should we understand this increase as legitimate improvement, even “gap closing,” or has the composition of the subgroup simply changed? At Lincoln Elementary, what if the increase in proficiency was due to an influx of high-achieving, low-income kids? Because a school’s subgroup composition can change, researchers have urged caution when interpreting such changes in proficiency. We don’t know whether one-year increases in subgroup proficiency (reported at the school level) are evidence of real improvement—or just a mirage.[1]
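To make the composition problem concrete, here is a minimal sketch using the hypothetical Lincoln Elementary numbers. No returning student improves at all, yet the subgroup’s proficiency rate climbs roughly five points.

```python
# Sketch: a subgroup's proficiency rate can rise purely through enrollment
# changes. All figures are hypothetical, mirroring the example above.
def proficiency_rate(scores: list[bool]) -> float:
    """Percentage of students scoring proficient (True)."""
    return 100 * sum(scores) / len(scores)

# 2013-14: 40 of 100 low-income students proficient.
year_one = [True] * 40 + [False] * 60
print(proficiency_rate(year_one))  # 40.0

# 2014-15: the same students with the same results, plus ten
# already-proficient low-income transfers.
year_two = year_one + [True] * 10
print(proficiency_rate(year_two))  # ~45.5
```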
Third, high-poverty schools—even when they earn exemplary scores on Ohio’s student growth measure (value added)—overwhelmingly receive failing grades on the AMO rating. Consider the following table, which displays the AMO ratings of high-poverty schools that earn value-added index scores within the top 10 percent of the entire state. Thirty-four of the thirty-eight schools on this list (89 percent) received F ratings on AMOs. If we believe—as many researchers do—that value added is a truer gauge of school effectiveness, why do so many high-performing, high-poverty schools receive Fs on AMOs? Are these schools “widening the achievement gap”? Sadly, even when they contribute positive learning gains (which, if accumulated over time, would narrow the achievement gap), they are deemed failures on AMOs.[2]
Table: AMO ratings of high-performing, high-poverty schools in Ohio, 2014–15.
[[{"fid":"115857","view_mode":"default","fields":{"format":"default"},"type":"media","link_text":null,"attributes":{"height":"728","width":"752","style":"width: 500px; height: 484px;","class":"media-element file-default"}}]]
Source: Author’s calculations based on data from the Ohio Department of Education. Notes: “High-poverty” schools have greater than 80 percent economically disadvantaged (ED) students. To be included as a “high-performing” school, a school’s value-added index score must be within the top 10 percent statewide (total number of rated schools in Ohio = 2,520). Charter schools are italicized. For reference, the statewide distribution (school-level) of A–F ratings for AMOs is as follows: A: 20%; B: 15%; C: 8%; D: 9%; F: 48%. The distribution of A–F ratings for high-poverty schools (greater than 80 percent ED) is as follows: A: 5%; B: 4%; C: 2%; D: 3%; F: 86%.
Ideas for improvement
In my view, AMOs are unsalvageable. Relying purely on a proficiency measure in a high-stakes accountability system narrows the focus to kids on the cusp of passing (and, as my colleague Robert Pondiscio suggests, incentivizes poor instruction, at least in reading); year-to-year changes in proficiency may be no more than an illusion of improvement; and something is amiss with the AMO methodology the state currently uses. In short, it’s not clear that AMOs are accomplishing the important function they were designed to serve—accurately telling us which schools are serving our most disadvantaged students well and which aren’t. To be clear, state policy makers are not entirely responsible for these problems: Recall that AMOs are part of the former federal accountability system. But it will take their action to revise state law and undo AMOs.
Subgroups still matter. Not only does ESSA still require some type of “indicator” that breaks out achievement results by subgroup, but it’s also critical that Ohio never go back to a time when the academic achievement of any subset of students could be ignored. What to do in a post-AMO world? Here are four thoughts.
- Implement a subgroup performance index. If allowed by federal officials (and it should be), Ohio ought to disaggregate subgroup achievement using the state’s performance index (PI). As a measure that takes into account students across the entire achievement distribution, the PI alleviates some of the problems associated with focusing strictly on the proficiency bar. In the ESSA Design Competition, Josh Boots of Empower K12 puts forward the notion of an “achievement gap index” that would essentially assign a PI score to each of a school’s subgroups and then compare that score to a statewide benchmark (see the sketch following this list). Ohio should pursue a measure such as this.
- Create reasonable statewide goals for subgroups. Due to their harsh design, AMOs have doomed virtually all high-poverty schools to failure—even those contributing well over a year’s worth of learning. It will be important for policy makers to set rigorous but attainable statewide goals for the achievement and continued improvement of disadvantaged subgroups—goals that deserving schools can meet and exceed.
- Don’t make subgroup achievement a standalone A–F component. Ohio’s report cards already have two achievement-based A–F ratings that typically reveal gaps between schools (the performance index and indicators met, with a third—a composite of the first two—soon to come). A separate graded component for subgroup accountability, such as AMOs, may unnecessarily pile on the Ds and Fs for schools serving disadvantaged populations. Policy makers should consider implementing ESSA’s “subgroup indicator” as a non-graded component, or placing the subgroup indicators within the state’s larger achievement component. The point is not to conceal subgroup achievement, but rather to ensure the report card system is fair to schools.
- Resist the urge to evaluate schools on “gap closing.” Attempting to evaluate whether a school closes gaps is fraught with peril. The state’s current measure of gap closing (AMOs) is imperfect, and its results are perplexing. But a wholly different concern is that we might be pitting one subgroup against another, especially if we use a higher-achieving group as a benchmark. (It goes without saying that one perverse way to narrow the gap—particularly at the school level—is to depress achievement at the top end.) The most prudent path for Ohio would be to avoid measures that explicitly aim to measure gap closing. Those measures invariably suggest that education is a zero-sum game, but we know that all students can and should achieve at higher levels.
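Returning to the subgroup performance index from the first bullet above, here is a minimal sketch of the basic mechanics: compute a PI for each subgroup and report each one’s distance from a statewide benchmark. The benchmark value and the subgroup data are hypothetical, and this is my illustration rather than Empower K12’s actual formula.

```python
# Sketch of a subgroup PI comparison. The benchmark and PI values below
# are placeholders, not official Ohio figures or Empower K12's formula.
STATEWIDE_BENCHMARK = 80.0  # hypothetical target PI

def subgroup_gaps(subgroup_pis: dict[str, float]) -> dict[str, float]:
    """Each subgroup's PI, expressed as a distance from the benchmark."""
    return {name: pi - STATEWIDE_BENCHMARK for name, pi in subgroup_pis.items()}

# Hypothetical school, with a PI computed separately for each subgroup
# (e.g., via the performance_index() sketch shown earlier).
for name, gap in subgroup_gaps({"all_students": 85.1,
                                "economically_disadvantaged": 74.2,
                                "students_with_disabilities": 68.9}).items():
    print(f"{name}: {gap:+.1f} PI points vs. benchmark")
```

A design like this keeps every subgroup visible while measuring each against a fixed statewide target rather than against a higher-achieving group, which sidesteps the zero-sum concern raised in the last bullet.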
Breaking down achievement (and growth data) by subgroup is an important dimension of a school evaluation system. Overall averages shouldn’t be used to hide the low performance of certain groups of students. Moreover, subgroup data can reveal a school’s strengths and weaknesses—a foundational element for school improvement. But accountability at a subgroup level must be done carefully and thoughtfully as well. Now that federal law puts Ohio firmly in charge of its own accountability system, policy makers should use this opportunity to refine policies in a way that makes even better sense for Buckeye schools and students.
[1] Conversely, we cannot be sure that decreases in proficiency aren’t the result of a change in student composition.
[2] While high-poverty schools often struggle to meet the statewide AMOs, they can earn “partial credit” on the measure. For a given subgroup within a school, the calculation is as follows: one-year improvement in proficiency / AMO gap. Schools with the widest AMO gaps therefore must post remarkable one-year proficiency gains to receive substantial credit. For example, a school with a twenty-five-point gap for low-income students (relative to the statewide AMO) would need to increase proficiency by 22.5 percentage points to earn an A (22.5 / 25 = 90%). If such a school made a reasonable fourteen-point improvement, it would receive an F (14 / 25 = 56%). The AMO proficiency targets, methodology, and cut points for the A–F ratings can be found here.
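For readers who want the arithmetic spelled out, here is a minimal sketch of the partial-credit calculation described above, using this footnote’s own worked examples. Capping credit at 100 percent is my assumption, based on the “up to full” credit language; the official cut points mapping percentages to letter grades are set by the state.

```python
# Sketch of the AMO partial-credit formula described in this footnote:
# credit = one-year proficiency improvement / AMO gap.
# The cap at 100 percent is an assumption, not verified against state rules.
def amo_partial_credit(improvement: float, amo_gap: float) -> float:
    """Percent of the AMO gap closed in one year, capped at 100."""
    return min(100.0, 100 * improvement / amo_gap)

# The footnote's worked examples: a 25-point gap needs a 22.5-point gain
# to reach 90 percent (an A); a 14-point gain yields only 56 percent (an F).
print(amo_partial_credit(22.5, 25))  # 90.0
print(amo_partial_credit(14.0, 25))  # 56.0
```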