The Ohio Department of Education and Workforce (DEW) will soon release Ohio’s school report cards for the 2023–24 school year. These report cards, which will be available on the DEW website, assign an overall quality rating (from one to five stars) to each school and district based on up to five individually rated components: Achievement, Progress, Gap Closing, Graduation, and Early Literacy. The least understood of these components is Progress, which is based on statistical estimates of how much “value added” schools provide in terms of student test scores. This metric is meant to isolate schools’ contributions to student learning from the multitude of other factors that affect students’ performance on standardized tests, such as family and environmental influences.[1]
This post explains what the Progress indicator captures and how schools and communities might interpret Progress ratings.[2] Because of how these ratings are presented on school and district report cards, some will overreact to their schools’ ratings, while others won’t give the ratings the attention they deserve. The information below is intended to temper both of these reactions by helping stakeholders understand the value and limitations of Progress ratings.
What does the Progress rating capture?
The Progress indicator, according to school and district report cards, “looks closely at the growth all students are making based on their past performances.” Each school or district receives a rating from one to five stars, and next to this rating the report card provides a short description of what the rating means. These descriptions appear in Table 1, focusing on schools as opposed to districts for the purpose of illustration.
Table 1. Official descriptions of Progress ratings provided on school report cards
The report cards rightly feature concise definitions of each rating, but the five-star rating system (which the General Assembly codified in law) and the report-card descriptions (provided in Table 1) take some significant liberties in characterizing the underlying value-added estimates. In particular, they imply sharp distinctions in school quality when, in fact, there may be minimal statistical support for those distinctions. They are also vague in how they characterize achievement growth, which has led to significant misunderstandings. Table 2 restates the official rating descriptions in an attempt to clarify them and to better capture what the underlying value-added estimates convey about school quality.
Table 2. Alternative descriptions of Progress ratings on school report cards
One difference between the official school Progress rating descriptions (Table 1) and the modified rating descriptions (Table 2) is that the latter clarify that, in the context of value-added models, “growth expectations” are based on comparing test scores between students in a single year. Specifically, students who “met growth expectations” are those whose test scores at the end of the school year were about the same as those of other students with similar test scores in prior years.[3] In other words, growth expectations have nothing to do with whether students are reading or doing math at grade level, for example, or making one year’s worth of academic progress based on Ohio’s academic content standards. Indeed, if the average student has fallen behind in terms of grade-level content (as happened during the pandemic), then a school might obtain a three-star rating—and a signal that its students have “met growth expectations”—because its students are just as far behind as other students. Put simply, the Progress rating is a relative measure, not an absolute one.
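To make this comparison logic concrete, the stylized Python sketch below mimics the basic idea. It is emphatically not DEW’s actual value-added model (Ohio’s estimates come from a far more elaborate methodology; see note 3): the scores, the simple linear prediction, and the forty-student school are all invented for illustration.

```python
# A stylized sketch of the relative logic behind "growth expectations."
# All numbers are invented; this is not DEW's actual value-added model.
import numpy as np

rng = np.random.default_rng(0)

# Simulated prior-year and end-of-year scale scores for 1,000 students.
prior = rng.normal(700, 30, size=1000)
current = 0.9 * prior + 70 + rng.normal(0, 15, size=1000)

# "Expected" end-of-year score: a simple linear prediction from the prior
# score, so the benchmark is students with similar starting points.
slope, intercept = np.polyfit(prior, current, 1)
expected = slope * prior + intercept

# A student "meets growth expectations" if their actual score is about
# equal to the prediction; a school's value-added is (roughly) the
# average gap between its students' actual and expected scores. Note the
# benchmark is other students' scores, not grade-level standards: if
# every student is behind, the average gap can still be zero.
residuals = current - expected
school = residuals[:40]  # a hypothetical 40-student school
print(f"School value-added estimate: {school.mean():+.2f} scale-score points")
```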
A second difference is that the modified descriptions in Table 2 categorize schools receiving five stars similarly to those receiving four stars, and they categorize schools receiving two stars similarly to those receiving one star. That is because the method by which value-added estimates are translated to specific ratings does not in fact enable one to distinguish with statistical confidence between five-star and four-star schools or between two-star and one-star schools. Although the underlying estimated value-added effect sizes do indeed differ between five-star and four-star schools (and between two-star and one-star schools), those differences could just be statistical noise. We simply don’t know which is the case for a particular school or district due to how value-added estimates are translated to ratings.
A third difference is that the modified rating descriptions in Table 2 provide two possible interpretations of the three-star rating. Whereas Ohio report cards label a three-star rating as providing “evidence that the school met student growth expectations,” the description in Table 2 basically states that we don’t know with statistical confidence whether a school’s average student test score was above, at, or below the average for students with similar test scores in prior years.
The extent to which these last two points are an issue for a given school or district depends on the number of tested students used in the value-added calculation. That’s because, under the current system, schools and districts could receive different ratings due to differences in the statistical confidence associated with value-added estimates, as opposed to differences in actual performance. In general, the more test scores used to generate a value-added estimate, the more confident we are that a particular star rating is the correct one for that school or district.
More specifically, the informativeness of the ratings depends in part on the number of tested students in a school or district in a given year, as well as the number of years of annual value-added estimates that are averaged together to generate the Progress rating. Smaller schools and districts, and schools with a small number of tested grades, may get ratings of three stars simply because there are not enough data to determine what rating they should receive. Similarly, ratings that are based on just a single year of value-added estimates are likely to misclassify a lot of schools and districts. Indeed, research suggests that school ratings should be based on a multi-year average of school value-added estimates.[4]
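To see why sample size matters, consider the hedged simulation below. The index (estimate divided by standard error) and the plus-or-minus one and two cut points are illustrative assumptions about how estimates might map to stars, not DEW’s published methodology. The invented school has the same true value-added in every scenario, yet with few tested students it tends to land at a default three stars.

```python
# A hedged simulation of how sample size affects rating confidence.
# The t-like index and the +/-1 and +/-2 cut points are illustrative
# assumptions, not DEW's published methodology.
import numpy as np

rng = np.random.default_rng(1)

def stars(estimate, std_error):
    """Map a value-added estimate to a star rating via a t-like index."""
    index = estimate / std_error
    if index >= 2:
        return 5
    if index >= 1:
        return 4
    if index > -1:
        return 3
    if index > -2:
        return 2
    return 1

true_effect = 3.0  # hypothetical true value-added, in scale-score points
# n counts tested students; averaging multiple years of estimates acts
# much like pooling more student-years into the calculation.
for n in (30, 120, 500, 2000):
    resid = rng.normal(true_effect, 15, size=n)
    est = resid.mean()
    se = resid.std(ddof=1) / np.sqrt(n)
    print(f"n={n:5d}  estimate={est:5.2f}  SE={se:5.2f}  stars={stars(est, se)}")
```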
Thus, setting aside issues related to the pandemic that might lead to highly unstable Progress ratings from year to year,[5] the informativeness of the Progress rating and associated report-card descriptions can depend significantly on the school or district at hand, as well as whether the rating is based on a multi-year average of annual value-added estimates.
Some rules of thumb for interpreting the Progress component in the 2024 report cards
- Put more weight on the Progress rating when more tested students are included in the calculations.
One should put more weight on Progress ratings for larger schools with more tested grades than for smaller schools with fewer tested grades. Similarly, one should put more weight on ratings for districts than on those for individual schools. Perhaps most importantly, one should put more weight on the forthcoming 2024 Progress ratings (based on a three-year average of value-added estimates) than on Progress ratings from 2022 (based on the 2021–22 school year only) or 2023 (based on 2021–22 and 2022–23 value-added estimates). For example, if the school in question has few students and only two tested grades (e.g., a typical K–4 elementary school), then one should not read much into a Progress rating of three stars, nor into any rating (from one to five stars) that dates from 2022 or 2023. If, on the other hand, one is concerned about the effectiveness of a particular school district and the Progress rating is based on a three-year average (as will be the case in the forthcoming 2024 report card), then a three-star rating becomes meaningful.
- Focus on whether the Progress rating is above or below three stars.
One should not make strong distinctions between four and five stars or between one and two stars, even if a school or district has many tested grades and the Progress rating is based on three years of value-added estimates (as will be the case in the forthcoming 2024 report card). Going forward, one should take notice if a school or district moves between one/two stars and three stars, between three stars and four/five stars, and, especially, from one/two stars to four/five stars. Such movement likely captures real improvements or declines in student learning (relative to the average Ohio student). That said, for districts or large schools with many tested students, distinctions between four/five stars and one/two stars become meaningful if they are observed year after year (e.g., if a district receives five-star ratings for multiple years beginning with the 2024 report card).
- Consider the Progress rating alongside achievement-based ratings.
Remember that if value-added estimates are based on a sufficient amount of data, schools and districts that “met growth expectations” are those whose student test scores were about the same as those of other Ohio students with similar prior test scores. Thus, if students are behind to begin with, they may stay behind even if they continue to meet “growth expectations.” The easiest way to determine whether a school’s students are behind is to consult the report card’s Achievement rating, which provides insight into how students perform against the state’s grade-level academic standards.
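As a back-of-the-envelope illustration (all numbers invented), the sketch below follows a student who starts forty points behind the state average and “meets growth expectations” every year: the gap never closes, and only an absolute measure like the Achievement rating would reveal it.

```python
# A back-of-the-envelope illustration, with invented numbers, of why a
# school can "meet growth expectations" while its students stay behind:
# matching the typical statewide gain leaves the initial gap untouched.
typical_gain = 30             # hypothetical statewide gain per year, in points
state_avg, student = 700, 660  # starting scale scores; gap of 40 points

for year in (1, 2, 3):
    state_avg += typical_gain
    student += typical_gain   # "met growth expectations" each year
    print(f"Year {year}: state average {state_avg}, "
          f"student {student}, gap {state_avg - student}")
```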
That said, do not ignore the Progress rating!
Value-added statistical techniques, such as those used to generate the Progress rating, are currently the most valid way to isolate schools’ contributions to student learning, holding constant the numerous other factors that affect student achievement—including poverty, family support, and innate ability. Value-added estimates also capture student learning equally for all students, regardless of the knowledge and skills they had prior to attending a school. In contrast to absolute measures of student achievement, value-added measures level the playing field and incentivize schools to help all students learn.[6] Thus, Ohio’s value-added Progress indicator is the best measure available to assess school quality, provided that we are aware of its limitations and consider it along with other measures of student educational outcomes.
[1] A discussion of the extent to which value-added estimates successfully isolate schools’ contributions to student learning, and how much weight to assign to test-based metrics overall, is beyond the scope of this post. For now, it is fair to say that the “value added” approach to measuring academic progress is likely the best means available to isolate the true impact of Ohio schools—at least when it comes to student achievement in the tested grades and subjects. It is also fair to say that such achievement measures are predictive of students’ future income and other measures of wellbeing (e.g., lower rates of criminality and teenage pregnancy).
[2] I hope to describe in a future post how schools can access and interpret the underlying “effect sizes,” which are available on the DEW website and can provide more concrete information on how effectively schools are educating their students.
[3] As in the Ohio report cards, the descriptions in Table 2 are also stylized and take some liberties—particularly when stating that comparisons are made to other students with similar past scores. It would be more accurate to say that value-added models indicate whether a student’s test scores are higher or lower than those of the average Ohio student after controlling for students’ prior test scores, grade level, and test taken. Put differently, value-added techniques employ statistical procedures meant to compare test scores across all Ohio students as if all students had similar past test scores, were in the same grade, and took the same tests. Thus, in reality, the “expected growth” benchmark for comparison is the average Ohio student, as opposed to the average Ohio student with similar past test scores. The shorthand in Table 2 is meant to convey that the statistical modeling attempts to compare students as if they were academically similar prior to the school year.
[4] The linked study suggests a three-year average for performance indicators that weight value-added heavily. Ohio’s 2024 ratings will be the first post-pandemic Progress ratings to be based on a three-year average of school value-added estimates. They will put more weight on the 2023–24 value-added estimates than on prior years’ estimates, however, so they may still be less stable than a three-year average that weights each year’s value-added equally.
[5] For example, schools where students fell furthest behind during the pandemic might have made more progress in 2021–22 or 2022–23 simply because they had more ground to make up—not because they are better at educating their students. More generally, schools with very high (or low) scores one year are likely to have lower (or higher) scores the following year. In statistical terms, this general phenomenon is known as “regression to the mean.”
[6] Ohio’s Achievement indicator considers multiple performance thresholds and, thus, better captures student achievement across the student ability distribution than a simple proficiency rate, for example. However, an influx of higher-achieving students could make a school seem like it is improving when, in reality, its students may not be learning more than before. The Progress rating takes account of students’ past test performance and, thus, largely mitigates this problem.