Ohio’s student growth measure—value added—is under the microscope, which provides a good reason to take another look at its important role in school accountability and to see if there are ways it can be improved. On April 19, state Representatives Robert Cupp and Ryan Smith introduced House Bill 524, legislation that calls for a review of Ohio’s value-added measure. In their sponsor testimony, both lawmakers emphasized that their motivation is to gain a strong understanding of the measure before considering any potential revisions.
The House Education Committee has already heard testimony from the Ohio Department of Education and Battelle for Kids; it expects to hear from SAS, the analytics company that generates the value-added results, on May 17. In brief, value added is a statistical method that uses individual students’ test records to isolate a school’s impact on academic growth over time. Since 2007–08, Ohio has included value-added ratings on school report cards, though value-added data were reported in earlier years as well.
As state lawmakers consider the use of value added, they should bear in mind the advantages of the measure while also considering avenues for improvement. Let’s first review the critical features of the value-added measure. (The discussion in this article pertains mainly to school-level value added, not teacher-level—a subject for another day.)
Unlike student proficiency measures, value-added results don’t correlate with demographics. While education reform is focused on narrowing achievement gaps, those gaps remain stubbornly wide. Accountability systems based on proficiency alone unfairly punish disadvantaged schools even when they are helping students make progress over time. This is because academically challenged students are less likely to “pass” the state assessment, especially now that passing scores are more rigorous. But as a measure that zeroes in on growth, value added places schools on a more even playing field for accountability purposes. Consider Figure 1, which displays almost no relationship between value-added scores and economic disadvantage (for charts from previous years, see here and here). Schools with highly disadvantaged populations can and do perform well on this measure. It has the added benefit of highlighting how well schools in middle- and upper-income communities are meeting the needs of their students too.
Figure 1: Value-added index scores versus economic disadvantage – Ohio schools, 2014–15
[[{"fid":"116116","view_mode":"default","fields":{"format":"default"},"type":"media","link_text":null,"attributes":{"height":"679","width":"1157","style":"width: 500px; height: 293px;","class":"media-element file-default"}}]]
Value-added methods have been studied extensively by education researchers. They are not a passing fad or a methodological mystery. In fact, the pioneering value-added models date to the early 1970s, when Stanford economist Eric Hanushek and others began using testing data and statistical methods to measure a teacher’s contribution to learning. Since then, economists and education researchers have studied, advanced, and refined the methods; today, most states incorporate a value-added measure (or student growth percentiles) into their school accountability systems. There are, of course, limits to the use and application of these measures, and few would argue that value added alone is sufficient to evaluate either a school or a teacher. But value added provides the best available empirical evidence of a school’s impact on achievement. As Harvard researcher Thomas Kane writes, “Value-added estimates capture important information about the causal effects of teachers and schools.”
Value added focuses on the growth of individual students. Using student-level data—as value added does—is the proper way to measure a school’s contribution to growth over time. Value added takes into account a student’s prior achievement and compares her year-to-year growth to that of pupils with a similar achievement history. When prior achievement is accounted for and appropriate student-to-student comparisons are made, analysts can control for the influence of demographics while also ensuring that schools are evaluated against a consistent growth expectation—in layman’s terms, one standard year of learning. While helpful in the absence of individual student data, statistical analyses that use school-level data (like the California Similar Students Measure) are not as robust as measures like value added. Ohio policy makers should insist on the highest-quality measurements of school effectiveness and not settle for anything less.
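For readers who want to see the mechanics, here is a deliberately simplified sketch (in Python) of the value-added idea. It is not the EVAAS model that SAS runs for Ohio, which is considerably more sophisticated; it merely illustrates the core logic of predicting current scores from prior achievement and crediting schools whose students outperform that prediction. All scores and school labels below are made up.

```python
# A simplified, illustrative value-added calculation. This is NOT the EVAAS model
# SAS runs for Ohio; it only sketches the core idea: predict each student's current
# score from prior achievement, then credit schools whose students beat the prediction.
import numpy as np

def simple_value_added(prior, current, school_ids):
    """prior, current: arrays of student test scores; school_ids: each student's school."""
    # Statewide expectation: a linear fit of current scores on prior scores.
    slope, intercept = np.polyfit(prior, current, 1)
    # Each student's growth above or below that expectation.
    residuals = current - (slope * prior + intercept)
    # A school's crude value-added estimate is its students' average residual.
    return {str(s): round(float(residuals[school_ids == s].mean()), 2)
            for s in np.unique(school_ids)}

# Hypothetical scores for nine students in three schools
prior   = np.array([610, 640, 700, 590, 660, 720, 630, 680, 650])
current = np.array([630, 655, 720, 615, 665, 725, 640, 700, 660])
schools = np.array(["A", "A", "A", "B", "B", "B", "C", "C", "C"])
print(simple_value_added(prior, current, schools))
```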
* * *
Because value added plays such an important role in Ohio, lawmakers should also consider strengthening the state policies that govern its use. The following adjustments would provide a good start:
- Reset the cut points that determine A–F ratings. State law currently establishes the value-added scores that correspond to each A–F grade (i.e., the cut points for each rating). But these cut points yield far too many As and Fs: As Figure 2 displays, roughly 35 percent of Ohio schools receive an A rating, and about 35 percent receive an F. Identifying this many A-rated schools diminishes the rating’s value, particularly for deserving schools in the high A range (schools with scores well above 2.0, the minimum threshold for an A). Conversely, the present cut points overstate the number of F-rated schools. While there is some statistical logic behind the current cut points, the resulting A–F distribution makes little common sense. Lawmakers should reset the A–F cut points in a way that creates a more sensible distribution of ratings: An A ought to signal extraordinary performance, while an F ought to denote true reason for alarm. (A simple sketch of the cut-point mechanism appears after this list.)
Figure 2: The distribution of A–F value added ratings by district (left) and by school (right) – 2014–15
[[{"fid":"116117","view_mode":"default","fields":{"format":"default"},"type":"media","link_text":null,"attributes":{"height":"592","width":"1206","style":"width: 600px; height: 295px;","class":"media-element file-default"}}]]
- Ensure that three years of data are used for the value-added ratings, while potentially weighting results toward the current year. State law requires value-added ratings to be based on a three-year average. This is good policy: With multiple years of data, analysts have more test scores at their disposal, improving the precision of the value-added estimates. The three-year average is especially critical for calculations at a classroom or subgroup level, where one year of data might include a small number of test takers. (The estimate has more statistical noise when fewer students are included.) Furthermore, multi-year averages help to smooth year-to-year fluctuations—the averaging doesn’t overly reward or penalize a single good or bad year. One concern with multi-year averages is that prior-year results may not reflect the most current practice at a school. To ease this worry, policy makers could place greater weight on the current year’s value-added results relative to the two previous ones—a weighted three-year average. (A short illustration of such a weighted average appears after this list.)
- Make information about individual students’ academic growth—and what it predicts—accessible to their parents. Based on students’ growth on state exams, data analytics companies like SAS could develop statistical models that help predict individual students’ college admissions scores or their odds of being accepted into certain universities. In the hands of parents and students, this would be powerful data. It would give families insight, at a much earlier stage, into which college or career pathway their children are on. For some, this predictive evidence would provide an opportunity to make course corrections before it’s too late; for other youngsters, it might inspire them to set their sights on higher goals. Policy makers should make sure that Buckeye families receive all the information regarding their own children’s test records—and what it means for their future success.
- Ensure substantial weight on value added in the overall school grading system. According to ODE’s December 2015 presentation on the overall grading formula, the state is considering an approach that would place too much weight on measures that correlate with demographics. These include the Achievement, Graduation Rate, Gap Closing, K–3 Literacy, and Prepared for Success components, which together would constitute 80 percent of the overall weight; value added would represent just 20 percent. Under this weighting formula, most high-poverty schools would likely fall into the D or F category. Ohio lawmakers should make sure this doesn’t happen by ensuring that value added—a measure that is fairer to disadvantaged schools—is weighted more heavily, while de-emphasizing or eliminating components that correlate with demographics (Gap Closing is one prime candidate for elimination). A school’s demography should not be ratings destiny. (An arithmetic illustration of this weighting appears after this list.)
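To make the cut-point recommendation concrete, the sketch below (in Python) shows how index scores map to letter grades. The only threshold drawn from current law is the 2.0 minimum for an A; the other cut points are placeholders. The takeaway is simply that where the thresholds sit determines how many As and Fs get handed out.

```python
# Hypothetical A-F cut points for value-added index scores. Only the 2.0 minimum
# for an A reflects the current statute; the rest are placeholders for illustration.
def letter_grade(index_score, cuts=((2.0, "A"), (1.0, "B"), (-1.0, "C"), (-2.0, "D"))):
    for cut, grade in cuts:
        if index_score >= cut:
            return grade
    return "F"

# Raising the A threshold (or lowering the F threshold) would thin out the
# shares of schools landing in the top and bottom categories.
print([letter_grade(s) for s in (3.1, 2.0, 0.3, -1.5, -4.2)])  # ['A', 'A', 'C', 'D', 'F']
```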
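The weighted three-year average in the second recommendation is straightforward arithmetic. The 50/30/20 weights below are hypothetical, chosen only to show how extra weight on the current year lets recent improvement show up sooner while older results still count.

```python
# A weighted three-year average of value-added index scores.
# Weights are hypothetical (50% current year, 30% prior year, 20% two years ago).
def weighted_three_year_average(current, prior, two_years_ago, weights=(0.5, 0.3, 0.2)):
    w_cur, w_prior, w_old = weights
    return w_cur * current + w_prior * prior + w_old * two_years_ago

# A school that struggled two years ago but improved sharply this year:
print(weighted_three_year_average(2.5, 0.8, -0.4))  # 1.41, versus 0.97 for a simple average
```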
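Finally, a bit of arithmetic shows why the proposed 80/20 weighting matters. Lumping the five demographically correlated components into a single 80 percent block is a simplification of ODE’s December 2015 proposal, and the 0–4 scale is just a stand-in for F through A, but the pattern is the point: a high-poverty school earning an A on value added and a D on the proficiency-based components still lands near a D overall.

```python
# How the proposed weights play out, using a 0-4 (F-A) scale for each component.
# The 80/20 split reflects ODE's December 2015 proposal; lumping the five
# demographically correlated components together is a simplification.
weights = {"proficiency_correlated_components": 0.80, "value_added": 0.20}

def overall_score(component_scores):
    return sum(weights[c] * component_scores[c] for c in weights)

# A high-poverty school with top growth (A = 4.0) but low proficiency-based marks (D = 1.0):
print(overall_score({"proficiency_correlated_components": 1.0, "value_added": 4.0}))  # 1.6, roughly a D+
```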
In sum, value added must remain a critical component of school accountability, and policy makers shouldn’t backtrack on its use. Without a growth measure, we would rate schools almost solely on proficiency-based measures—a flawed method. Perhaps most important is what growth measures communicate to parents, educators, and the public: Every student matters and all kids can learn, no matter their starting point.