As 2015 comes to a close, the long-awaited reauthorization of the Elementary and Secondary Education Act will likely soon become a reality. Among many proposed changes is the jettisoning of the federal waiver requirement mandating teacher evaluations. Before critics rejoice and demand an immediate end to the Ohio Teacher Evaluation System (OTES), it would be wise to remember why evaluations were instituted in the first place: Several research studies indicate that while teacher quality isn’t the only factor affecting student achievement, it is a significant one. Ensuring that all students have a good teacher is a worthy and important goal; without a system that evaluates teachers and differentiates the effective from the ineffective, though, it is impossible to achieve. It’s also worth noting that many of the evaluation systems that existed prior to federal waivers—those that were solely observation-based—failed to get the job done. Teacher evaluations have come a long way.
That being said, Ohio’s system needs some serious work. Fortunately, fixing evaluation policies isn’t without precedent: In 2012, only 30 percent of Tennessee teachers felt that teacher evaluations were conducted fairly. In 2015, after the Tennessee Department of Education worked to refine the system, that number rose to 68 percent. Sixty-three percent of teachers believe the evaluation system has improved student learning, while 77 percent say they now understand how to use assessment data to improve their teaching. The Tennessee Department of Education reports that evaluation has made a significant positive impact on education outcomes: Since the launch of teacher evaluations, proficiency levels have grown at the elementary level in every subject area, and end-of-course exam performance has grown steadily since 2009–10 (except in English III). Additional data out of the state found that a functional evaluation system is one of the working conditions associated with the retention of highly effective teachers. There’s no reason Ohio can’t accomplish the same thing. New flexibility from the feds should inspire Buckeye policymakers not to ditch the evaluation system, but to refine it into something that’s fairer for teachers, less burdensome for principals, and better at differentiating effectiveness. We’ve written before about how these systems must change and about models that Ohio could adopt, but let’s take another look at two ideas that policymakers should consider when improving Ohio’s evaluation system.
Abandon SLOs and shared attribution
Ohio requires that all teacher evaluations include a student growth component based on test results. For teachers with a valid grade- and subject-specific assessment, that means value-added measures. Unfortunately, only 34 percent of Ohio teachers actually fall into this category.[1] The remaining 66 percent are evaluated based on locally developed measures like Student Learning Objectives (SLOs) and shared attribution—both of which are poor ways to measure teacher effectiveness. Research shows that implementing SLOs in a consistent and rigorous manner is extremely difficult. A recent report from NCTQ found that they fail to effectively differentiate teacher performance. Apart from their questionable effectiveness, SLOs are also time-intensive; they require teachers to set long-term academic growth targets, measure progress toward those targets, and submit data—all on top of their many other responsibilities. Even worse, SLOs aren’t just a time-suck for teachers: A January report on testing in Ohio indicated that SLO-related assessments account for as much as 26 percent of students’ total test-taking time in a single year.
Shared attribution, meanwhile, is the practice of evaluating teachers based on test scores from subjects other than those they teach. Despite what the name implies, it doesn’t actually ensure that teachers share accountability—it just makes core teachers with value-added data responsible for the evaluation scores of non-core teachers (in subjects like gym, art, and music) in addition to their own. Talk about unfair.
So if SLOs are a time-intensive burden, shared attribution is unfair, and both fail to effectively differentiate teachers, what can policymakers replace them with in order to ensure that multiple measures are still used? The answer is to move this group of teachers to a fully observational evaluation. But in so doing, policymakers must also insist that observational practices are rigorous, not just the pro forma reviews that have too often occurred in the past. (This would apply not only to the teachers currently in the SLO/shared attribution system, but also to the observational component for teachers in the value-added and vendor assessment systems.) Here are a few ideas to strengthen teacher observations:
Peer observations: Some of the best feedback I received as a teacher was from colleagues. Peers who work in the classroom, are familiar with the student population, and share a content/grade-level background are an untapped resource for effective evaluations. In fact, the Measures of Effective Teaching (MET) project found that administrators’ rankings of their own teachers were similar to those produced by peer observers. To ensure honest appraisals and protect teachers from feeling pressured to give their colleagues positive reviews, districts could utilize another untapped resource: video-recorded lessons. Teachers from across a district (or even across the state) could be given extra planning time on a pre-selected date to watch videos of their peers teaching, study lesson plans and student work, and submit anonymous feedback.
Student surveys: Some will argue that asking students to evaluate their teachers is a recipe for disaster, but it would be a serious oversight not to include feedback from those most affected by teachers—especially when research shows its benefits. An MET project brief found that student surveys are more likely than achievement gain measures or observations to demonstrate consistent results for teachers. In addition, the brief shows that student survey results are predictive of student achievement gains. It’s also worth noting that student surveys are already part of the alternative OTES framework.
Multiple observers: One of the biggest complaints about OTES is the burden it imposes on principals, who are typically responsible for conducting observations. Employing multiple observers should lessen that burden while also improving reliability. A policy brief from the MET project found that whenever a given number of observations was split among multiple observers, reliability was greater than what a single observer could achieve. For teachers who received two observations, the gain in reliability was more than twice as large when the second observation was conducted by a different administrator than the first.
Require outside observers
The second change Ohio policymakers should pursue is to require the use of outside observers. (Ohio law currently allows for, but does not require, evaluators other than the principal.) A report from Brookings indicates that observations conducted by outside observers are more valid than those conducted by school administrators. Using multiple observers is a commonsense approach: It reduces the subjectivity and bias that can come with a single evaluator, and it gives schools a chance to capitalize on the skills of content experts, instructional coaches, curriculum coordinators, hybrid teachers, and assistant principals. Making use of outside observers (and peer observations) means that decreasing the number of principal observations won’t decrease the amount of feedback teachers receive or the reliability of their evaluation scores. Even so, it’s still vitally important for principals, as the instructional leaders of their schools, to take part in the observation and evaluation process.
***
If the ESEA reauthorization is a success, Ohio should be ready to reboot its teacher evaluation system. Waivers that required all teachers—core and non-core alike—to have an objective measure of growth caused a lot of headaches, and ESEA changes give Ohio a chance to make its system better. Ditching SLOs and shared attribution and replacing them with measures like peer observations and student surveys is a good place to start. So is requiring outside observers, which should lighten the load on principals and make the observational system more rigorous and more reliable. Of course, there are other reforms that policymakers could also consider. Selecting different measures for experienced versus inexperienced teachers; making sure that announced observations are balanced with unannounced ones; allowing principals to have more authority in determining both the benefits of excellent evaluations and the consequences of poor ones; and weighting measures equally are all ideas worth pursuing. The bottom line is that complaints about teacher evaluations in Ohio can be addressed without throwing away the entire system.
[1] The 34 percent comprises teachers whose scores are based entirely on value-added measures (6 percent), teachers whose scores are based partially on value-added measures (14 percent), and teachers whose scores can be calculated using a vendor assessment (14 percent).