How to fix teacher evaluations
Ohio can learn an important lesson on teacher evaluation without descending into the same fight going on in New York
It may not be obvious at first blush, but the political fight happening in New York right now over teacher evaluations has implications for Ohio. Governor Cuomo has proposed increasing the weight of a student’s test scores to 50 percent of a teacher’s evaluation, made possible by a proposed decrease in the weight of a principal’s observations. Ohio Governor John Kasich hasn’t proposed any significant changes to teacher evaluations this year, but consider this: both Ohio and New York do a poor job of objectively evaluating teachers who don’t have grade- and subject-specific assessments, both states allow the unfair option of shared attribution, and stakeholders in each are questioning whether teacher evaluations give rise to extra hours of assessments that aren’t meaningful for students. This leads to a big question: Is there a way to fix these problems?
Enter Educators 4 Excellence (E4E) and its alternative teacher evaluation framework. E4E is an organization composed of former and current teachers. Its mission is to magnify teacher voices in policy and legislative arenas where educator views are often overlooked, despite the fact that ensuing decisions significantly impact the day-to-day lives of teachers. E4E supports teacher evaluations that are “fair and rigorous,” and the organization published a paper in 2013 that made suggestions for how to improve the evaluation system in New York. In light of Governor Cuomo’s proposed reforms, it has recommended an alternative framework, which just so happens to solve most of the issues with Ohio’s system. The framework looks like this:
[Figure: E4E’s alternative teacher evaluation framework]
The framework includes two pathways. The first is for teachers who have a grade- and subject-specific state assessment, such as a fifth-grade math teacher or an eighth-grade English language arts teacher. In Ohio, this pathway could also include teachers who do not have a state assessment but do have, under current law, options from the Ohio Department of Education’s list of approved vendor assessments. Since both state assessments and ODE’s approved vendor assessments are subject- and grade-specific, these teachers’ scores include a measure of student growth that is calculated via test scores.
The second pathway is for teachers without a valid grade- and subject-specific assessment. There isn’t a clear way to evaluate teachers in subjects like music, physical education, or art. Without state or other valid assessments that can be used to measure student growth, districts are left to decide between student learning objectives or shared attribution. Neither of these options is ideal, since student learning objectives add more hours of testing and more local administrative burden, and shared attribution holds teachers responsible for subjects that they don’t teach and test scores they can’t directly impact. E4E’s proposed pathway for these teachers replaces student learning objectives and shared attribution with an evaluation by a peer or independent evaluator, and also adds student surveys. (These methods are discussed in more detail below.)
While no evaluation system will ever be perfect, E4E’s framework offers a real chance at a fair, rigorous, and meaningful evaluation for all teachers, whether in New York or Ohio. Here’s a brief summary of some of the best aspects of the framework:
Equal weighting works
The pathway that includes student growth as measured by test scores weights that growth at only 35 percent of the teacher’s total score. Undoubtedly many will argue that this percentage is too low. However, a policy brief from the Measures of Effective Teaching (MET) project examining various ways to weight teacher evaluation systems found that, of the four options they considered, the most reliable model was one that gave equal weight to gains on state tests, student surveys, and observations. In other words, a model with multiple measures that attributes only 33 percent of an evaluation score to state tests will produce the most consistent results for the same teachers from year to year. Furthermore, an equally weighted model is better at predicting gains on tests of higher-order thinking skills than a model that heavily emphasizes state test scores.
To be fair, E4E’s framework is not perfectly equally weighted. And the MET project’s equally weighted model includes student surveys, which one pathway of E4E’s framework does not include. That being said, there’s no reason that Ohio can’t alter the framework to fit its needs.
The importance of a second evaluator
One promising aspect of E4E’s framework is that it requires a second evaluator. While the percentage of the overall score attributed to a second evaluator is dependent on which pathway is used, both paths require that teachers be evaluated by either a peer or independent evaluator. This has some important implications: It lessens the burden on principals by sharing the evaluation workload; it decreases the chance for subjectivity and bias that is present with only one evaluator; it gives highly effective teachers a leadership opportunity; and it strengthens peer and mentoring relationships within a school building. The previously mentioned policy brief from the MET project argues that adding another observer increases evaluation reliability significantly more than having the same observer score an additional observation. Some Ohio cities already use peer assistance and review, but districts could choose to exclusively use independent evaluators instead. In fact, Ohio Revised Code already allows for observations to be conducted by someone other than a principal, as long as that person meets certain criteria and has successfully passed ODE-sponsored training. It wouldn’t be a stretch for Ohio to move toward including a second evaluator, whether that’s a trained and highly effective peer or an independent evaluator (perhaps an instructional coach, a curriculum director, or an assistant principal).
Empowering principals
Another great thing about E4E’s framework is that it adds an extra set of eyes (and therefore reliability) to the evaluation process without diminishing the authority of the principal. In both pathways of the framework, a principal’s observations account for 45 percent of the teacher’s score. This is key because it allows for context and growth: Who better to determine teachers’ strengths and areas for growth—and consequently whether or not they should be considered effective—than the instructional leader who sees them every day and is responsible for the overall achievement of the building’s student population? As one teacher pointed out in an E4E press conference, “Principals are the instructional leaders of their schools. To diminish their role in the evaluation process diminishes their ability to manage their school.” Indeed, if we expect principals to lead their schools effectively, we must allow them to evaluate and assign ratings to the teachers that they manage—to do anything less is to silence their voices, to question their authority, and to remove an aspect of local control that is vital to understanding the entire picture of a teacher’s performance.
Student surveys matter
In the United States, students have very little voice in regard to their educational experience. Despite the fact that policies are targeted at improving student achievement and experience, policymakers rarely seek out the thoughts and ideas of the very population they intend to help. E4E’s teacher evaluation framework offers a way to change this by making student surveys worth 20 percent of the evaluation score for teachers who do not have a valid subject- or grade-specific assessment. Some would argue that student surveys are a poor substitute for student growth as measured by assessment. An additional policy brief from the MET project, however, shows that student surveys actually produce more consistent results than classroom observations or achievement gain measures. In fact, student survey results are predictive of student achievement gains. If survey results are predictive of achievement gains, it makes sense to use them as a substitute where student achievement measures are unavailable. Student surveys also offer unique, targeted feedback for teachers with regard to strengths and weaknesses, feedback that is arguably more powerful than test scores or a principal’s constructive criticism, since it comes directly from the individuals most affected by the teacher’s effectiveness. While student surveys certainly shouldn’t make up a majority of a teacher’s evaluation score, they are an important measure. One could even argue that student surveys should be part of both framework pathways, not just the pathway of teachers without a valid and specific assessment.
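To make the weighting arithmetic concrete, here is a minimal sketch of how a composite score might be computed under each pathway. The 45 percent (principal), 35 percent (student growth), and 20 percent (student surveys) weights come from the discussion above. The second evaluator's share is an assumption: it is simply whatever remains to reach 100 percent in each pathway. The 0–100 component scale, the pathway labels, and the function name are likewise hypothetical, chosen only for illustration.

```python
# Weights stated in the text: principal observations count for 45 percent in
# both pathways, student growth for 35 percent (teachers with a valid
# assessment), and student surveys for 20 percent (teachers without one).
# ASSUMPTION: the second (peer or independent) evaluator gets the remainder.
PATHWAY_WEIGHTS = {
    "tested": {
        "principal": 0.45,
        "student_growth": 0.35,
        "second_evaluator": 0.20,  # assumed remainder
    },
    "untested": {
        "principal": 0.45,
        "student_surveys": 0.20,
        "second_evaluator": 0.35,  # assumed remainder
    },
}


def composite_score(pathway, component_scores):
    """Return the weighted sum of component scores (hypothetical 0-100 scale)."""
    weights = PATHWAY_WEIGHTS[pathway]
    if set(component_scores) != set(weights):
        raise ValueError(f"expected components {sorted(weights)}")
    return sum(weights[name] * component_scores[name] for name in weights)


# Illustrative inputs only; these are not real evaluation data.
tested_teacher = {"principal": 80, "student_growth": 60, "second_evaluator": 70}
untested_teacher = {"principal": 80, "student_surveys": 90, "second_evaluator": 70}

print(round(composite_score("tested", tested_teacher), 1))
print(round(composite_score("untested", untested_teacher), 1))
```

Keeping the weights in one table makes it easy to try alternatives, such as the equal-thirds model from the MET brief, without touching the scoring logic.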
***
The good news is that Ohio already allows districts to use certain aspects of E4E’s model, such as student surveys or peer evaluations. The bad news is that these are only options for districts that choose to use the alternative teacher evaluation framework—and districts can’t choose to use both options in significant percentages. If states like Ohio and New York want to improve their teacher evaluation systems, E4E’s framework offers an excellent example (created by teachers!) of how it can be done.
Rick Hess opens his book, The Same Thing Over and Over, by asking readers to imagine the following scenario:
How would you respond if asked for a plan to transform America’s schools into a world-class, twenty-first century system?
Then imagine that there is one condition: you must retain the job descriptions, governance arrangements, management practices, compensation strategies, licensure requirements, and calendar of the existing system.
Hopefully, you would flee just as fast as you possibly could.
Red tape stifles innovation, dynamism, and entrepreneurship in public schooling, while creating a culture of risk aversion and defensiveness. These latter two are hardly the features of nimble organizations that can adapt to a changing world; rather, they are the marks of decaying institutions.
Here in Ohio, state leaders are taking note. On several occasions, both Governor John Kasich and Senate President Keith Faber have expressed their desire to “deregulate” public education. That is great news. Yet the task of deregulation is not a simple one. It requires carefully distinguishing the areas where the state has a valid regulatory role from those where it should defer to local, on-the-ground decision making.
The regulatory framework that we at Fordham have advocated is “tight-loose.” In a state policy context, this implies that the state, vis-à-vis districts, should be tight on districts’ results but loose on how they achieve them. In other words, Ohio policymakers should set rigorous academic goals for schools, assess whether they are meeting them, hold them accountable for results—and then back off in virtually all other realms.
In this piece, I analyze the deregulation proposals currently contained in House Bill 64 (HB 64), the governor’s budget bill, and Senate Bill 3 (SB 3), a high-priority bill recently passed by the upper house. The proposals are promising and could provide a starting point for more significant deregulation in the days ahead. Both HB 64 and SB 3 include provisions that would free “high-performing” districts from certain mandates without undermining accountability.
Comparison of HB 64 and SB 3
The bills are mostly similar in the exemptions that they provide, but they diverge on the criteria for identifying “high-performing” districts. For more details about the state report card measures referenced in the table below, see here or here. The table below compares the two proposals.
[Table: Comparison of HB 64 and SB 3]
The Pros and Cons
The big step forward is the permission to hire non-licensed teachers, which is contained in both bills. Teacher licensure is a classic “barrier-to-entry” regulation that shrinks the pool of potential teachers that schools may hire. With a smaller pool of candidates, a school’s ability to hire top-flight teachers, especially in harder-to-staff areas like special education, science, and math, could be impaired. Think of it this way: If talent acquisition is paramount in improving the performance of public schools, why should we bind schools’ hands when it comes to hiring? Especially if some of the licensure requirements, as a precondition for hiring, don’t relate to effectiveness?
The other provisions are praiseworthy too: For too long, regional Education Service Centers (ESCs) have been guaranteed state funding regardless of the demand for their services. The rigid class-size requirements infringe upon school leaders’ flexibility in classroom organization. For instance, one could imagine a high school class being taught in a 150-student lecture hall. Or as we discovered in our Right-sizing the Classroom report, principals could assign more students to highly effective teachers, while assigning fewer students to developing ones, and potentially lift overall achievement.
The downside of these provisions is their narrowness in scope. First, they signal that the state only entrusts a few high-performing districts—and, it must be noted, largely high-wealth districts—to have greater managerial discretion. (The governor’s proposal is especially narrow.) Why not allow a broader set of districts to have the same management rights—particularly if we think greater flexibility can lead to higher performance? Second, the bills only deal with a small number of regulatory matters. Many more regulations should be on the table for debate.
The bottom line is this: These provisions are an excellent start to what should become a broader deregulation discussion.
Looking Ahead
The responsibility of the state is to ensure rigorous statewide standards, assessments, and accountability systems—as well as basic health and safety rules and a certain level of funding and accounting for those dollars. Arguably, that’s about it. That means that the state should consider the many other areas where it interferes with local decision making.
If state lawmakers can maintain strict accountability for schooling outcomes, major deregulatory efforts could potentially unwind any number of laws and regulations that go beyond the scope of the “high-performing” district exemptions in HB 64 and SB 3. They include rolling back state mandates in matters like minimum instructional time; curricular requirements; governance structures (e.g., prescriptions on the number of board members and term lengths); principal, superintendent, and treasurer licensure; public employee labor law, including collective bargaining; teacher evaluation; teacher tenure (i.e., “continuing contracts”); teacher salaries; retirement and health care benefits; school bus specifications; disciplinary policies and dress codes; business advisory councils; and so much more.
A Chinese proverb states, “A journey of a thousand miles begins with a single step.” The high-performing district provisions in HB 64 and SB 3 provide small steps—baby steps—forward. Ohio lawmakers should continue to examine state statute and regulation to uncover the areas deserving of “deregulation.”
In Ohio and across the nation, charters have struggled to obtain adequate, appropriate space in which to operate. As competitors, districts have been reluctant to allow charters to operate in buildings that they own, whether through co-location in an open district school or taking residence in a shuttered school. But according to the latest report from the National Charter School Resource Center (NCSRC), a few states and cities have been proactive in helping charters access district facilities. The report, using charter survey data across fourteen states from 2007 to 2014, reveals that charters in California and New York—New York City, in particular—were most likely to operate in district-owned space. In California, nearly half (45 percent) of charters operated in district facilities, while 31 percent of New York charters did so. In New York City, 62 percent of the city’s charters operated in a district facility, undoubtedly encouraged by the $1 rental fee that the district was permitted to charge charters (an innovation of former Mayor Michael Bloomberg’s). The study also reported some variation in the financial arrangements between districts and charters: Of the charters that operated in a district-owned facility, 46 percent of them reported paying no fee to the district, 41 percent reported paying the district an amount equivalent to the cost of operating the building (a median facility cost of $118,500), and 13 percent reported paying the district an amount above the cost of maintaining the building (a median cost of $540,068). Ohio is not included in the report, although mention is made of the special circumstances in Cleveland, where the first charter/district co-location arrangement in the state exists and has recently been renewed. 
To encourage more district/charter facility arrangements, the authors point to state policies such as requirements that districts publicly list unutilized and vacant space or provide charters with a “right of first refusal.” (That is, when selling a facility, districts must offer it to charter operators first at a price that is not higher than the market value of the property.) A state law to this effect has borne fruit in Columbus recently, yielding benefits both to growing charters and a space-rich, cash-poor district. City leaders can also help charters access district facilities, as was the case in New York City. As charters grow in Ohio, state and city-level policymakers will need to address the charter facility issue. To ensure cost effectiveness, they should insist that traditional districts make available their facilities to brick-and-mortar charters at a reasonable or zero cost. After all, the facilities should be considered public assets—paid for at taxpayer expense—and ought to be used for public education purposes, whether by a district or public charter school.
SOURCE: Jim Griffin, Leona Christy, and Jody Ernst, “Finding Space: Charter Schools in District-Owned Facilities,” National Charter School Resource Center (March 2015).
Here’s the top-line takeaway from the Center for Research on Education Outcomes’s (CREDO) comprehensive Urban Charter Schools Report, which is meant to measure the effectiveness of these schools of choice: For low-income urban families, charter schools are making a significant difference. Period.
CREDO looked at charter schools in forty-one urban areas between school years 2006–07 and 2011–12. Compared to traditional public schools in the same areas, charters collectively provide “significantly higher levels of annual growth in both math and reading”—the equivalent of forty days of additional learning per year in math and twenty-eight additional days in reading. As a group, urban charters have been particularly good for black, Hispanic, and English language learner (ELL) subpopulations. Indeed, putting the word “urban” before the phrase “charter school” is becoming somewhat redundant. As Sara Mead recently pointed out, urban students comprise only a quarter of students nationally, but more than half (56 percent) of those enrolled in charters. Thus, perhaps the most encouraging finding in the study is that the learning gains associated with urban charter schools seem to be accelerating. In the 2008–09 school year, CREDO found charter attendance producing an average of twenty-nine additional days of learning for students in math and twenty-four additional days of learning in reading. By 2011–12, it was fifty-eight additional days of math and forty-one of reading.
Not all that glitters is gold, of course. There’s no inherent magic to the word “charter” on the front door of a school. The relative success of urban charters in the aggregate makes all the more frustrating the failure of some charters in places like El Paso, Fort Worth, Las Vegas, and Phoenix, which not only fail to match the results of their district counterparts, but significantly underperform them. The new study underscores several challenges and suggests that the sector’s weaker performers form “sister city” relationships with stronger near-neighbors. Orlando and Fort Myers, for example, might want to emulate the work of Miami’s charter sector with ELL students, “who see the equivalent of 112 additional days of learning per year in math relative to their peers in TPS.” Another question to be asked—especially in places like San Francisco, Boston, Newark, Washington, D.C., and New York, where charter pupils seem to do particularly well compared to district schools—is the degree to which charters’ apparent success is a function of comparisons to weak traditional schools.
But if you are the low-income parent of a child in one of those places, such questions might not interest you very much. If your child is black, Hispanic, or an ELL in particular, here’s what you need to know: Charter schools are making a significant difference. Period.
SOURCE: “Urban Charter School Study Report on 41 Regions,” Center for Research on Education Outcomes (March 2015).
We recently looked at an analysis of New Orleans school leaders’ perceptions of competition and their responses to it. The top response was marketing: simply shouting louder to parents about a school’s existing programs, or adding bells and whistles. If schools are academically strong, this is probably fine. But if academically weak schools can pump up their enrollment (and their funding streams) by simply touting themselves to parents more effectively than competing schools, then the intended effect of competition, improved performance among all players in the market, will be blunted or absent altogether.
In New Orleans, it appears that the more intense competition is perceived to be, the more likely schools are to improve academic quality as a means of differentiation. Is a similar thing happening in the Buckeye State? Here’s a look at some anecdotal evidence on quality-centered competition effects.
New school models
Large urban school districts in Ohio have long decried the students “stolen” from them by charter schools, and nothing rankles diehard traditionalists like online schools. So it was a little surprising to find that Akron City Schools’ proposed 2015–16 budget contains a huge technology component, including plans to start an in-house online charter school. This is being done in collaboration with Reynoldsburg City Schools, a district that knows a thing or two about innovation for improvement. Akron is aiming to recruit three hundred elementary students who are currently either home-schooled or attending a charter school, as well as forty in-district high school students who have “fallen behind on graduation credits.” It doesn’t really matter that the motivation is likely financial. If that’s what it takes to shake loose the status quo and try to create something better for kids already looking for something else, then so be it.
Cleveland Metropolitan School District (CMSD) is far down the path of new school models, including a push to incorporate newcomers from outside the traditional district. Bard College High School, a highly anticipated recent arrival on the West Side, was recruited by CMSD. Menlo Park Academy, Ohio’s only charter school for gifted students, will soon receive local tax revenue from the district as part of a partnership of excellence; Clevelanders will benefit from the school’s impending relocation/expansion, also on the West Side. What’s next? How about a charter boarding school for at-risk youth?
Parental choice
Already far ahead of many Ohio districts in offering options to families in urban areas, CMSD is also making strides through the Transformation Alliance (TA) in centralizing information for parents. What’s more, the TA received a grant in September to develop a universal enrollment system. A January report using New Orleans parental choice data indicated that these two steps were key in driving parents to choose the highest-quality schools available to them.
In Cincinnati City Schools, every district high school is a school of choice. Not all those schools are good ones, but increased access to those that are is a step forward. Additionally, the 7–12 grade span of all Cincinnati high schools is unique among urban districts in Ohio and could help accelerate and smooth the transition from middle to high school.
Inter-district open enrollment, an overlooked avenue of parental choice, is at its widest reach ever, with 81.5 percent of all districts in the state opening their doors in some form to students from outside their borders. A recent study conducted by the Mahoning County Education Services Commission looked at the funding and student achievement effects of open enrollment on both sending and receiving districts in the county. Far more winners than losers emerged. This isn’t news to most parents utilizing open enrollment, but it’s likely a revelation to district administrators.
Right-sizing the district
Columbus City Schools, at its peak, enrolled over 110,000 students. That was in 1971. Just five years later, when the district’s newest high school opened, enrollment had fallen by nearly 14 percent. Today, enrollment stands around 51,000 students, stabilizing after decades of steady decline. Mayor Michael Coleman’s Education Commission urged “right-sizing” the district among its recommendations in 2013. That means not holding on to surplus school buildings in the hope that students and families will return. Instead, it means better serving those students who have chosen to stay even in the face of increasing high-quality options. It means that charter schools (hopefully the best of them) get much-needed facilities. If it also means lower maintenance expenses as well as a few extra million dollars in district coffers, that seems a change for the better all around.
***
Anecdotally, it seems that the response to competition (charters, vouchers, e-schools, open enrollment) in Ohio is not much different from that in New Orleans. Districts create schools to attract families, provide parents more public school choices and information on those choices, and right-size the district to best use its resources to educate the remaining students. Anything that education reformers in the Buckeye State can do to encourage more of the best kind of competition should be done.
In his proposed budget, Governor John Kasich calls for the creation of a competency-based education pilot program. Competency-based education is premised on the idea that students only move on to more complex concepts and skills after they master simpler ones. While that sounds somewhat negative at first blush, it also means that mastering current content quickly leads to advancing sooner than the standard march from grade to grade. Kasich’s proposal would provide grants to ten districts or schools that were selected through an application process created by the Ohio Department of Education to pilot the program.
The competency-based model goes by different names in different places. In Ohio, there are schools that already utilize it but call it something different: mastery grading. (Be sure to check out how schools like Metro Early College School and MC²STEM high school, as well as districts like Pickerington, make it work.) Mastery grading assesses students based on whether or not they’ve mastered specific skills and concepts. Instead of an overall grade that takes homework completion, daily assignments, class participation, and test grades that cover multiple standards into account to formulate an average, mastery grading breaks down a student’s performance on individual skills and concepts. Teachers or districts can determine a scale that works for their teaching styles and students. For example, a math teacher might use a scale in which 90 percent equals mastery, 70 percent equals developing mastery, and anything less than 70 percent indicates no mastery. An English teacher might utilize a rubric that looks like this:
[Figure: Sample English mastery-grading rubric]
Regardless of the system, mastery grading allows teachers to make it clear to parents and students precisely what the student has and hasn’t mastered. Instead of interpreting what Tyrone’s B in algebra means, Tyrone and his parents know that he understands polynomials at 97 percent mastery and two-variable equations at 90 percent mastery; but he has trouble with inequalities and the quadratic equation, where his mastery hovers at 65 percent. Similarly, Jasmine and her parents would know that, in her biology course, she’s mastered DNA and RNA and their processes at 92 percent mastery, evolution at 94 percent mastery, and cells at 96 percent mastery.
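As a rough illustration, the math teacher's scale mentioned earlier can be applied to Tyrone's results in a few lines. This is a sketch under stated assumptions: the text says only that 90 percent equals mastery, 70 percent equals developing mastery, and anything below 70 percent indicates no mastery, so treating 90 and 70 as lower cutoffs is an interpretation, and the function names are hypothetical.

```python
def mastery_level(percent):
    """Map a percentage to a mastery band.

    ASSUMPTION: 90 and 70 are treated as lower cutoffs of their bands.
    """
    if percent >= 90:
        return "mastery"
    if percent >= 70:
        return "developing mastery"
    return "no mastery"


def mastery_report(scores):
    """Break a student's results down skill by skill instead of averaging."""
    return {skill: mastery_level(pct) for skill, pct in scores.items()}


def needs_support(scores):
    """List the skills that have not yet reached full mastery."""
    return [skill for skill, pct in scores.items() if mastery_level(pct) != "mastery"]


# Tyrone's algebra results as described in the text.
tyrone = {
    "polynomials": 97,
    "two-variable equations": 90,
    "inequalities": 65,
    "quadratic equation": 65,
}

for skill, level in mastery_report(tyrone).items():
    print(f"{skill}: {level}")
print("Needs support:", needs_support(tyrone))
```

The point of the per-skill dictionary is that nothing gets averaged away: a single B would hide the two topics where Tyrone still needs help.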
The key to mastery grading is that if the student doesn’t master the concept or skill, they don’t move on to the next topic. Instead, the student receives additional instruction and remediation, practice, and support (like tutoring, group work, or blended learning models). For instance, let’s imagine that Tyrone needs additional help to master the quadratic equation. This extra help can take multiple forms: Tyrone could log in to Khan Academy. He could receive one-on-one tutoring from his teacher during or outside of class. Or he could work in a group of similarly struggling students to complete a project on the real-life applications of quadratic equations. There are dozens of support options, but the end result is the same. After receiving remediation for the material he hasn’t mastered, Tyrone retakes the assessment. If he achieves mastery, he moves on (say, to exponents and factoring). If he doesn’t achieve mastery, he receives more support.
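The assess, support, and retake cycle described above can be sketched as a simple loop. The 90 percent cutoff is borrowed from the example math scale and is an assumption here, as are the function name and the retake scores, which are invented purely for illustration.

```python
MASTERY_THRESHOLD = 90  # ASSUMPTION: cutoff taken from the example math scale


def work_through_topic(attempt_scores, threshold=MASTERY_THRESHOLD):
    """Walk through successive assessment attempts on one topic.

    Each attempt that falls short of the threshold triggers one more round
    of support (tutoring, group work, online practice) before a retake.
    Returns (mastered, rounds_of_support).
    """
    rounds_of_support = 0
    for score in attempt_scores:
        if score >= threshold:
            return True, rounds_of_support  # mastered: move on to the next topic
        rounds_of_support += 1  # not yet: more support, then retake
    return False, rounds_of_support  # still working toward mastery


# Tyrone starts at 65 percent on the quadratic equation (from the text);
# the two retake scores are hypothetical.
mastered, rounds = work_through_topic([65, 80, 92])
print(mastered, rounds)
```

Nothing in the loop depends on the calendar: the student advances when the score clears the bar, however many rounds that takes.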
Mastery grading has profound implications for three reasons. First, it teaches children about growth mindsets. Failing to master something the first time is not a harbinger of failure, but a checkpoint that signals a need for more hard work, time commitment, and help. Tyrone doesn’t fail algebra (or feel like a failure) because he hasn’t mastered inequalities. Instead, he knows he’s mastered some parts of algebra but needs more help with other parts. Thus, failure becomes a learning experience instead of a death knell; students receive another chance to master a concept outside of the traditional, weekly, or unit test that comes and goes but once. Second, by targeting support in specific problem areas for individual kids, teachers can circumvent the boredom that often plagues advanced students. Jasmine, for example, doesn’t have to waste her time being bored with evolution or DNA review. Instead, she moves on to ecology and genetics while the student sitting next to her receives additional help with DNA. Third—and most importantly for parents—it promises the confidence of knowing exactly what kids have and haven’t learned. No more last-minute surprises that kids have fallen drastically behind and no more glossing over struggles because the class is moving on to the next thing regardless of whether all students have mastered the content. It’s progress demystified.
Another unique aspect of mastery grading is that it frees students from the antiquated notion of seat time in favor of content. Teachers and parents know instinctively and professionally that all students are unique, one-size-fits-all does not work, and education must be personalized. This need for personalization, however, doesn’t seem to apply to class schedules or grade bands. Why must an advanced student sit through sixty minutes of geometry every day for an entire year if she mastered the concepts after a single semester? Why must struggling students feel like an extra few days on a difficult concept equates to failure and falling tragically behind, rather than an opportunity to work hard and overcome an obstacle on the way to mastery? Those who would envision high school classrooms with a lone baby genius or middle school classrooms with a struggling nineteen-year-old are missing out on the fact that students can be in one building while learning the content that (supposedly) belongs to another. In fact, that happens in Ohio schools already: there are seventh graders taking algebra in their middle school building, and there are juniors taking remedial English in their high school building. All mastery grading does is make content king on the road to an ultimate goal, a goal that remains the same for all students even as the path to get there varies.