Should Ohio cut any of its tests?
After much criticism, state superintendent Paolo DeMaria decided to delay Ohio’s submission of its ESSA plan until September. One of the chief complaints was that the plan did not propose any cutbacks on the number of state assessments students take, and a committee is now forming to examine whether any could be culled.
The committee will find that most state assessments must be given to comply with federal law. ESSA, like No Child Left Behind before it, requires annual exams in grades 3-8 in math and English language arts (ELA); a science exam once in grades 3-5 and once in grades 6-9; and one high school math, ELA, and science exam. This leaves just seven of twenty-four state exams on the table for discussion: four social studies assessments, two high school end-of-course exams, and the fall third-grade ELA exam. Ohio students spend less than 2 percent of their time in school taking these state tests.
While eliminating any of these assessments would slightly reduce time on testing, doing so also comes at a steep price. Let’s take a closer look.
Social Studies Exams
Ohio currently administers exams in grades 4 and 6 social studies and end-of-course assessments in US history and US government. The Buckeye State has a relatively long history of exams in social studies (previously called “citizenship”). Ohio’s old ninth-grade citizenship tests go back to 1990, and tests in grades four and six were added in the mid-1990s. In 2009, the state suspended social studies testing in grades four and six due to budget cuts, but the tests resumed in 2014-15. The state uses the results from social studies exams in its school accountability system.
One of the central missions of education is to mold young people into the knowledgeable citizens needed for informed participation in democratic life. Over time, though, American schools have crowded out social studies. Based on studies of instructional time, Harvard’s Martin West writes, “Test-based accountability can result in a narrowing of the curriculum to focus on tested subjects at the expense of those for which schools are not held accountable.” Abandoning Ohio’s social studies tests could encourage further narrowing. Meanwhile, as several analysts (including Fordham’s Robert Pondiscio) have argued, in today’s raucous political environment, students need solid civics instruction now more than ever.
Of course, testing alone can’t cure all that ails social studies instruction. For instance, a mere 18 percent of American eighth-graders reached proficiency on NAEP’s 2014 US history exam; in civics, the proficiency rate was just 23 percent. But keeping social studies among Ohio’s assessments counterbalances the incentive for schools to concentrate on ELA and math at its expense. It would also signal a clear commitment that social studies, American history, and US government are an integral part of students’ education.
End-of-Course Exams (EOCs)
Ohio recently implemented two sets of EOCs in math and ELA at the high school level and could, under federal law, drop one in each subject.[1] The exams are Algebra I and Geometry (or Integrated Math I and II) along with ELA I and II. Starting with the class of 2018, the EOCs replace the Ohio Graduation Tests (OGTs) as the exams taken in high school. The OGTs were widely considered to be low-level exams assessing eighth-grade content. The EOCs raise the bar for students, as they test content from their current high school courses—not stuff they were supposed to have learned years ago. EOC implementation is also part of Ohio’s move towards college and career ready standards, including test alignment to the state’s new learning standards in high school math and ELA. Like Ohio, many states—also shifting to higher standards themselves—have decided to move towards EOCs in high school.
A commitment to higher expectations and more challenging high school assessments is needed in Ohio. Post-secondary data show that too many Buckeye students are not prepared for college. For example, roughly one-third of Ohio’s college freshmen need remedial English or math. Too few young people make it to college completion: Based on ODE’s post-secondary statistics, just one in three members of the class of 2009 obtained an associate degree or higher within six years of post-secondary matriculation. Employers have repeatedly indicated that many young people are not ready for the demands of today’s workplaces. Testing twice in high school in math and ELA should keep high schools—and their students—focused on the goal of readiness for college or career.
Dropping a set of EOCs could also place at risk an important advancement in Ohio’s accountability system. With EOC implementation, the state recently began to calculate value added (or growth) for high schools. This has been a step forward, as high schools had previously been judged on the basis of graduation rates and simple test scores—poor measures due to their close link with demographics and prior achievement. While it may be possible to calculate value added based on just one set of EOCs, a second assessment yields results in which we can have more confidence. It increases the sample size, which in turn allows for more precise statistical estimates of student growth. Additionally, since a second EOC covers a larger number of students attending a particular high school, the results better portray overall school performance. The results from just one grade may not reflect the performance of a school with four grade levels.
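To make the precision point concrete, here is a simplified sketch of why testing more students tightens a school-level growth estimate. It is not an actual value-added model (those are far more elaborate), and the spread and student counts below are invented purely for illustration.

```python
# Simplified illustration: the uncertainty (standard error) of a school-level
# average shrinks as the number of tested students grows. Real value-added
# models are much more sophisticated; these numbers are made up.
import math

student_sd = 1.0  # assumed spread (standard deviation) of individual growth scores

for n_students in (100, 200, 400):
    standard_error = student_sd / math.sqrt(n_students)
    print(f"{n_students} tested students -> standard error ~ {standard_error:.3f}")

# Doubling the number of tested students (e.g., by keeping a second EOC)
# cuts the uncertainty of the school's estimated growth by roughly 30 percent.
```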
Third Grade ELA (Fall Administration)
This is the first of two ELA state tests that third-graders take, the other being the regular spring exam. Having the early results may benefit school leaders and educators, especially with the Third Grade Reading Guarantee’s retention provisions in effect. For instance, they may want to know which students most need immediate attention before students take the spring exams.
***
Calls to ditch state exams are sure to be loud as the superintendent’s committee starts its work. Its members—and the legislature, which would ultimately make decisions about state testing—should think carefully about the consequences of abandoning any of these exams.
E-schools, a.k.a. virtual charter schools, have been so thoroughly mired in controversy that they’ve become radioactive in most education discussions. Or in most discussions, period. The current dispute in Ohio is largely technical and centers on the extent to which e-schools provide learning opportunities to students rather than merely offering them. This is much more than semantics; how to track attendance and student log-ins for funding purposes is at the heart of a year-long lawsuit against the Ohio Department of Education (ODE) by one of the state’s largest and most politically influential e-schools. Hundreds of millions of public dollars are at stake.
There have also been broad concerns about e-schools’ lagging performance in Ohio as well as nationally. Last year, a trio of education groups, including long-time charter advocacy organizations, began to share their concerns more publicly, offering policy recommendations to base funding on performance and consider creating enrollment criteria for students. These bold suggestions were embraced shortly thereafter by Ohio’s Auditor of State, Dave Yost, who recently ordered a statewide examination of how online charters collect learning and log-in data.
So it’s no surprise that Senator Joe Schiavoni, a long-time advocate for charter accountability, is back at it with another bill. His latest proposal (SB 39) would add new rules for Ohio’s virtual schools, which collectively serve 38,000 students—well, some of them. E-schools overseen by school districts would get a pass. If Schiavoni hopes to be a serious champion for quality, he needs to drop the legislation’s double standards, carve-outs, and special exemptions, which are precisely what Ohio’s latest charter reforms were meant to eliminate. Now is not the time for new loopholes or favoritism.
Quick glance at SB 39
Let’s set aside whether the bill’s main provision—requiring e-schools to keep a record of the number of hours students spend engaged in learning opportunities—is necessary. (In our view, Ohio’s charter reform law, HB 2, put adequate provisions in place to allow ODE to require log-in and attendance data.) What stands out most is the bill’s brazen partiality for district-sponsored e-schools and its attempt to hold them to a lower standard despite the fact that they presumably face the same issues with attendance tracking.
Take a look at some of SB 39’s provisions below, which would only apply to e-schools not sponsored by school districts. (This is not meant to be a comprehensive analysis of SB 39.)
The nine e-schools sponsored by school districts (out of 23 total in the state) would be unaffected by these changes. This raises all sorts of questions. If Schiavoni believes detailed log-in records are necessary to adequately gauge learning time in a virtual setting, why wouldn’t that requirement also apply to district e-school students? If a student misses 12 school days, wouldn’t all parents of e-students appreciate notification? If transparency in governing board meetings (which are already public) is so important, why not require live streaming for all e-schools? Why not all publicly funded schools, for that matter? Why should the state change its funding calculation, but only for some e-schools? If test scores from e-students shouldn’t mar school districts in the event that students transfer back, is it fair (or even legal) to count scores in some instances but not others? What would happen if one of Ohio’s large e-schools switched to a district sponsor; would these provisions no longer apply to it?
Wanted: consistency
Although most aspects of SB 39 are unnecessary or excessive, the bill’s intention to improve e-school accountability is reasonable given the sector’s performance history and the amount of funding at stake. Still, the carve-outs for district-affiliated schools have no place in an accountability bill. SB 39 also shines a light on hypocrisy among some members of the General Assembly, who may be motivated by antipathy for Ohio’s big virtual charter networks, allegiance to traditional public schools, or both. Plain Dealer reporter Patrick O’Donnell covered this inconsistency in December with a particularly pointed headline: “A few online schools want special treatment to avoid paying money back to state.” Several district-sponsored e-schools serving at-risk students—faced with the threat of having to pay back public funds—made “emotional pleas” to the state legislature. They won the hearts of some Democrats—several of whom are typically unabashed in their calls for more accountability for e-schools and charters broadly.
Perhaps those same lawmakers don’t realize that 40 percent of Ohio’s e-schools are sponsored by school districts. Many are low-performing and post similar (sometimes lower) scores compared to other e-schools. If SB 39 is sound policy, it should be applied across the board. On the flip side, if SB 39 would create undue compliance burdens, jeopardize the education of hard-to-serve youth, or impose unreasonable standards, Schiavoni and his fellow Democrats are entitled to be concerned. The problem lies in caring about only some schools and students, while thousands of others would be disparately impacted for the sole reason that they opted out of the traditional public school system.
It’s that time of year when many of us are searching desperately for a local Girl Scout troop in order to buy some cookies. (Helpful hint: It’s super easy to find a cookie booth near you.) But the Girl Scouts aren’t just the bearers of thin mint goodness—the organization also has a research arm, which recently published The State of Girls 2017, an examination of national and state-level trends related to the health and well-being of American girls.
The report analyzes several indicators including demographic shifts, economic health, physical and emotional health, education, and participation in extracurricular/out-of-school activities. Data were pulled from a variety of national and governmental sources, including the U.S. Census Bureau and the U.S. Centers for Disease Control and Prevention. Trends were analyzed from 2007 through 2016.
American girls are growing more racially and ethnically diverse along with the rest of the country’s population. The report notes that the percentage of white school-age girls (ages five to seventeen) decreased from 57 percent in 2007 to 51 percent in 2016. Meanwhile, the percentage of Hispanic/Latina girls increased from 20 to 25 percent, while the percentage of Black girls decreased from 15 to 14 percent. Approximately 26 percent of all school-age girls are first- or second-generation immigrants, up from 23 percent in 2007. Thirty-four percent of girls live in single-parent homes, and 41 percent live in low-income families. Both percentages are slightly higher than they were in 2007.
For girls’ physical and emotional health, there’s both good news and bad. Most risky behaviors—such as smoking cigarettes and drinking alcohol—have declined. Fewer girls report being bullied, though there has been a slight increase in the number of girls who report being victims of cyberbullying. But there are worrisome data surrounding emotional health: In 2015, 23 percent of high school girls reported seriously considering suicide, compared to 19 percent in 2007. The rate was highest among ninth-graders (27 percent). In addition, approximately 13 percent of low socioeconomic-status girls reported being depressed, compared to 9 percent of more affluent girls. The report’s authors concluded that these data demonstrate the need for “better mental health assessments and interventions for youth in schools and communities.”
Speaking of school, the data related to high school completion and reading and math proficiency should already be familiar to those in the education world. The high school dropout rate has decreased for girls, but it’s significantly higher among low-income girls than among their higher-income peers—6 percent compared to 2 percent. Using NAEP as its basis, the report also notes that although reading and math proficiency has generally improved for girls, achievement gaps based on race and income persist.
Perhaps the most interesting aspect of this report is the data on extracurricular and out-of-school activities. It’s a widely accepted fact that enrichment and extracurricular opportunities matter. Unfortunately, consistent school athletic participation is significantly lower for low-income girls: 17 percent participated regularly, compared to 31 percent of higher-income girls. And it’s not just sports, either. Low-income girls also have lower levels of extracurricular participation in areas like community affairs or volunteer work and student council/government.
These statistics on America’s girls serve as a solid reminder that schools and nonprofit groups have a big role to play in ensuring that all young women have the opportunity to succeed.
SOURCE: “The State of Girls 2017: Emerging Truths and Troubling Trends,” The Girl Scout Research Institute (2017).
When the Ohio Teacher Evaluation System (OTES) went into effect in 2011, it was the culmination of a process that began back in 2009 with House Bill 1. This bill was a key part of Ohio’s efforts to win the second round of Race to the Top funding, which, among other things, required states to explain how they would improve teacher effectiveness.
Beyond bringing home the bacon, Ohio’s evaluation system aimed to accomplish two goals: First, to identify low-performing teachers for accountability purposes, and second, to help teachers improve their practice. Unfortunately, as we hurtle toward the end of the fourth year of OTES implementation, it’s become painfully clear that the current system hasn’t achieved either goal.
To be fair, there have been some extenuating circumstances that have crippled the system. Thanks to its ever-changing assessments, Ohio has been in safe harbor since the 2014-15 school year, which means that the legislature prohibited test scores from being used to calculate teacher evaluation ratings. As a result, the full OTES framework hasn’t been used as intended since its first year of implementation in 2013-14. But even back then, OTES didn’t offer much evidence of differentiation—approximately 90 percent of Ohio teachers were rated either accomplished or skilled (the two highest ratings) during the first year, and only 1 percent were deemed ineffective.
Despite the fact that most teachers earn the same ratings, their experience with the system can vary wildly depending on the grade and subject taught. To understand why, it’s important to understand how the current system works: In Ohio, districts choose between two teacher evaluation frameworks. The original framework assigns teachers a summative rating based on teacher performance (classroom observations) and student academic growth (student growth measures), with both components weighted equally at 50 percent. The alternative framework also assigns a summative rating based on teacher performance and student academic growth, but changes the weighting and adds a third component: 50 percent teacher performance, 35 percent student growth, and 15 percent alternative components, such as student surveys.

Under both frameworks, there are three ways to measure student growth: value added data (based on state tests and used for math and reading teachers in grades 4-8), approved vendor assessments (used for grade levels and subjects for which value added cannot be used), and local measures (reserved for subjects that are not measured by traditional assessments, such as art or music). Local measures include shared attribution, which evaluates non-core teachers based on test scores from the core subjects of reading and math, and Student Learning Objectives (SLOs), which are long-term academic growth targets set by teachers and measured by teacher-chosen formative and summative assessments.
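To make the difference between the two weighting schemes concrete, here is a minimal sketch. The function name, the assumption that every component is scored on a common 1-4 scale, and the simple weighted average are all illustrative assumptions; they are not ODE’s actual rules for converting component ratings into a summative rating.

```python
# Illustrative sketch only: assumes each component is scored on a common
# 1-4 scale and combined by a simple weighted average, which may differ
# from ODE's actual conversion rules.

def summative_score(performance, growth, alternative=None, framework="original"):
    """Combine OTES-style component scores (1-4 scale) under the two frameworks."""
    if framework == "original":
        # Original framework: 50% teacher performance, 50% student growth
        return 0.5 * performance + 0.5 * growth
    if framework == "alternative":
        # Alternative framework: 50% performance, 35% growth,
        # 15% alternative component (e.g., student surveys)
        if alternative is None:
            raise ValueError("alternative framework requires a third component score")
        return 0.5 * performance + 0.35 * growth + 0.15 * alternative
    raise ValueError(f"unknown framework: {framework}")

# Example: a teacher with strong observations (3.5) but weak measured growth (2.0)
print(summative_score(3.5, 2.0))                              # 2.75 (original)
print(summative_score(3.5, 2.0, alternative=3.0,
                      framework="alternative"))               # 2.90 (alternative)
```

As the example suggests, the same teacher can land in a different place depending solely on which framework a district has chosen, which is part of why experiences with the system vary so widely.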
Results from these frameworks have left many teachers feeling that the system—and the student growth component in particular—is unfair. They’re not wrong. As our colleague Aaron Churchill wrote back in 2015, Ohio teachers whose student growth was evaluated using value added measures[1] were less likely to earn a top rating than teachers using other methods. A 2015 report from the Ohio Educational Research Center (OERC) found that 31 percent of Ohio teachers used shared attribution to determine their student growth rating—meaning nearly a third of teachers’ ratings were dependent on another teacher’s performance rather than their own. SLOs, meanwhile, are extremely difficult to implement consistently and rigorously, they often fail to effectively differentiate teacher performance, and they’re a time-suck: A 2015 report on testing in Ohio found that SLOs contribute as much as 26 percent of total student test-taking time in a single year. In essence, OTES doesn’t just fail to differentiate teacher performance—it fails to evaluate teachers fairly, period.
As far as professional development goes, the results probably haven’t been much better. A quick glance at the ODE template for a professional growth plan, which is used by all teachers except those who are rated ineffective or have below-average student growth, offers a clue as to why practice may not be improving: It’s a one-page, fill-in-the-blank sheet. Furthermore, the performance evaluation rubric by which teachers’ observation ratings are determined doesn’t clearly differentiate between performance levels, offer examples of what each level looks like in practice, or outline possible sources of evidence for each indicator. In fact, in terms of providing teachers with actionable feedback, Ohio’s rubric looks downright insufficient compared to other frameworks like Charlotte Danielson’s Framework for Teaching.
In short, OTES has been unfair and unsuccessful in fulfilling both of its intended purposes. Luckily, there’s a light at the end of the tunnel: ESSA has removed federal requirements for states related to teacher evaluations. This makes the time ripe for Ohio to improve its teacher evaluation system. We believe that the best way to do this is to transform OTES into a system with one specific purpose—to give quality feedback to teachers to help them improve their craft.
A series of new recommendations from Ohio’s Educator Standards Board (ESB) contains some promising proposals that could accomplish this, including a recommendation to end Ohio’s various frameworks and weighting percentages by embedding student growth measures directly into a revised observational rubric.[2] Ohio teachers would then have their summative rating calculated based only on a revised observation rubric rather than a combination of classroom observations and student growth components. Specifically, ESB recommends that five of OTES’ ten rubric domains incorporate student growth and achievement as evidence of a teacher mastering that domain. These domains include knowledge of students, differentiation, assessment of student learning, assessment data, and professional responsibility.
Not only would teachers be required to “use available high-quality data[3] illustrating student growth and achievement as evidence for specific indicators in the OTES rubric,” they would also be required to use these data “reflectively in instructional planning and in other applicable areas of the revised OTES rubric.” This will go a long way toward convincing teachers that assessment data can help improve their practice rather than just unfairly “punish” them. Most importantly, though, it reflects a solid understanding of how good teachers use assessments and data already.
The only problem with this idea is that ESB recommends including value added measures based on state tests as part of the new system. State tests were neither designed nor intended to measure teacher effectiveness. So rather than carrying these assessments into a revised system, we propose that the role of state tests in teacher evaluations cease completely. Removing state tests from consideration and letting districts select formative and summative assessments with real classroom purposes is a far better way to fulfill the ESB’s call to “promote the use of meaningful data by teachers and districts that reflects local needs and contexts.”
As with many policy proposals, there are some implementation issues that could undermine the potential of this recommendation. The revision of the rubric—and how assessments are incorporated into it—will be hugely important. If the use of student achievement and growth becomes just one of many evidence boxes to check off rather than a deciding factor both for performance ratings and for which professional development opportunities to explore, then the revised rubric won’t yield honest ratings or lead to effective professional development.
To be clear, this suggestion isn’t an attempt to roll back teacher accountability. Rather, it’s an acknowledgement that Ohio’s current system—before and during safe harbor—doesn’t actually hold anyone accountable. Ohio doesn’t have a statewide law that permits the dismissal of teachers based solely on teacher evaluation ratings, so even for the small number of teachers who are identified as ineffective, there aren’t meaningful consequences. Moreover, the testing framework built specifically for OTES has created its own bureaucracy and helped feed the anti-testing backlash.
Data show that well-designed evaluation systems based solely on rigorous observations can improve the quality of the teacher workforce. By transforming OTES into a system that focuses on teacher development, we don’t just get improvement for teachers and better learning experiences for kids; we could also end up effectively differentiating teachers without high-stakes testing. What’s not to like about that?
[1] According to our previous calculations, approximately 34 percent of Ohio teachers are evaluated based on value added.
[2] A separate ESB recommendation not explored in this piece advises that the OTES rubric be updated in collaboration with a national expert in rubric design and the assessment of teaching. This revision process is likely how student growth measures would be embedded into the rubric.
[3] The ESB notes that “ODE will establish high-quality criteria which all growth and achievement data must meet.”
A recent report from Education Northwest extends previous research by the same lead researcher, drilling down into the same dataset in order to fine-tune the original findings. That earlier study (June 2016) intended to test whether incoming University of Alaska freshmen were incorrectly placed in remedial courses when they were actually able to complete credit-bearing courses. It found that high school GPA was a stronger predictor of success in credit-bearing college courses in English language arts and math than college admissions test scores. The follow-up study deepens this examination by breaking down the results for students from urban versus rural high schools, and for students who delay entry into college.
In general, the latest study’s findings were the same. Except for the students who delayed college entry, GPA was generally found to be a better predictor of success in college coursework than were standardized test scores. It stands to reason that admissions test scores would better represent the current abilities of students who delayed entry into college (call it the final “summer slide” of one’s high school career), and indeed the previous study showed that students who delayed entry were several times more likely to be placed into developmental courses than were students who entered college directly after high school graduation. But does this mean that colleges err when they use such test scores to place incoming students? The Education Northwest researchers believe so, arguing that colleges should use high school GPAs in combination with test scores, with the former weighted more heavily since GPAs can more effectively measure the non-cognitive skills that the researchers deem most relevant to college success.
But it is worth noting that both of their studies are limited by a few factors: First, there are only about 128,000 K–12 students in all of Alaska, and its largest city, Anchorage, is about the same size as Cincinnati. A larger, more diverse sample (Baltimore, New York, Atlanta, or even Education Northwest’s hometown of Portland, Oregon) could yield different results. Second, there is no indication that the University of Alaska students were admitted or placed solely on the basis of admissions test scores. Sure, they’re important, but not every school puts Ivy League emphasis on test scores to weed out applicants. Third, the “college success” measured here is only a student’s first credit-bearing class in ELA and math. That seems like a limited definition of success for many students; depending on one’s major, math 102 is harder than math 101. Fourth, “success” in these studies merely means passing the class, not getting an A. If a student’s high school GPA of 2.5 was better at predicting his final grade in the college class (a D) than was his SAT score (in the 50th percentile), only Education Northwest’s statisticians should be happy about that. A more interesting and useful analysis would look at the difference in success rates between students with high versus low GPAs, students with high versus low test scores, or students who earned As versus Ds in their college courses.
Previous studies have shown a correlation between high GPAs and high ACT scores. There’s lots of talk that test scores are (but shouldn’t be) the most important factor in college admissions decisions, and the “who needs testing?” backlash at the K–12 level appears to have reached upward to colleges. This study is not the silver bullet that’s going to slay the admissions testing beast, but it does suggest that more care must be taken at the college level to avoid incorrect and money-wasting developmental placements. It is to be hoped that at least part of the answer is already in development at the high school level (high standards, quality curricula, well-aligned tests, remediation/mastery) and that colleges will be able to jump aboard and calibrate their admissions criteria to maximize performance, persistence, and ultimately degree attainment.
SOURCE: Michelle Hodara and Karyn Lewis, “How well does high school grade point average predict college performance by student urbanicity and timing of college entry?” Institute of Education Sciences, U.S. Department of Education (February 2017).