Flypaper

Would capturing student growth in grades K–2 lead to different school ratings?

Amber M. Northern, Ph.D.
12.22.2022

In the wake of dismal NAEP reading scores released earlier this year, calls for stronger education policies have grown louder. Test scores from third grade (which is likely already too late for intervention) are the earliest indication of whether students are on track or falling behind. But do third-grade test scores serve as a true indication of the quality of students’ early education? Because federally mandated standardized testing begins in grade three, school accountability ratings don’t account for student progress in grades K–2. The question is whether that fact unfairly penalizes schools that are making commendable gains with youngsters in those early grades. A recent study from Mathematica’s Walter Herring investigates.

The study leverages NWEA MAP Growth test scores from ten states from the 2013–14 through 2018–19 school years, covering millions of test events across that period. Still, it is not a nationally representative sample, in part because education leaders choose to opt in and partner with NWEA for testing. Within those limits, Herring has access to spring MAP test scores and school-level demographic data and is able to calculate achievement and growth scores separately for each school. NWEA uses student growth percentiles, or SGPs, which assess the growth that a student exhibits in a given year relative to his or her peers with a similar test history. Although SGPs do not adjust for differences in student characteristics beyond prior achievement, they are commonly used in state accountability systems.

Herring calculates each school’s average test scores in grades K–2 on MAP and its average test scores in grades 3–5 on standardized assessments. He evenly weights achievement and growth to produce a combined score. To compare schools’ combined scores in grades 3–5 with the scores they would have received if the reporting included K–2 results, he calculates achievement and growth scores based on MAP Growth results in grades K–5, then ranks schools based on their combined scores across two different grade bands (3–5 and K–5) within each state and year. Finally, he assesses the degree to which school rankings changed by measuring the proportion of schools that moved to a different quintile depending on which grade levels were included in the achievement and growth distributions. He also looks at whether membership in the bottom 5 percent of achievement ratings changed; these are the schools in which (per ESSA) many states intervene in order to turn them around.
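For readers who want to see the mechanics, the comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Herring’s actual code: the function names and the sample numbers are hypothetical, the sketch standardizes achievement and growth before averaging them (the summary above says only that the two are evenly weighted), and it handles a single state and year at a time.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw values to z-scores (assumed step; not stated in the source)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def combined_scores(achievement, growth):
    """Evenly weight standardized achievement and growth for each school."""
    za, zg = standardize(achievement), standardize(growth)
    return [0.5 * a + 0.5 * g for a, g in zip(za, zg)]


def quintiles(scores):
    """Assign each school a quintile from 1 (lowest) to 5 (highest).

    Ties are broken by position in the list, a simplification.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, i in enumerate(order):
        result[i] = rank * 5 // len(scores) + 1
    return result


def share_changing_quintile(q_a, q_b):
    """Proportion of schools whose quintile differs between two rankings."""
    return sum(a != b for a, b in zip(q_a, q_b)) / len(q_a)


if __name__ == "__main__":
    # Hypothetical data for ten schools in one state and year.
    ach_35 = [210, 198, 205, 190, 215, 202, 195, 208, 200, 193]
    gro_35 = [52, 45, 60, 40, 55, 48, 50, 47, 58, 42]
    ach_k5 = [208, 196, 204, 189, 214, 203, 192, 207, 199, 194]
    gro_k5 = [50, 48, 53, 38, 57, 54, 41, 46, 59, 49]

    q_35 = quintiles(combined_scores(ach_35, gro_35))
    q_k5 = quintiles(combined_scores(ach_k5, gro_k5))
    print("Quintiles, grades 3-5:", q_35)
    print("Quintiles, grades K-5:", q_k5)
    print("Share changing quintile:", share_changing_quintile(q_35, q_k5))
```

In a multi-state, multi-year dataset, the same functions would be applied separately to each state-year group so that schools are ranked only against their in-state peers from the same year.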

Results show that achievement scores in grades 3–5 were highly correlated with achievement scores in grades K–2. Not surprisingly, schools serving more low-income students tend to have lower average test scores, including in the early grades: Specifically, a 10-percentage-point increase in the proportion of students receiving free and reduced-price lunch is associated with a decrease of one-tenth of a standard deviation in average test scores in grades K–2. By contrast, schools’ growth scores in the upper elementary grades tended to be very different from their scores in the lower elementary grades, revealing a much weaker relationship between SGP measures across grade levels. But that’s not good news for high-poverty schools either, as schools with more low-income students had lower growth scores in the untested early grades after controlling for their growth scores in the tested grades.

In terms of how rankings might change, Herring compares the combined achievement/growth measure in grades 3–5 against schools’ rankings after incorporating scores for all students in grades K–5. Results show that 42 percent of schools change quintiles after accounting for test scores in K–2, with 5 percent moving multiple quintiles. And 38 percent of schools that fall in the bottom 5 percent based on results in grades 3–5 no longer appear in that lowest level when grades K–2 are accounted for. Schools that dropped to a lower quintile served larger proportions of low-income students and Black children; likewise, schools that fell into the bottom 5 percent after including early elementary scores served more Black children.
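The bottom-5-percent comparison can be illustrated the same way. The sketch below is a simplified, hypothetical version of that check, not the study’s implementation: the function names are invented for illustration, the cutoff is rounded up to a whole number of schools, and ties are broken by list position.

```python
import math


def bottom_five_percent(scores):
    """Return the indices of schools in the lowest 5 percent of scores.

    Assumption: the count is rounded up, so at least one school is flagged.
    """
    n_flagged = max(1, math.ceil(len(scores) / 20))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return set(order[:n_flagged])


def share_exiting(flagged_before, flagged_after):
    """Share of initially flagged schools that are no longer flagged."""
    if not flagged_before:
        return 0.0
    return len(flagged_before - flagged_after) / len(flagged_before)


if __name__ == "__main__":
    # Hypothetical combined scores for 40 schools under each grade band.
    scores_35 = list(range(40))
    scores_k5 = list(range(40))
    scores_k5[1] = 100  # one school looks far stronger once K-2 is included

    flagged_35 = bottom_five_percent(scores_35)
    flagged_k5 = bottom_five_percent(scores_k5)
    print("Flagged on grades 3-5:", sorted(flagged_35))
    print("Flagged on grades K-5:", sorted(flagged_k5))
    print("Share exiting the bottom 5 percent:",
          share_exiting(flagged_35, flagged_k5))
```

A share of 0.38 from a calculation like this would correspond to the 38 percent figure reported above, though the study’s exact rounding and tie-breaking rules are not described here.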

So yes, including K–2 results could make a difference in school ratings. But the more unfortunate takeaway is that, because most high-poverty schools see slower growth in grades K–2 than other schools do, including these early grades in state accountability systems would tend to exacerbate the ratings gap between rich and poor schools. Is this an argument for more testing in the early grades? Or for putting the very best teachers in the lowest grade levels along with solid curricula? Or for identifying low-performing schools early and providing them with strong supports for their youngest learners? How about trying all of the above?

SOURCE: Walter Herring, “The Other Half of the Story: Does Excluding the Early Grades from School Ratings Matter?” Annenberg Institute at Brown University (August 2022).


Amber Northern is senior vice president for research at the Thomas B. Fordham Institute, where she supervises the Institute’s robust research portfolio and…
