When students take tests and score at the “basic” level, we tend to assume that—if the test is a good one—this means they’ve attained a relatively low level of skill, short of proficiency and far from mastery of the material.
But two recent studies have used clever methodologies to complicate this picture. They’ve found that, although assessments gauge students’ “cognitive skills” (their knowledge of the material and their ability to perform academic tasks), they also capture an important factor that mediates between a student’s skills and the eventual score: effort on the test.
Student effort is not equal in all contexts, meaning it contributes to differences in test scores for different countries (or states, districts, schools, and students). These studies have focused on the consequences of varying effort for making international comparisons, but the implications of their conclusions are wide-ranging, both for testing and for education policy more broadly.
The research
The studies’ methodologies are clever in how they separate the effects of cognitive and non-cognitive skills, like effort and persistence.
In one, researchers at the University of Arkansas used data from the PISA international assessment to estimate the extent to which scores are driven by differences in non-cognitive factors. To identify the role of testing effort, they took advantage of the fact that PISA’s test questions are randomly ordered and gauged students’ lack of effort with three measures:
- Performance decline: Students perform better on the first ten questions of the test than on the last ten questions, when they are relatively fatigued.
- Item non-response: Students simply don’t answer some questions.
- Careless answering: Students answer questions in contradictory ways on the PISA student survey.
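
To make these measures concrete, here is a rough sketch (in Python with pandas, using made-up column names like `position`, `answered`, and `correct`) of how the first two, performance decline and item non-response, might be computed from item-level response data. This is an illustration under simplifying assumptions, not the Arkansas team’s actual procedure.

```python
import pandas as pd

# Illustrative item-level data: one row per student per administered question,
# with hypothetical columns student_id, country, position (1 = first question),
# answered (True/False), and correct (True/False).
responses = pd.read_csv("pisa_item_responses.csv")

def performance_decline(df, window=10):
    """Accuracy on the first `window` questions minus accuracy on the last `window`."""
    first = df.loc[df["position"] <= window, "correct"].mean()
    last = df.loc[df["position"] > df["position"].max() - window, "correct"].mean()
    return first - last

def nonresponse_rate(df):
    """Share of administered questions left blank."""
    return 1.0 - df["answered"].mean()

# Aggregate both effort proxies by country.
effort_by_country = responses.groupby("country").apply(
    lambda g: pd.Series({
        "performance_decline": performance_decline(g),
        "nonresponse_rate": nonresponse_rate(g),
    })
)
print(effort_by_country)
```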
The results are fascinating. Performance decline and non-response explain between 19 and 41 percent of the difference among country scores on PISA, depending on which effort measure and PISA subject the researchers analyzed. This implies that some top countries earn their ranks because of students’ effort, not necessarily students’ mastery of the content.
But if some countries’ scores are driven down by their students’ lack of effort, what’s causing their indifference? One reason could be that certain students are less likely to care about a test that doesn’t affect them. They may think, “If performance on this makes no difference to my life, why even try?”
The Arkansas study doesn’t try to answer this, but another study, conducted by a team of researchers from the U.S. and China, found that students in different countries put forth very different levels of effort on low-stakes tests. The analysts set up a simple experiment comparing a sample of American students with a sample of Chinese students.
Each country’s sample was split into two groups: a treatment group that was offered cash incentives tied to test performance and a control group that was offered nothing.
Cash-motivated American students earned much higher marks than the Americans who didn’t have the incentive. But the promise of money made no difference to the Chinese students. They tried hard regardless.
The researchers estimated that if American students tried as hard as the Chinese students on the PISA math assessment, the U.S. would move from its current rank of thirty-sixth in the world to nineteenth.
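
For illustration, here is a minimal sketch (again in Python with pandas and made-up data and column names) of the kind of comparison the experiment relies on: the incentive’s effect is the gap in average scores between the incentivized and control groups within each country. This is a simplified illustration, not the researchers’ actual analysis.

```python
import pandas as pd

# Hypothetical student-level results: columns country, incentive (1 = offered
# cash for performance, 0 = control), and score.
df = pd.read_csv("incentive_experiment.csv")

# Incentive effect per country = mean score with the incentive
# minus mean score without it.
effects = (
    df.groupby(["country", "incentive"])["score"].mean()
      .unstack("incentive")
      .assign(incentive_effect=lambda m: m[1] - m[0])
)
print(effects["incentive_effect"])
# A sizable positive effect (as for the U.S. sample) suggests students hold back
# effort when nothing is at stake; an effect near zero (as for the Chinese sample)
# suggests they try hard regardless.
```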
Although these two studies employed different approaches, they both suggest that student effort on low-stakes tests varies across countries.
Wide-ranging implications
The most obvious upshot of this research is that the international academic rankings from assessments like PISA may be misleading. Students in top-ranked countries might not actually know much more than their lower-ranked peers; some of the gap may simply reflect greater effort on the test. Yet the research’s implications are far broader.
Effort matters. These studies only discuss differences in effort on the tests themselves, but such variation likely exists in other academic contexts as well. A European research team found that effort measures on PISA (like the ones discussed above) are correlated with countries’ economic growth, implying that these measures of effort are getting at something deeper than simply test-taking skills. If effort on the test is correlated with effort in school more generally, then it’s safe to say that many American students aren’t trying their hardest in school. Considering that some researchers have said that student effort may be “the most important input in the education process,” getting students to work harder in school should be a more prominent part of the education policy conversation.
Culture influences effort. By demonstrating variation in effort across countries, these studies point to the importance of culture. Researchers and advocates should not shy away from investigating these connections and studying what schools can do to shape pro-social and pro-academic norms. Culture may be hard to change, but when families choose “no excuses”-style schools, their strict pro-academic culture may be part of the appeal. With this new evidence that effort varies across communities, we should support educators who try to build school cultures that value achievement.
Risks of corruption. Score manipulation is a major concern when tests are high-stakes for teachers and schools but low-stakes for students. For example, the thread “Celebrating Test Scores” on the educator bulletin board of ProTeacher.net shows numerous examples of educators throwing pizza and ice cream parties for classrooms that meet certain testing benchmarks and rewarding individual students with material and non-material incentives for strong performance. When some schools pressure their students in these ways and others don’t, students of similar cognitive ability earn different scores, which distorts the conclusions that can be drawn from the data. State officials should take note of these strategies and decide whether to encourage all schools to motivate students in these ways or to prohibit the practice altogether.
Germany did something similar to throwing a countrywide ice cream party in the wake of its poor debut on PISA, whose first results were released in 2001. The country massively raised the test’s profile (leading to, among other things, a new TV quiz program called the “PISA show”), and scores rose. Education researchers and PISA officials have acknowledged Germany’s efforts, but they have largely ignored how prompting students to try harder on the test itself biases cross-country comparisons.
***
Looking under the hood of these tests to consider the effects of student effort may be uncomfortable for some. But when researchers make invalid comparisons, they taint the whole idea of independently and effectively assessing student performance and contribute to the growing backlash against testing. Better understanding these limitations and vulnerabilities will improve our use and design of tests, prompt new insights into student behavior, and, in the long run, make assessments more legitimate to the public.