Concerns over the increased potential for cheating are front and center in debates over testing while students are learning remotely. A new report from a group of researchers at Rensselaer Polytechnic Institute (RPI) in New York details their cheat-resistant online exam protocol, an innovation that could fill an immediate need and pave the way for the future of testing.
Methods of online proctoring exist, but they are often expensive, riddled with privacy concerns, and draconian, forcing students to, for example, keep their microphones on and remain in frame for an hour at a time. They also signal to students, perhaps unintentionally, that adults don’t trust their honesty. Text-recognition software can discreetly detect plagiarism, but it is useless for multiple-choice and calculation questions, as well as for younger students’ written work. Using a huge bank of test items to randomly deliver different questions to different students could also limit remote cheating opportunities, but it requires an extraordinary amount of work from educators and runs counter to educational best practices.
The RPI team sought to address the drawbacks of each of these models by creating a simple, cost-effective, and privacy-conserving solution that would help educators administer a valid remote assessment with minimal effort. The key component of their model, called a distanced online test (DOT), is timing. Rather than having all students start the DOT at the same time, the test is broken into sections that are given to different groups of students at different times. Students at the lowest mastery level of the content (as determined by midterm scores, current GPAs, SAT scores, or other class grades received prior to the DOT) start the test first. Once that lowest-mastery group has completed the first section, its members move on to the next, with no option to return to previous sections, while the next-highest-mastery group starts the first section. And so on.
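To make the timing mechanics concrete, here is a minimal sketch of how such staggered scheduling might look. The cohort count, section lengths, student names, and function names are illustrative assumptions, not details from the RPI paper.

```python
from datetime import datetime, timedelta

def assign_cohorts(prior_scores, n_cohorts=2):
    """Rank students by prior mastery (ascending) and split into cohorts.

    Returns cohorts ordered lowest-mastery first, so the weakest
    cohort starts the exam earliest.
    """
    ranked = sorted(prior_scores, key=prior_scores.get)
    size = -(-len(ranked) // n_cohorts)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

def section_schedule(start, n_cohorts=2, n_sections=2, section_minutes=30):
    """Each cohort begins a section just as the cohort before it finishes.

    At any moment, no two cohorts are working on the same section.
    """
    step = timedelta(minutes=section_minutes)
    return {
        (cohort, section): start + (cohort + section) * step
        for cohort in range(n_cohorts)
        for section in range(n_sections)
    }

scores = {"Ana": 62, "Ben": 91, "Cam": 75, "Dee": 88}
cohorts = assign_cohorts(scores)  # [['Ana', 'Cam'], ['Dee', 'Ben']]
times = section_schedule(datetime(2021, 5, 10, 9, 0))
for (c, s), t in sorted(times.items(), key=lambda kv: kv[1]):
    print(f"cohort {c} starts section {s + 1} at {t:%H:%M}")
```

Run as written, this prints cohort 0 starting section 1 at 09:00, then both cohort 0 on section 2 and cohort 1 on section 1 at 09:30, and so on: the overlap that makes real-time collusion across cohorts unproductive.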
Without live proctoring, the main cheating concerns are internet searches for answers and collusion with others. Previous research into online testing found that nearly 80 percent of cheating events involved collusion and 42 percent involved copying from the internet, with 21 percent falling into both categories. Statistical evidence suggested that the DOT model’s staggered start would strongly suppress collusion. With no ability to return to closed sections, students wishing to collude would have to do so in real time, but the higher-mastery students, from whom help would most likely be solicited, would not be working on the same set of questions. Internet copying, meanwhile, could be addressed through question construction and a slightly larger question pool.
The main benefit promised by the DOT method was simplicity: no additional equipment required, no random question generators needed, and no violations of student privacy. The optimal DOT method did require more test questions, so that the sections received by each cohort would not be exactly the same in content, but the RPI team determined that a pool of at most 1.5 times the total number of test items would do the trick, especially if questions were mainly of a type that “require intellectual efforts [rather] than factual recalls.”
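As a rough illustration of that arithmetic, a forty-item exam delivered in two sections would need a pool of at most sixty questions, from which each cohort’s sections are drawn with partial overlap. The sketch below shows one hypothetical way to slice such variants; the paper does not prescribe this particular assignment scheme.

```python
import random

def build_cohort_exams(pool, n_cohorts=2, n_sections=2, per_section=20, seed=7):
    """Slice overlapping exams for each cohort from a shared question pool.

    With a pool of ~1.5x the total item count (e.g., 60 questions for a
    40-item exam), consecutive cohorts share some questions but never
    receive identical sections, and no per-student randomization is needed.
    """
    total = n_sections * per_section              # items per exam (here, 40)
    rng = random.Random(seed)
    shuffled = pool[:]
    rng.shuffle(shuffled)
    offset = (len(pool) - total) // max(n_cohorts - 1, 1)  # shift per cohort
    exams = {}
    for c in range(n_cohorts):
        items = shuffled[c * offset : c * offset + total]
        exams[c] = [items[s * per_section:(s + 1) * per_section]
                    for s in range(n_sections)]
    return exams

pool = [f"Q{i:02d}" for i in range(60)]           # 1.5x a 40-item exam
exams = build_cohort_exams(pool)
shared = set(sum(exams[0], [])) & set(sum(exams[1], []))
print(f"each cohort sees {len(sum(exams[0], []))} items; {len(shared)} overlap")
```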
The RPI team honed their model and then tested it as a fully remote, non-proctored final exam in a class whose midterm had been given fully in person earlier in the semester. Seventy-eight students took both exams, each of which consisted of forty graded items, all multiple-choice questions. The DOT final was broken into two sections of twenty questions each, with two mastery cohorts and thus two starting times. The results of the in-person midterm served as a control against which the DOT results were compared. Both exams produced the bell-shaped curve of a normal distribution of scores, and analysis showed that the two distributions had the same mean value, indicating that the two very different testing methods evaluated the same population consistently. The analysis also found random patterns of incorrect-answer matches between any given pair of students (a traditional test for evidence of cheating) and an approximately equal distribution of correct answers between the two test sections. The latter check was DOT-specific, accounting for the fact that more collusion was to be expected in one half of the exam than the other. Given these findings, the RPI team determined that the DOT reduced the possible point gain due to collusion to less than 0.09 percent. Post-exam surveys indicated that students generally approved of the DOT structure, found the question difficulty reasonable, and considered the format easy to use.
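The incorrect-answer-match check is a standard collusion screen: two students guessing independently rarely choose the same wrong option again and again. A bare-bones version, assuming answer data keyed by student, might look like the following; it is a sketch of the general technique, not the statistical test the RPI team used.

```python
from itertools import combinations

def matching_wrong_answers(responses, answer_key):
    """Count, per student pair, questions where both answered incorrectly
    AND chose the same wrong option. Independent guessing yields low,
    scattered counts; collusion shows up as outlier pairs.
    """
    pair_counts = {}
    for a, b in combinations(responses, 2):
        pair_counts[(a, b)] = sum(
            1
            for q, key in answer_key.items()
            if responses[a][q] != key
            and responses[b][q] != key
            and responses[a][q] == responses[b][q]
        )
    return pair_counts

answer_key = {"q1": "A", "q2": "C", "q3": "B"}
responses = {
    "s1": {"q1": "B", "q2": "C", "q3": "B"},  # wrong on q1 (chose B)
    "s2": {"q1": "B", "q2": "D", "q3": "B"},  # shares the wrong q1 answer
    "s3": {"q1": "A", "q2": "C", "q3": "D"},  # wrong on q3 only
}
print(matching_wrong_answers(responses, answer_key))
# {('s1', 's2'): 1, ('s1', 's3'): 0, ('s2', 's3'): 0}
```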
The RPI team concluded that their DOT model not only met the criteria for an easy, cost-effective, non-proctored remote testing platform, but also that students who knew they could not collaborate with others were more motivated to actually study the material and arrive at correct answers themselves. These are all important aspects of good online exams, but it cannot be overlooked that the tested version of the DOT was developed for college students, for whom the possibility of expulsion for cheating is a real concern, and that it included only multiple-choice questions and just two mastery cohorts. Whether this approach to online testing will work at scale in K–12 education is hard to know. But RPI’s simple strategy of staggering testing times might just be a way to lessen the potential for cheating.
SOURCE: Mengzhou Li et al., “Optimized collusion prevention for online exams during social distancing,” npj Science of Learning (March 2021).