Skewed Grading Algorithms Fuel Backlash Beyond the Classroom

London has seen many protests in its 2,000 years, but a chant that rang out in front of the Department for Education this past Sunday was likely a first. “Fuck the algorithm,” yelled a crowd of impassioned teens, many of them masked against a pandemic virus.

The crowd was protesting the statistical calculus that assigned final grades in A-levels, which determine college places in the UK, a fallback after Covid-19 canceled end-of-year exams. About 40 percent of students received grades lower than their teachers had projected earlier in the year.

“Many, like me, who come from working-class backgrounds, had their dreams completely crushed by an algorithm,” says Ema Hannan, who attended Sunday’s protest. Lower-than-anticipated grades in two of three subjects may have cost her a place at the London School of Economics.

Remarkably, the protest Hannan attended was the second teen uprising against educational algorithms this summer. Last month, more than 100,000 students, most in the US, were assigned final grades on a high school qualification called the International Baccalaureate using a similar process after in-person tests were canceled. As in the UK, many students and teachers complained of grades that were sharply lower than expected, and college places were lost.

The UK government and the organization behind the IB both yielded to the protests this week, abandoning their original calculations in favor of letting prior assignments or teachers’ predictions determine students’ final grades.

The algorithmic grading scandals of 2020 may resonate beyond students by highlighting the extent to which algorithms now rule our lives and the hazards of applying these formulas to people. Researchers and activists have revealed skewed calculations at work in criminal justice, health care, and facial recognition. But the grading scandals have earned unusually high public interest and political attention, particularly in the UK, where the government was forced into an embarrassing U-turn.

Data scientist Cathy O’Neil helped start the movement to hold algorithms accountable with her 2016 book Weapons of Math Destruction. She says the A-level and IB grading algorithms fit her criteria for such WMDs, because they are important, opaque, and destructive. “They tick all the boxes,” she says.

The grading algorithms are perceived as particularly unfair because they assigned individual grades in part based on data from past students at the same school. That could make students’ college plans dependent on factors outside their control, including some linked to economic inequality such as school resources.

O’Neil says questionable inferences like that are woefully common in areas such as insurance, credit, or job applicant screening. Reuters reported in 2018 that Amazon scrapped an automated résumé filter that penalized women because it was trained on past résumés, most of which came from men.

The skewed results of such systems are usually hard to see. Job applicants expect not to get most jobs, and they don’t get to compare results with other job seekers, as students could compare grades this summer. That the grading algorithms affected a nationwide cohort of bright, relatively well-off kids headed to college helped win public and political attention.

“When I get the ear of a policymaker, I say we eventually figured out car safety because there were so many dead people at the side of the road,” O’Neil says. “With algorithms, the dead people, or those being discriminated against, are invisible for the most part.”

The visibility of the grading snafus also shows how algorithmic problems are mostly about people—not math. A-level and IB administrators didn’t intentionally derive an equation calibrated to screw up students’ summers. They hastily crafted systems to substitute for their usual in-person tests in the face of a deadly pandemic.

Inioluwa Deborah Raji, a fellow at NYU’s AI Now Institute, which works on algorithmic fairness, says people reaching for a technical solution often embrace statistical formulas too tightly. Even well-supported pushback is perceived as highlighting a need for small fixes, rather than reconsidering whether the system is fit for the purpose.

That pattern is seen in how some authorities using facial recognition have responded to concerns from communities of color by saying that accuracy on darker skin tones is improving. Raji saw it again in how the organizations behind the IB and A-level algorithms initially directed protesting students to file individual appeals, with attendant fees. That made students from poorer households less likely to take the gamble. “The appeal system wasn’t built for all communities either, just like technology wasn’t built for every part of the population,” Raji says.
