What Happens When Your School Thinks AI Helped You Cheat

Published Oct 18, 2024, 3:57 PM

The education system has an AI problem. As students have started using tools like ChatGPT to do their homework, educators have deployed their own AI tools to determine if students are using AI to cheat.

But the detection tools, while largely effective, flag false positives roughly 2% of the time. For students who are falsely accused, the consequences can be devastating.

On today’s Big Take podcast, host Sarah Holder speaks to Bloomberg’s tech reporter Jackie Davalos about how students and educators are responding to the emergence of generative AI and what happens when efforts to crack down on its use backfire.

Read more: AI Detectors Falsely Accuse Students of Cheating—With Big Consequences

Bloomberg Audio Studios, podcasts, radio news.

Moira Olmsted dreams of becoming an elementary school teacher, so last year she enrolled in an online program at Central Methodist University, working toward a degree while taking care of her toddler. For one of her classes, Moira had to turn in weekly writing assignments summarizing news articles, which was easy enough. But just weeks into the fall semester, she got an unexpected grade back.

Early on in the semester, in one of her courses, she had received a zero.

Jackie Davalos is a tech reporter for Bloomberg. She spoke to Moira over the phone.

And I might have completely freaked out. For those summary articles, mainly what we're doing is taking a bunch of information and just summarizing it in maybe two, three paragraphs max.

She didn't really know what had happened; she just kind of saw it pop up on her student portal. She raised this with the professor, and the professor told her that she had been flagged for using AI.

She was like, hey, I run everybody's work through an AI detector, and yours has been flagged many times.

It's just getting out of hand.

But Moira says she never used generative AI tools like ChatGPT to write her assignments.

To her, it was being blindsided by this technology that she had never really seen before, and Moira immediately followed up asking for additional details as to how this could have happened.

I was just like, okay, thanks for bringing that to my attention. I'm, you know, a future educator. I really am super against the use of AI, specifically in opinion and thought pieces.

She had to raise it with several administrators at her school. She had multiple meetings and emails that showed this back and forth, with her expressing confidence in work that was being put into question.

Her grade was ultimately changed, but Moira started to take extra precautions with her work: putting it through AI checkers herself, screen-recording her progress, and attaching the recordings to her assignments, anything to prove her work was original. But Jackie's reporting found there was another reason Moira's work may have been mistakenly flagged as AI-generated.

She's on the autism spectrum, and she's always written with a somewhat formulaic style. Understanding that this might be one of the gaps that AI detectors have, Moira knew she wanted to be armed with proof that she had completed her work in case this came up again. Students who fall into this category, either they're neurodivergent or English is their second language, tend to get flagged more than peers who don't.

Moira is just one student grappling with the challenges presented by this new frontier in education, and those challenges are playing out at schools and universities all over the country. I'm Sarah Holder, and this is The Big Take from Bloomberg News. Today on the show: how universities and students are adapting to the emergence of generative AI, and what happens when efforts to crack down on its use backfire. Jackie, Moira wasn't using generative AI to do her homework. She's adamant that she wasn't. But other students are using tools like ChatGPT to help write their papers. Can you just give us a sense of how big of a thing this really is right now?

It's huge. Some students like to use the tools just for spell check, just for the syntax. Then it's a level up to "help me rewrite this one section," all the way to "just help me write my entire essay." And this is where you've seen this other cottage industry of startups and tools crop up to basically detect that.

I want to know more about how these AI detectors come to determine whether and how much a student has used AI in their writing or in their homework assignments. How do these tools work at a basic level?

AI detection software like Turnitin, Copyleaks, and GPTZero, which were some of the startups that we looked at, basically uses technology not so dissimilar from that of ChatGPT. They train their systems on a lot of text, in the same way that ChatGPT does. However, AI writing detectors look at what's called perplexity, and this is just a fancy term for a measure of how complex the words are in any given submission, sentence, or paragraph. We speak with a lot of variety. We vary our sentence structure and diction throughout a particular sentence or passage. If word choices are a little bit more generic and formulaic, that's going to have a higher chance of being flagged by an AI detector. And it basically spits out a percentage of how much it believes the assignment is AI-generated. In Moira's case, it was a majority of her assignment. It doesn't highlight which passages. It also doesn't give you an answer for how it got there. It's kind of this black box.
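The perplexity measure Jackie describes can be illustrated with a toy model. This is a sketch only: real detectors score text under large language models, not the tiny word-frequency model below, and the function name and sample corpus here are hypothetical. The idea is the same, though: predictable, generic word choices score low, and low perplexity is the signal detectors associate with machine-written prose.

```python
import math
from collections import Counter

def perplexity(text, corpus):
    """Perplexity of `text` under a unigram model fit on `corpus`.

    Lower perplexity = more predictable word choices. A toy stand-in
    for the language models real AI detectors use.
    """
    words = corpus.lower().split()
    counts = Counter(words)
    total = len(words)
    vocab = len(counts)

    log_prob = 0.0
    test_words = text.lower().split()
    for w in test_words:
        # Laplace smoothing so unseen words still get nonzero probability
        p = (counts[w] + 1) / (total + vocab)
        log_prob += math.log(p)

    # Perplexity is the exponentiated average negative log-probability
    return math.exp(-log_prob / len(test_words))

corpus = "the cat sat on the mat the dog sat on the rug"
# Text built from the corpus's most common words scores lower
# (more "predictable") than text full of words the model never saw.
common = perplexity("the cat sat", corpus)
rare = perplexity("quantum flux", corpus)
```

This also hints at the failure mode the episode describes: writers with a consistent, formulaic style (like Moira) produce low-perplexity text for reasons that have nothing to do with AI.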

So educators are using these AI powered detectors to root out AI powered papers, but how well do these tools actually work?

We found that they're actually highly accurate. We tested GPTZero and Copyleaks on a random sample of five hundred college application essays that were submitted to Texas A&M in the summer of 2022. This is important because, as we know, ChatGPT was released in the fall of 2022, so we know these essays were not AI-generated; ChatGPT hadn't even been released yet. After running the analysis, we found that these startups falsely flagged about one to two percent of the essays as likely written by AI, and in some cases they claimed near one hundred percent certainty. The problem is that one to two percent of essays is still high in some ways, and that small error rate can add up given how many student assignments are submitted throughout the year across the country.
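Why a one-to-two-percent error rate "adds up" is simple arithmetic, sketched below. The numbers are illustrative, not from Bloomberg's analysis, except the sample size and rate taken from the Texas A&M test described above.

```python
def expected_false_flags(num_essays, fp_rate):
    """Expected number of human-written essays wrongly flagged as AI."""
    return num_essays * fp_rate

def prob_at_least_one_flag(num_assignments, fp_rate):
    """Chance an honest student is falsely flagged at least once across
    a series of submissions, assuming independent checks."""
    return 1 - (1 - fp_rate) ** num_assignments

# At a 2% false positive rate, a 500-essay sample yields ~10 wrong flags.
per_sample = expected_false_flags(500, 0.02)

# A student turning in 10 checked assignments faces roughly an 18%
# chance of at least one false accusation over the term.
per_student = prob_at_least_one_flag(10, 0.02)
```

The per-student view is the striking one: a rate that sounds negligible per essay becomes a nearly one-in-five risk for any individual over a semester of weekly submissions.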

Still, two out of every one hundred students running the risk of being mistakenly accused of plagiarism, maybe getting expelled, feels pretty bad. Who is this affecting most?

We found two groups that can be particularly vulnerable to some of the flaws in AI detection software. One is if you're neurodivergent, like Moira, if you're on the spectrum, for example. Another is if English is your second language.

How disproportionately are these kinds of students being impacted by these false flags?

Stanford researchers found that AI detectors were almost perfect when checking essays written by US born eighth grade students, but they flagged over half of those essays written by non native English speakers as AI generated. So the false flag there is extremely high.

What about the impact on professors themselves? Is it making them more skeptical and more paranoid about the work that students are turning in?

On the whole, professors are still a little bit on the fence about how exactly AI should be used in the classroom. You have some who want to incorporate it into aspects of the curriculum, like using it to help you brainstorm or do some of the initial research. Other professors are telling me, we don't mind if you want to have ChatGPT write this aspect of your essay, just cite it appropriately. Professors are trying to figure out at what point AI erodes the experience of learning and at what point it actually helps it. But if there's one thing that professors agree on, it's that it's not going anywhere.

AI isn't going anywhere. But how can students and educators use the technology responsibly? That's after the break. We're back. I've been speaking with Bloomberg reporter Jackie Davalos about the shortcomings of software that colleges and universities are using to detect and root out AI-generated work. Are they trying to set new policies to incorporate the understanding that these AI detection tools have some blind spots?

Definitely. You're seeing some schools put down firmer policies around what's considered plagiarism: if you use ChatGPT for part of your essay and don't cite it, that can be considered plagiarism, but if you do cite it, then it's okay. Others are basically allowing professors to use these AI detection tools however they please, without actually saying, if your essay is fifty percent AI-generated or ninety-eight percent AI-generated, then you will face a consequence. So it's left up to the professor to decide what's acceptable. But some universities are really mindful of the fact that these AI detectors aren't completely accurate.

What are students doing to make sure that their original work isn't mistaken for AI?

Students are really starting to get creative with how they can protect themselves. Many of them told me that, like Moira, they're starting to do their work in Google Docs and tracking everything to create a digital paper trail. Others tell me that they're using tech tools that are almost designed to humanize your text. I had a conversation with a student who went to school in California, who told me how he tweaks his wording in some parts of an essay to actually sound worse, because he's afraid that if it sounds too good, it might get caught by an AI detector.

This all sounds like so much work for students and for educators to work around the blind spots that this technology has. What are companies trying to do to improve their models?

We spoke to almost all of the companies that we looked at, and what they told us is that they actually intentionally oversample underrepresented groups, like students who might not be native English speakers, and that because of that, it's this ever-evolving process of iterating and making the software more accurate. We also spoke to the Copyleaks co-founder and CEO, who told us that they're ninety-nine percent accurate, but that a small number of errors can still occur from time to time. GPTZero was another company that told us they're coming out with a new tool that students will be able to write into; it not only tracks your work, it also has timestamps for when you entered and exited the document.

So these companies are creating the problem and then offering solutions for the problem.

In some ways, yes. It's funny because it also shows that they acknowledge that the detection software itself is imperfect. The thing that these companies emphasize is that they're now trying to get that professor feedback and relay to them, too, that this isn't the end-all, be-all tool that you should use to grade your students' work.

Jackie, my last question for you is just about Moira. How is she doing now? Has she finished her studies, and is she becoming a teacher herself?

She's on track to continue her coursework this semester, a mom of two now, and she's really excited about what's ahead. It's an ever evolving world, and she tells us that despite this overwhelming incident, which was unfortunate, she's still looking forward to being an educator in the future.

Thank you so much, Jackie.

Thank you.

This is The Big Take from Bloomberg News. I'm Sarah Holder. This episode was produced by Thomas Lou and Jessica Beck. It was edited by Aaron Edwards and Seth Fiegerman. It was mixed by Alex Suguia. It was fact-checked by Adrianna Tapia. Our senior producer is Naomi Shavin, who also edited this episode. Our senior editor is Elizabeth Ponso. Our executive producer is Nicole Beamster Boor. Sage Bouman is Bloomberg's head of podcasts. If you liked this episode, make sure to subscribe and review The Big Take wherever you get your podcasts. It helps people find the show. Thanks for listening. We'll be back next week.
