As AI-generated content continues to flood user-generated platforms across the Internet, thanks to advancements in large language models (LLMs) like OpenAI’s GPT, one group of unsung heroes is feeling the pressure more than most: Wikipedia editors. While they’ve long been tasked with rooting out human error, vandalism, and bias, these dedicated volunteers are now facing a new challenge—combating the rise of AI-produced content that’s clogging up the site with inaccuracies and filler.
In recent years, the ability of LLMs to churn out high volumes of seemingly coherent text has made them attractive tools for those looking to contribute to Wikipedia. However, the consequences of this influx of AI-generated content are becoming more apparent, especially as much of this machine-written material comes with significant drawbacks. One of the biggest issues? Improper sourcing. Unlike human contributors, who generally understand the importance of citations, AI often produces passages without providing accurate or relevant sources. This lack of proper attribution turns otherwise legitimate-sounding content into misinformation traps, presenting a formidable challenge to Wikipedia’s mission of delivering reliable, verifiable knowledge.
The impact of this growing problem has been so severe that a group of Wikipedia editors, led by Ilyas Lebleu, has created a dedicated initiative called the “WikiProject AI Cleanup.” The project focuses on identifying and removing AI-generated contributions while developing best practices to keep such content from slipping through the cracks. In an interview with 404 Media, Lebleu emphasized the difficulty of detecting AI-written content, noting that, ironically, AI tools themselves are not very effective at spotting their own handiwork.
“AI content can sound incredibly plausible at first glance,” said Lebleu. “But when you dig into it, it often lacks the rigor and careful sourcing that Wikipedia depends on. That’s a big problem.”
What makes AI-generated content particularly insidious is its ability to produce entire fabricated entries that appear legitimate on the surface. These entries may mimic the tone and structure of authentic Wikipedia articles but are riddled with inaccuracies, half-truths, or even complete hoaxes. In the hands of bad actors, these tools can be used to try to sneak false information into Wikipedia, something editors now find themselves constantly on the lookout for.
The problem isn’t just that these entries exist, but that they create a massive new workload for editors. Instead of simply fixing a typo or removing a biased statement, they’re now tasked with evaluating whole paragraphs or pages to determine whether the text was written by AI and, either way, whether the information it presents is trustworthy. As AI-generated content proliferates, this responsibility only grows, putting more pressure on an already strained community of volunteers.
Despite the efforts of projects like WikiProject AI Cleanup, the fight is far from over. AI-generated content is often subtle enough to slip past initial scrutiny, requiring editors to become even more vigilant in their work. Unfortunately, the same technology that’s empowering a new generation of digital creators is also fueling an unprecedented wave of misinformation.
As large language models continue to evolve, the challenge for Wikipedia editors grows. The task of protecting the platform’s integrity, never an easy one, has become even more complicated. For those behind the scenes, Wikipedia remains very much a battleground in the fight for reliable, credible information. And as AI’s influence on the web expands, Wikipedia’s editors are standing on the front lines.