AI-powered web crawlers have become the scourge of the internet, likened to digital cockroaches by many software developers. While web scraping has long been a reality, the rise of AI-driven bots has taken things to a new level—ignoring ethical boundaries, overloading servers, and extracting vast amounts of data without consent. Frustrated with this relentless invasion, open-source developers have begun fighting back with humor, creativity, and, in some cases, outright trickery.
The Open Source Dilemma
While websites across the internet deal with rogue AI scrapers, open-source projects suffer disproportionately. By design, Free and Open Source Software (FOSS) projects make their infrastructure widely accessible, allowing anyone to view, contribute to, and download their work. Unfortunately, this openness also makes them prime targets for AI bots that consume resources without giving anything back.
Unlike traditional search-engine crawlers that honor voluntary conventions like the Robots Exclusion Protocol (robots.txt), AI-driven scrapers often disregard these rules entirely. The result? Websites and Git repositories get hammered by unrelenting bot traffic, sometimes leading to outages on the scale of a distributed denial-of-service (DDoS) attack.
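For context, robots.txt is just a plain-text file served at a site's root that politely asks crawlers to stay out of certain areas; compliance is entirely voluntary. A minimal, illustrative example (the bot name and paths here are placeholders, not a complete or authoritative list):

```
# https://example.com/robots.txt  (illustrative only)

# Ask one specific AI crawler to stay away entirely
User-agent: GPTBot
Disallow: /

# Ask everyone else to avoid the Git-hosting paths
User-agent: *
Disallow: /git/
```

Well-behaved crawlers fetch this file first and respect it; the scrapers described below simply ignore it.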
FOSS developer Niccolò Venerandi, known for his work on the Plasma desktop environment for Linux, describes the situation as dire: projects with limited resources struggle to fend off AI bots that behave like parasites. In a particularly egregious case, developer Xe Iaso recounted how AmazonBot, a web crawler operated by Amazon, repeatedly hammered a Git server despite explicit instructions in robots.txt to stay away. The bot disguised its identity, rotated IP addresses, and relentlessly scraped content until the server crashed.
“They Will Scrape Until It Falls Over”
In a widely shared blog post, Iaso vented their frustration:
“It’s futile to block AI crawler bots because they lie, change their user agent, use residential IP addresses as proxies, and more. They will scrape your site until it falls over, and then they will scrape it some more.”
These AI bots don't just scrape; they exploit. Some follow every single link they find, over and over again, sometimes requesting the same page several times per second. The sheer volume of requests can make keeping a site stable a nightmare.
Faced with an unwinnable war against AI crawlers, developers have begun employing creative countermeasures. Enter Anubis, an ingenious solution that fights back in a way befitting its mythological namesake.
Anubis: Weighing the Souls of Web Requests
Rather than simply blocking AI bots (which often evade traditional filters), Iaso built Anubis, a reverse proxy that poses a proof-of-work challenge which must be solved before a request is allowed to reach the Git server behind it. Requests from browsers operated by real people clear the check with barely a hiccup, while bots scraping at scale either fail it or pay a steep computational price for every page.
The inspiration? The Egyptian god Anubis, who judged the souls of the dead. In mythology, if a soul’s heart was heavier than a feather, it was devoured. Iaso’s digital version operates on a similar principle: if a request fails the challenge, it’s denied.
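The general hashcash-style idea behind such a check is easy to sketch. The snippet below is a simplified illustration, not Anubis's actual code: the function names, the SHA-256 construction, and the 16-bit difficulty are assumptions chosen for clarity.

```python
import hashlib
import itertools
import os

DIFFICULTY_BITS = 16  # illustrative; a real deployment would tune this


def issue_challenge() -> str:
    """Server side: hand each visitor a random challenge string."""
    return os.urandom(16).hex()


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
        else:
            bits += 8 - byte.bit_length()
            break
    return bits


def solve(challenge: str, difficulty: int = DIFFICULTY_BITS) -> int:
    """Client side: grind nonces until the hash has enough leading zeros."""
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int = DIFFICULTY_BITS) -> bool:
    """Server side: one cheap hash confirms the expensive work was done."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    challenge = issue_challenge()
    nonce = solve(challenge)         # costly for the requester
    print(verify(challenge, nonce))  # trivial for the server: True
```

The asymmetry is the whole point: verification costs the server a single hash, while each request costs the client tens of thousands of hashes on average, which is negligible for one human page view but ruinous at scraper scale.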
For added flair, successful requests receive a cute anime-style drawing of Anubis as a reward. Since launching on GitHub, Anubis has become a viral hit, amassing thousands of stars, contributions, and forks in mere days. The rapid adoption highlights just how severe the problem has become within the FOSS community.
When All Else Fails: Vengeance as a Defense
Anubis isn’t the only tool in the growing arsenal against AI scrapers. Other developers have embraced more mischievous tactics, adopting an “offense is the best defense” mindset.
- Nepenthes: A tool designed to trap AI crawlers in an infinite loop of useless data, inspired by the carnivorous plant of the same name. Once a bot enters, it gets lost in a maze of fake content, wasting its own resources (a rough sketch of the idea appears after this list).
- AI Labyrinth: Cloudflare's take on the same idea, designed to slow down and confuse AI crawlers by feeding them irrelevant content rather than allowing them to extract meaningful data.
- Honeypot Strategies: Some developers have gone so far as to serve bots pages filled with misleading information—like fake articles on absurd topics—poisoning the data they scrape.
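None of these tools' internals are reproduced here, but the tarpit concept itself is simple: answer every request with cheap, machine-generated junk plus links to yet more junk, so an ill-behaved crawler burns its own bandwidth and compute fetching nothing of value. The sketch below is a minimal stand-alone illustration using only Python's standard library; the port, paths, and filler-text generator are arbitrary choices, not anything Nepenthes or Cloudflare actually does.

```python
import random
import string
from http.server import BaseHTTPRequestHandler, HTTPServer


def gibberish(words: int = 40) -> str:
    """Cheap filler text: worthless for training, costly to fetch at scale."""
    return " ".join(
        "".join(random.choices(string.ascii_lowercase, k=random.randint(3, 10)))
        for _ in range(words)
    )


class TarpitHandler(BaseHTTPRequestHandler):
    """Every path returns junk text plus links to ten more junk pages."""

    def do_GET(self):
        links = " ".join(
            f'<a href="/maze/{random.getrandbits(64):x}">more</a>'
            for _ in range(10)
        )
        body = f"<html><body><p>{gibberish()}</p>{links}</body></html>".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # stay quiet while the bots wander


if __name__ == "__main__":
    # Arbitrary local port; a real setup would route only suspected bots here.
    HTTPServer(("127.0.0.1", 8080), TarpitHandler).serve_forever()
```

In practice such a maze would sit behind the main site and receive only suspected bot traffic, leaving ordinary visitors untouched.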
Drew DeVault, the founder of SourceHut, has openly described the battle against AI crawlers as exhausting. He revealed that he spends anywhere from 20% to 100% of his workweek mitigating hyper-aggressive AI scraping activity. Other developers have been forced to take drastic measures, such as banning entire countries from accessing their projects due to overwhelming bot traffic.
Jonathan Corbet, who runs the Linux industry news site LWN, also reported bot-induced slowdowns, while Fedora sysadmin Kevin Fenzi at one point had to block all traffic from Brazil just to keep the project's servers stable.
The Future: Fighting Back or Accepting Defeat?
The war between AI crawlers and developers shows no signs of slowing down. Some argue that ethical guidelines need stricter enforcement, while others believe AI companies should be held accountable for the damage their bots cause. Meanwhile, developers in the open-source community continue to innovate ways to protect their projects, proving that necessity truly is the mother of invention.
Drew DeVault, exhausted by the constant battle, issued a desperate plea:
“Please stop legitimizing LLMs or AI image generators or GitHub Copilot or any of this garbage. I am begging you to stop using them, stop talking about them, stop making new ones—just stop.”
But with AI development advancing at breakneck speed, it’s unlikely that DeVault’s wish will be granted. Instead, developers will have to keep fighting back—whether through clever technical solutions, creative deterrents, or outright trickery.
One thing is certain: the war against AI crawlers is just getting started. And if recent tactics are any indication, open-source developers are more than ready to fight fire with fire.