As the Free Software Foundation (FSF) prepares to mark its 40th anniversary, the nonprofit finds itself confronting an unexpected and intensifying digital threat: LLM-driven scraping bots and distributed denial-of-service (DDoS) attacks. These are not the same old brute-force attacks or political hacks; they are AI-powered, distributed, and relentless.
Founded in 1985 by Richard Stallman, the FSF has long championed free and open-source software, licenses such as the GNU General Public License (GPL), and user freedom. But a new wave of automated traffic, generated by crawlers that feed large language models (LLMs), threatens not just the performance of FSF infrastructure but also the very principles of consent, transparency, and autonomy in open web publishing.
In recent months, the FSF has reported increasingly frequent DDoS-like behavior from traffic patterns that appear non-malicious on the surface.
Much of this traffic is consistent with automated systems gathering training data for LLMs or enriching commercial AI products. And while this data is technically public, it was not published with unrestricted, high-volume extraction in mind.
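To make the detection problem concrete, here is a minimal sketch of the kind of log analysis an operator might run. The log format, the user-agent list, and the rate threshold are illustrative assumptions, not details of the FSF's actual infrastructure.

```python
# Minimal sketch: flag access-log entries that look like LLM scraper
# traffic. Log format and thresholds are illustrative assumptions,
# not details of the FSF's actual setup.
from collections import Counter

# User-agent substrings of some publicly documented AI crawlers
# (a partial, illustrative list).
AI_CRAWLER_AGENTS = ("GPTBot", "CCBot", "ClaudeBot", "Google-Extended")

REQUESTS_PER_MINUTE_LIMIT = 120  # far beyond human browsing speed

def flag_suspicious(log_lines):
    """Return {ip: reason} for clients worth rate-limiting.

    Each log line is assumed to read: 'ip timestamp_minute user_agent'.
    """
    per_minute = Counter()
    flagged = {}
    for line in log_lines:
        ip, minute, agent = line.split(" ", 2)
        if any(bot in agent for bot in AI_CRAWLER_AGENTS):
            flagged.setdefault(ip, f"declared AI crawler: {agent}")
        per_minute[(ip, minute)] += 1
        if per_minute[(ip, minute)] > REQUESTS_PER_MINUTE_LIMIT:
            flagged.setdefault(ip, "request rate exceeds human browsing")
    return flagged

if __name__ == "__main__":
    sample = [
        "203.0.113.7 2025-06-01T12:00 GPTBot/1.0",
        "198.51.100.2 2025-06-01T12:00 Mozilla/5.0 (X11; Linux x86_64)",
    ]
    for ip, reason in flag_suspicious(sample).items():
        print(ip, "->", reason)
```

In practice, scrapers that rotate through thousands of addresses defeat simple per-IP counting, which is exactly what makes this traffic so hard to distinguish from a deliberate DDoS.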
The open web has historically benefited both knowledge seekers and developers of free software. But LLMs have introduced a structural imbalance: commercial AI tools can ingest massive swaths of public documentation without attribution, compensation, or even acknowledgment. This undermines the collaborative intent of projects like GNU and Emacs, where community contributions are assumed to be used respectfully, not silently harvested into proprietary models.
While the FSF is still evaluating a formal policy on LLMs, its leadership has voiced concerns that indiscriminate data harvesting strains community-run infrastructure, disregards the consent of contributors, and funnels freely shared work into proprietary systems.
In the words of a foundation spokesperson: “We’re not against AI, but against extractive AI that takes without returning value or freedom.”
To combat these trends, the FSF has begun implementing both technical and ethical deterrents, combining network-level defenses such as rate limiting and traffic filtering with clearer published policies on automated access.
Some of these steps are also precautionary against outright DDoS attempts, which the FSF suspects may be mixed in with AI scraping traffic, either deliberately or opportunistically.
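As an illustration of the rate-limiting side of such defenses, here is a minimal token-bucket sketch; the burst capacity and refill rate are hypothetical values, not FSF settings.

```python
# Minimal sketch of a token-bucket rate limiter, one common
# network-level defense against high-volume crawlers. Capacity and
# refill rate are hypothetical values, not FSF settings.
import time

class TokenBucket:
    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity              # burst size a client may use
        self.refill_per_sec = refill_per_sec  # sustained allowed rate
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Spend one token if available, refilling by elapsed time."""
        now = time.monotonic()
        self.tokens = min(
            self.capacity,
            self.tokens + (now - self.last) * self.refill_per_sec,
        )
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should answer 429 Too Many Requests

# One bucket per client IP: allow bursts of 20, then 2 requests/second.
buckets = {}

def should_serve(ip: str) -> bool:
    if ip not in buckets:
        buckets[ip] = TokenBucket(capacity=20, refill_per_sec=2)
    return buckets[ip].allow()
```

The design choice behind the token bucket is that it tolerates short, human-like bursts while capping sustained machine-speed crawling; its weakness, as the FSF's experience suggests, is that per-client limits are easy to evade once traffic is spread across many addresses.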
The FSF isn't alone. Projects like Debian and Arch Linux, and even open-access journals, have reported abnormal traffic spikes from LLM-driven bots. The growing concern is this: if open documentation becomes too costly to serve, because of hosting strain or abuse, organizations may be forced to restrict access or introduce CAPTCHAs, which would run counter to their mission of universal accessibility.
Moreover, there is a philosophical risk: AI models trained on the output of free software communities without respecting their norms may end up promoting code and concepts out of context, eroding the values of transparency, attribution, and freedom.
The FSF has begun calling for an ethical framework for AI that respects the unique expectations of free software and open documentation, built on tenets such as consent to automated crawling, attribution of sources, and reciprocity toward the communities whose work is ingested.
These ideas are being discussed in academic and technical circles, but there is no enforcement yet. Until then, the FSF must rely on network-level defenses and public awareness to protect its content and values.
As the FSF enters its fifth decade, it faces a paradox: the more valuable and accessible its contributions become, the more vulnerable they are to silent misuse. Whether through DDoS attacks, LLM crawlers, or derivative works that never cite their GNU origins, the foundation must adapt to defend freedom not only in source code but also in how that code is read, remixed, and consumed by machines.
The FSF’s work continues to be vital — and increasingly symbolic — in this new digital era. Its stand against extractive AI may help define the future of open access, and what it means to share knowledge freely but responsibly.