Wikipedia Pushes Back Against AI Scraping, Urges Companies to Use Paid API

Reviewed by Nidhi Govil



The Wikimedia Foundation calls on AI companies to stop scraping Wikipedia content and instead use its paid Enterprise API, citing declining human traffic and increased bot activity that threatens the platform's sustainability.

Wikipedia Takes Stand Against AI Scraping

The Wikimedia Foundation has issued a direct appeal to artificial intelligence companies, urging them to stop scraping its website and instead use its paid Enterprise API to access Wikipedia content. The appeal, published in a blog post on Monday, represents a significant shift in the organization's approach to AI companies that have been freely harvesting its data to train large language models [1].

Source: TechCrunch


The nonprofit organization, which operates the world's seventh-most visited website, emphasized that AI companies require high-quality, human-curated information to maintain their model performance. Wikipedia's extensive volunteer network of editors ensures well-sourced content across more than 300 languages, making it an invaluable resource for AI training [2].

Bot Traffic Surge Reveals Scraping Scale

Recent monitoring by the Wikimedia Foundation uncovered concerning patterns in website traffic. After updating its bot detection systems, the organization discovered that unusually high traffic during May and June was primarily driven by AI bots attempting to evade detection by mimicking human user behavior. This automated scraping activity coincided with an 8% year-over-year decline in authentic human page views, highlighting the growing impact of AI on traditional web browsing patterns [1][3].

Source: CNET


The foundation's concerns extend beyond technical issues to fundamental sustainability questions. As people increasingly turn to AI chatbots for information instead of visiting Wikipedia directly, the platform faces reduced exposure to donation requests that appear on its homepage, potentially threatening its funding model [2].

Enterprise API as Sustainable Solution

Wikimedia's proposed solution centers on its Enterprise API, a paid service designed to provide scalable access to Wikipedia content without overwhelming the platform's servers. The organization argues that this approach offers multiple benefits: it reduces server strain, provides sustainable funding for Wikipedia's nonprofit mission, and ensures proper attribution to the human contributors whose work powers the encyclopedia [1][3].

Source: PYMNTS


The foundation specifically called for generative AI developers to provide clear attribution when using Wikipedia content in their outputs. "For people to trust information shared on the internet, platforms should make it clear where the information is sourced from and elevate opportunities to visit and participate in those sources," the organization stated. This transparency requirement aims to maintain the connection between AI-generated responses and their human-created sources [1].

Industry-Wide Content Licensing Tensions

Wikipedia's stance reflects broader tensions between content creators and AI companies over data usage rights. Major publishers including The New York Times and News Corp have filed copyright infringement lawsuits against AI firms, while others like the Associated Press and Reuters have negotiated licensing agreements. Google previously signed a commercial access deal with Wikimedia in 2022, demonstrating that such arrangements are feasible [2].

The financial stakes are substantial, with Wikipedia's operational costs reaching $179 million for the 2023-2024 fiscal year. The platform relies primarily on donations rather than advertising revenue, making sustainable funding arrangements with AI companies particularly important for its long-term viability [2].

Notably, the Wikimedia Foundation stopped short of threatening legal action against companies that continue scraping practices, instead focusing on collaborative solutions that benefit both parties while preserving Wikipedia's mission and sustainability [3].
