6 Sources
[1]
Wikipedia urges AI companies to use its paid API, and stop scraping | TechCrunch
Wikipedia on Monday laid out a simple plan to ensure its website continues to be supported in the AI era, despite its declining traffic. In a blog post, the Wikimedia Foundation, the organization that runs the popular online encyclopedia, called on AI developers to use its content "responsibly" by ensuring its contributions are properly attributed and that content is accessed through its paid product, the Wikimedia Enterprise platform. The opt-in, paid product allows companies to use Wikipedia's content at scale without "severely taxing Wikipedia's servers," the blog post explains. In addition, the product's paid nature allows AI companies to support the organization's nonprofit mission.
While the post doesn't go so far as to threaten penalties or legal action over scraping, Wikipedia recently noted that AI bots had been scraping its website while trying to appear human. After updating its bot-detection systems, the organization found that its unusually high traffic in May and June had come from AI bots that were trying to "evade detection." Meanwhile, it said that "human page views" had declined 8% year-over-year.
Now, Wikipedia is laying out its guidelines for AI developers and providers, saying that generative AI developers should provide attribution to credit the human contributors whose content they use to create their outputs. "For people to trust information shared on the internet, platforms should make it clear where the information is sourced from and elevate opportunities to visit and participate in those sources," the post reads. "With fewer visits to Wikipedia, fewer volunteers may grow and enrich the content, and fewer individual donors may support this work."
Earlier this year, the organization released its AI strategy for editors, which said it would use AI to support editors with tedious workflow tasks, automated translation, and other tools that help its editors rather than replace them.
[2]
Wikipedia Asks AI Companies to Stop Scraping Data and to Start Paying Up
Imad is a senior reporter covering Google and internet culture. Hailing from Texas, Imad started his journalism career in 2013 and has amassed bylines with The New York Times, The Washington Post, ESPN, Tom's Guide and Wired, among others.
The Wikimedia Foundation, the nonprofit organization that hosts Wikipedia, wants AI companies to stop scraping its data to train AI models and to start paying to use its application programming interface (API) instead, the foundation said in a blog post on Monday.
Wikimedia says AI companies need high-quality, human-curated information to keep their models working. Wikipedia's extensive volunteer network of editors ensures that its information remains well sourced, and its content is available in over 300 languages. At the same time, running Wikipedia is a costly endeavor. It's currently the seventh-most-visited website in the world, according to Semrush, and it cost $179 million to run for the 2023-2024 fiscal year, according to a Wikimedia Foundation audit. Wikimedia keeps Wikipedia afloat primarily through donations and doesn't run advertising.
But AI is changing people's research habits. Instead of researching subjects on Wikipedia, people are turning to AI to answer their questions. Although Wikipedia is free to use, if people circumvent it by using ChatGPT, they won't see the donation requests at the top of the Wikipedia home page, and the site could lose money. Wikimedia is asking AI companies to pay to use its Enterprise API, which will allow them "to use Wikipedia content at scale and sustainably without severely taxing Wikipedia's servers, while also enabling them to support our nonprofit mission."
Representatives for Google, OpenAI, Meta, Perplexity, Anthropic, Microsoft, DeepSeek and xAI didn't immediately reply to requests for comment, and a representative for Wikimedia also didn't immediately respond. Google did agree to a deal with Wikimedia in 2022 for commercial access to Wikipedia content.
Wikimedia's request comes as online content creators push back against AI companies using online data without permission or payment. Publishers such as Penske, The New York Times and News Corp are suing AI companies for copyright infringement, while others, such as the Associated Press and Reuters, have signed licensing deals. (Disclosure: Ziff Davis, CNET's parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)
During the AI boom, Big Tech stocks have soared to stratospheric heights. Nvidia briefly became the world's first $5 trillion company late last month, with Microsoft and Google's parent company, Alphabet, breaking the $4 trillion barrier earlier this year.
[3]
Wikipedia helped train your favorite AI, now the Wiki foundation wants a cut
In context: In their never-ending quest to build profitable AI services, Big Tech firms and startups are constantly scanning the internet for fresh, human-made content. Wikipedia, the free encyclopedia at the center of this unprecedented infrastructure strain, is now asking AI companies to pay for its content.
The Wikimedia Foundation has underlined the irreplaceable value of human-created knowledge in the era of chatbots and generative AI services. The organization overseeing Wikipedia and other major "Wiki" projects is also outlining how AI companies should earn the right to access the encyclopedia's content, now and in the years to come.
Many studies have shown that generative AI cannot exist in a vacuum. Developing and improving new large language models (LLMs) requires textual content created by actual humans, as training LLMs on LLM-generated content quickly causes the entire AI pipeline to degrade through a feedback-loop effect. As one of the largest projects centered on human-written content, Wikipedia has been one of the most important sources for AI development from the start. Modern chatbots would not exist without Wikipedia, Wikimedia said, but AI companies are not being "responsible" in how they use the encyclopedia's content.
Earlier this year, the foundation revealed that AI bots have become a significant problem for Wikipedia's reliability. The service has experienced a 50 percent increase in bandwidth used for multimedia downloads since January 2024, with clear signs of AI scraping detected since December 2024. Traffic has now doubled, and Wikimedia blames bots for at least 65 percent of the most resource-intensive network loads. The foundation is therefore asking AI companies to start acting responsibly toward Wikipedia.
They can perform two "straightforward" actions to achieve this goal: attribution and financial support. Generative AI services should give credit to Wikipedia's human contributors when using the encyclopedia's content, so that the project can keep attracting new volunteers in the future. Furthermore, AI developers should go through Wikimedia Enterprise to access the platform's datasets via its paid API. The API provides AI companies with all the quality content they need, without excessive impact on Wikipedia's server infrastructure. With proper attribution and financial compensation, the platform can remain sustainable, Wikimedia said.
Wikipedia is written and managed by volunteer communities, while the Wikimedia Foundation provides technology and legal support for the encyclopedia's infrastructure. The free online encyclopedia turns 25 in January 2026, and Wikimedia plans to use the money provided by AI companies to fund the next 25 years and beyond.
[4]
Wikipedia tells AI companies to stop scraping and start paying
The Wikimedia Foundation is asking AI developers to credit contributors and use its paid Enterprise platform instead of scraping.
The Wikimedia Foundation, the nonprofit organization that operates Wikipedia, laid out a new plan on Monday for how artificial intelligence developers should interact with its content. The move comes as the online encyclopedia faces declining human traffic and a massive increase in automated scraping by AI bots.
In a blog post, the foundation called on AI companies to use its content "responsibly." It outlined two main guidelines: first, AI developers must provide proper attribution to credit the human volunteer contributors whose work is used in AI-generated outputs. Second, it urged companies to access Wikipedia's content at scale through its paid product, the Wikimedia Enterprise platform, rather than by "scraping" the live website, which "severely taxes Wikipedia's servers." The foundation noted that the paid platform allows AI companies to use the content they need while also financially supporting Wikipedia's nonprofit mission. The post did not threaten legal action against scrapers but framed the guidelines as a matter of sustainability.
The new plan follows Wikipedia's recent discovery that its unusually high traffic in May and June was not from human readers. After updating its bot-detection systems, the organization found the traffic came from AI bots "trying to evade detection." During the same period, Wikipedia's "human page views" declined by 8% year-over-year. The foundation warned that this trend threatens its operating model: "With fewer visits to Wikipedia, fewer volunteers may grow and enrich the content, and fewer individual donors may support this work."
Earlier this year, Wikipedia also released its internal AI strategy, which focuses on using AI as a tool to assist its human editors with tedious tasks and automation, not to replace them.
[5]
Wikipedia Urges AI Companies to Use Its Paid API Instead of Website Scraping | PYMNTS.com
A Monday (Nov. 10) blog post explained that making Wikimedia Enterprise a paid service sustains Wikipedia's global volunteer editor base and nonprofit funding model. The post highlighted that Wikimedia Enterprise enables organizations to efficiently use Wikipedia's vast repository of human-generated knowledge without overburdening its servers. The foundation asked AI developers to provide proper attribution to Wikipedia's content contributors when incorporating information into AI outputs, to maintain transparency and respect for the human effort behind its content curation.
Another blog post from the foundation said that in recent months there was a significant spike in automated bot traffic from AI companies scraping Wikipedia's pages, many of which attempted to evade detection by mimicking human users. Per the post, updated monitoring revealed that peak traffic in May and June 2025 was largely driven by AI bots, and that authentic human page views declined 8% year-over-year.
Wikipedia, however, hasn't written off AI completely. Earlier this year, the Wikimedia Foundation launched a strategy for editors to leverage AI tools to automate routine tasks and improve editorial workflows. Notably, the foundation stopped short of threatening legal action against AI developers for site scraping. Other businesses are also taking measures to curb data scraping, execs told PYMNTS.
The scraping woes come on the heels of a legal defeat for the Wikimedia Foundation this summer: the U.K. High Court dismissed the foundation's legal challenge against parts of the U.K.'s Online Safety Act, which could impose strict compliance requirements on Wikipedia. These could include requirements to verify the identities of volunteer contributors and to limit the number of U.K. visitors able to access or edit the site.
[6]
Wikipedia Urges AI Firms to Credit and Pay for Content
Wikipedia is asking artificial intelligence (AI) developers to support the non-profit information portal through proper attribution and financial contributions, according to a blog post by the Wikimedia Foundation. The post calls on "AI developers and other content reusers who access our content to use it responsibly and sustain Wikipedia."
The Foundation explains that when AI models give due credit to Wikipedia, it creates a "virtuous cycle" that sustains human contributions, the very inputs that eventually become part of AI training datasets. In this context, the Wikimedia Foundation notes: "For people to trust information shared on the internet, platforms should make it clear where the information is sourced from and elevate opportunities to visit and participate in those sources."
Furthermore, the blog post asks AI developers to offer adequate financial support to Wikipedia through the Wikimedia Enterprise platform, a paid, opt-in service that allows companies to use Wikipedia's content sustainably and at scale without "severely taxing Wikipedia's servers," while supporting the organisation's non-profit mission. "Through proper attribution of information sources and better financial support for AI's technological impacts on Wikipedia, AI developers can secure both their own long-term futures and Wikipedia's," the Foundation adds.
Human Knowledge Is Paramount, Says Wikimedia Foundation
The Wikimedia Foundation, the non-profit entity that provides Wikipedia with technological and legal support, strongly advocates for human-generated content, stressing that AI's knowledge ultimately derives from human-created information. While generative AI tools may synthesise or summarise existing knowledge, the Foundation argues they cannot actively engage in discussions or reach a consensus.
To illustrate, the blog post cites a science communicator on YouTube who questioned whether AI would eventually "eat itself," given that it relies extensively on information and knowledge that humans provide in the first place. The Foundation credits Wikipedia's volunteer editors for continuously improving the platform's content: "This human-centered approach to knowledge creation provides high-quality and reliable information that, through regular editorial collaboration and disagreements, leads to more neutral and comprehensive articles." It concludes, "The more humans take part in Wikipedia, the better the internet's knowledge becomes."
Why It Matters
Although Wikipedia's latest appeal seems sensible, urging AI developers to give credit and to access content through the authorised Wikimedia Enterprise platform, how these measures will be implemented remains uncertain. The key question is whether the Foundation seeks voluntary participation or plans to enforce compliance through legal or financial means. For now, it has chosen the voluntary route.
It's also worth noting that Wikipedia has seen an 8% year-on-year (YoY) decline in human visitors. "We believe that these declines reflect the impact of generative AI and social media on how people seek information, especially with search engines providing answers directly to searchers, often based on Wikipedia content," said Marshall Miller, Senior Director of Product at the Wikimedia Foundation.
In this light, how successful the voluntary opt-in safeguard, namely the Wikimedia Enterprise platform, proves to be remains an open question. Who is to say AI developers won't ignore Wikipedia's initiatives and continue training their models on the platform's vast repository of human knowledge? And what measures does the Wikimedia Foundation currently have to prevent or stop such practices?
At the end of the day, building narratives around the sanctity of human-created knowledge is all well and good, but without active deterrents to back them up, these appeals may count for little when push comes to shove.
The Wikimedia Foundation is urging AI companies to stop scraping Wikipedia and instead use its paid Enterprise API, as bot traffic has surged while human visits declined 8% year-over-year.
The Wikimedia Foundation has issued a direct appeal to artificial intelligence companies, asking them to stop scraping Wikipedia's content and instead pay for access through its Enterprise API platform. The move comes as the nonprofit organization faces mounting pressure from AI bots that are overwhelming its servers while human traffic continues to decline [1].
In a Monday blog post, the foundation outlined a two-pronged approach for "responsible" AI development: proper attribution to Wikipedia's volunteer contributors and financial support through its paid services. The organization revealed that after updating its bot-detection systems, it discovered that unusually high traffic in May and June came from AI bots "trying to evade detection," while human page views declined 8% year-over-year [4].
Wikipedia has experienced a dramatic surge in automated traffic, with bandwidth usage for multimedia downloads increasing by 50% since January 2024. The foundation now attributes at least 65% of its most resource-intensive network loads to bot activity, which has caused overall traffic to double [3].
This unprecedented level of scraping activity has put significant strain on Wikipedia's infrastructure. The organization, which operates the seventh-most-visited website globally, spent $179 million during the 2023-2024 fiscal year to maintain its services [2].
The Wikimedia Foundation is promoting its Enterprise platform as the preferred method for AI companies to access Wikipedia's content at scale. This paid service allows organizations to use Wikipedia's extensive database without "severely taxing Wikipedia's servers" while providing financial support for the nonprofit's mission [1].
The foundation emphasized that proper attribution remains crucial for maintaining Wikipedia's volunteer-driven model. "For people to trust information shared on the internet, platforms should make it clear where the information is sourced from and elevate opportunities to visit and participate in those sources," the organization stated [1].
While the Wikimedia Foundation has not threatened legal action against scrapers, the broader content industry is increasingly pushing back against the unauthorized use of online material as AI training data. Publishers including The New York Times and News Corp have filed copyright-infringement lawsuits against AI companies, while others, like the Associated Press and Reuters, have signed licensing agreements [2].
Google has already established a precedent by signing a commercial-access deal with Wikimedia in 2022. However, representatives from major AI companies, including OpenAI, Meta, Anthropic, and Microsoft, have not yet responded to requests for comment regarding Wikipedia's new guidelines [2].
Despite its concerns about scraping, Wikipedia has not rejected AI technology entirely. Earlier this year, the organization released its internal AI strategy, which focuses on using artificial intelligence to assist human editors with routine tasks, translation automation, and workflow improvements rather than replacing human contributors [5].
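For readers curious what "paying for access through the Enterprise API" looks like in practice, here is a minimal, hypothetical sketch of how a client might construct an authenticated request for a single article. The base URL, endpoint path, and bearer-token scheme shown are assumptions for illustration only; consult Wikimedia Enterprise's official documentation for the actual endpoints and authentication flow.

```python
# Hypothetical sketch of a Wikimedia Enterprise "on-demand" article lookup.
# BASE_URL and the /articles/{title} path are assumptions, not confirmed API details.
from urllib.parse import quote

BASE_URL = "https://api.enterprise.wikimedia.com/v2"  # assumed base URL


def build_article_request(title: str, token: str) -> tuple[str, dict]:
    """Return the URL and headers for an on-demand article request.

    The request itself is not sent here; this only shows how the
    authenticated call would be assembled.
    """
    # Percent-encode the title so spaces and punctuation are URL-safe.
    url = f"{BASE_URL}/articles/{quote(title, safe='')}"
    headers = {
        # Bearer token presumably obtained from a separate auth endpoint.
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return url, headers


url, headers = build_article_request("Alan Turing", "EXAMPLE_TOKEN")
```

A real client would then issue the GET request with an HTTP library and handle pagination, rate limits, and attribution metadata per the Enterprise terms, which is precisely the usage the foundation is asking AI developers to adopt instead of scraping the live site.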