10 Sources
[1]
Pay up or stop scraping: Cloudflare program charges bots for each crawl
Cloudflare is now experimenting with tools that will allow content creators to charge a fee to AI crawlers to scrape their websites. In a blog Tuesday, Cloudflare explained that its "pay-per-crawl" feature is currently in a private beta. A small number of publishers -- including AdWeek, The Associated Press, The Atlantic, BuzzFeed, Fortune, Gannett, and Ars Technica owner Condé Nast -- will participate in the experiment. Each publisher will be able to set their own prices that bots must pay before scraping content, Cloudflare said. Matthew Prince, CEO of Cloudflare, said the feature would ensure that the Internet as we know it will survive "the age of AI." "Original content is what makes the Internet one of the greatest inventions in the last century, and it's essential that creators continue making it," Prince said. "AI crawlers have been scraping content without limits. Our goal is to put the power back in the hands of creators, while still helping AI companies innovate. This is about safeguarding the future of a free and vibrant Internet with a new model that works for everyone." Some participating publishers expressed optimism in the press release that Cloudflare's pay-per-crawl feature could potentially stop the endless scraping that publishers defending copyrights have alleged represents wide-scale theft. Any content creators interested in joining the beta can sign up, Cloudflare noted, and perhaps eventually, they too can "be compensated for their contributions to the AI economy." In the meantime, only the publishers involved in the beta will be able to choose which bots can access which parts of their sites, experimenting with blocking all bots or allowing certain bots to access certain content. Cloudflare's program also gives them the flexibility to charge some bots while letting other bots scrape for free. 
This lets publishers that have negotiated deals with AI companies allow approved scraping while still protecting their content from companies that have not yet struck licensing deals.
AI companies must buy in
For Cloudflare's plan to work, AI companies must sign up, too. However, while some AI companies may not see the incentive, Cloudflare has confirmed that it has partnered with AI companies on the initiative, which may benefit from having a simple interface to negotiate with content creators. Cloudflare suggested its AI partners benefit from "long-term collaboration" with creators whose updated content will help AI products stay relevant. They also can stop wasting money scraping poor quality data sources, a Cloudflare blog said. "Without ongoing contributions from content creators, AI systems risk becoming outdated, biased, or less reliable -- ultimately diminishing user trust and the value of AI products," the blog said. "Cloudflare is working with AI companies to give them more signals, and ultimately improve the quality and relevance of content they can access. A healthy, sustainable ecosystem of original content is critical for AI innovation and relevance." However, Cloudflare's gamble seems to depend on AI companies agreeing to pay the prices set by publishers, and that could potentially scramble the experiment if bidding wars reduce rates to the point that they alienate publishers. It also hinges on Cloudflare detecting the AI bots, which, for now, relies on user reports and Cloudflare's analysis of mass traffic patterns. "In the early days, price discovery will play a key role -- as creators gain data on who's paying for what, a transparent market will emerge that reflects the true value of original content," Cloudflare said. Looking to the future, Cloudflare suggested that its pay-per-crawl system would "evolve significantly."
Perhaps one day publishers could use it to "charge different rates for different paths or content types," potentially even introducing dynamic pricing in the AI scraping environment. In that future, Cloudflare predicted that AI companies would possibly be incentivized to create agents that would crawl the web, seeking the best content deals to support specific AI products. "Imagine asking your favorite deep research program to help you synthesize the latest cancer research or a legal brief, or just help you find the best restaurant in Soho -- and then giving that agent a budget to spend to acquire the best and most relevant content," Cloudflare said, promising that "we enable a future where intelligent agents can programmatically negotiate access to digital resources."
AI crawlers now blocked by default
Cloudflare's announcement comes after rolling out a feature last September, allowing website owners to block AI crawlers in a single click. According to Cloudflare, over 1 million customers chose to block AI crawlers, signaling that people want more control over their content at a time when Cloudflare observed that writing instructions for AI crawlers in robots.txt files was widely "underutilized." To protect more customers moving forward, any new customers (including anyone on a free plan) who sign up for Cloudflare services will have their domains, by default, set to block all known AI crawlers. This marks Cloudflare's transition away from the dreaded opt-out models of AI scraping to a permission-based model, which a Cloudflare spokesperson told Ars is expected to "fundamentally change how AI companies access web content going forward."
In a world where some website owners have grown sick and tired of attempting and failing to block AI scraping through robots.txt -- including some trapping AI crawlers in tarpits to punish them for ignoring robots.txt -- Cloudflare's feature allows users to choose granular settings to prevent blocks on AI bots from impacting bots that drive search engine traffic. That's critical for small content creators who want their sites to still be discoverable but not digested by AI bots. "AI crawlers collect content like text, articles, and images to generate answers, without sending visitors to the original source -- depriving content creators of revenue, and the satisfaction of knowing someone is reading their content," Cloudflare's blog said. "If the incentive to create original, quality content disappears, society ends up losing, and the future of the Internet is at risk." Disclosure: Condé Nast, which owns Ars Technica, is a partner in Cloudflare's pay-per-crawl beta.
[2]
Cloudflare launches a marketplace that lets websites charge AI bots for scraping | TechCrunch
Cloudflare, a cloud infrastructure provider that serves 20% of the web, announced Tuesday the launch of a new marketplace that reimagines the relationship between website owners and AI companies -- ideally giving publishers greater control over their content. For the last year, Cloudflare has launched tools for publishers to address the rampant rise of AI crawlers, including a one-click solution to block all AI bots, as well as a dashboard to view how AI crawlers are visiting their site. In a 2024 interview, Cloudflare CEO Matthew Prince told TechCrunch these products were laying a foundation for a new type of marketplace in which publishers could distribute their content to AI companies and be compensated for it. Now, Cloudflare is bringing that marketplace to life. It's called Pay per Crawl, and Cloudflare is launching the "experiment" in private beta on Tuesday. Website owners in the experiment can choose to let AI crawlers, on an individual basis, scrape their site at a set rate -- a micropayment for every single "crawl." Alternatively, website owners can choose to let AI crawlers scrape their site for free, or block them altogether. Cloudflare claims its tools will let website owners see whether crawlers are scraping their site for AI training data, to appear in AI search responses, or for other purposes. At scale, Cloudflare's marketplace is a big idea that could offer publishers a potential business model for the AI era -- and it also places Cloudflare at the center of it all. The launch of the marketplace comes at a time when news publishers are facing existential questions about how to reach readers, as Google Search traffic fades away and AI chatbots rise in popularity. There's not a clear answer for how news publishers will survive in the AI era. Some, such as the New York Times, have filed lawsuits against tech companies for training their AI models on news articles without permission. 
Meanwhile, other publishers have struck multi-year deals to license their content for AI model training and to have their content appear in AI chatbot responses. Even so, only large publishers have struck AI licensing deals, and it's still unclear whether they provide meaningful sources of revenue. Cloudflare aims to create a more durable system where publishers can set prices on their own terms. The company also announced Tuesday that new websites set up with Cloudflare will now, by default, block all AI crawlers. Site owners will have to grant certain AI crawlers permission to access their site -- a change Cloudflare says will give every new domain "the default of control." Several large publishers, including Conde Nast, TIME, The Associated Press, The Atlantic, ADWEEK, and Fortune, have signed on with Cloudflare to block AI crawlers by default in support of the company's broader goal of a "permission-based approach to crawling." The business model that many of these publishers relied on for decades is slowly becoming unreliable. Historically, online publishers have allowed Google to scrape their sites in return for referrals in Google Search, which translated to traffic to their sites, and ultimately, ad revenue. However, new data from Cloudflare suggests that publishers may be getting a worse deal in the AI era than in the Google Search era. While some websites cite ChatGPT as a major traffic source, that doesn't appear to be the case broadly. This June, Cloudflare says it found that Google's crawler scraped its websites 14 times for every referral it gave them. Meanwhile, OpenAI's crawler scraped websites 17,000 times for every one referral, while Anthropic scraped websites 73,000 times for every referral. Meanwhile, OpenAI and Google are building AI agents that are designed to visit websites on behalf of users, collect information, and deliver it back to users directly. 
A future in which these tools are mainstream has huge implications for publishers that rely on readers visiting their sites. Cloudflare notes that the "true potential" of Pay per Crawl may emerge in an "agentic" future. "What if an agentic paywall could operate at the network edge, entirely programmatically? Imagine asking your favorite deep research program to help you synthesize the latest cancer research or a legal brief, or just help you find the best restaurant in Soho -- and then giving that agent a budget to spend to acquire the best and most relevant content," Cloudflare said in a blog post. To participate in Cloudflare's experimental marketplace, AI companies and publishers must both be set up with Cloudflare accounts. In their accounts, both parties can set rates at which they'd like to buy and sell a "crawl" of the publisher's content. Cloudflare acts as the intermediary in these transactions, charging the AI company and distributing the earnings to the publisher. Cloudflare spokesperson Ripley Park tells TechCrunch there are no stablecoins or cryptocurrency involved in Pay per Crawl at this time, even though many have suggested digital currency would be perfect for something like this. Cloudflare's marketplace feels like a bold vision for the future that requires a lot of publishers and AI companies to get on board. Still, there's no guarantee publishers will get a good deal, and convincing AI firms to participate could be tough, given they're currently scraping content for free. Nevertheless, Cloudflare seems like one of the few companies in a position to make a marketplace like this happen.
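The crawl-to-referral figures quoted above are plain ratios, and comparing them is simple arithmetic. The short Python sketch below uses only the numbers reported in the article (14, 17,000 and 73,000 crawls per referral); the function name and the dictionary layout are illustrative assumptions, not anything Cloudflare publishes.

```python
def crawl_to_referral_ratio(crawls: int, referrals: int) -> float:
    """Return how many crawls happen per referral sent back to a site."""
    if referrals == 0:
        # A crawler that never refers anyone has an unbounded ratio.
        return float("inf")
    return crawls / referrals


# Crawls per single referral, as reported by Cloudflare for June.
reported = {
    "Google": crawl_to_referral_ratio(14, 1),
    "OpenAI": crawl_to_referral_ratio(17_000, 1),
    "Anthropic": crawl_to_referral_ratio(73_000, 1),
}

# Express each crawler's ratio as a multiple of Google's.
baseline = reported["Google"]
relative_to_google = {name: ratio / baseline for name, ratio in reported.items()}

for name, multiple in relative_to_google.items():
    print(f"{name}: {reported[name]:,.0f} crawls per referral "
          f"({multiple:,.0f}x Google)")
```

On these figures, OpenAI's crawler fetches roughly 1,200 times as many pages per referral as Google's, and Anthropic's roughly 5,200 times as many.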
[3]
Cloudflare will now, by default, block AI bots from crawling its clients' websites
However, such systems don't provide the same opportunities for monetization and credit as search engines historically have. AI models draw from a great deal of data on the web to generate their outputs, but these data sources are often not credited, limiting the creators' ability to make money from their work. Search engines that feature AI-generated answers may include links to original sources, but they may also reduce people's interest in clicking through to other sites and could even usher in a "zero-click" future. "Traditionally, the unspoken agreement was that a search engine could index your content, then they would show the relevant links to a particular query and send you traffic back to your website," Will Allen, Cloudflare's head of AI privacy, control, and media products, wrote in an email to MIT Technology Review. "That is fundamentally changing." Generally, creators and publishers want to decide how their content is used, how it's associated with them, and how they are paid for it. Cloudflare claims its clients can now allow or disallow crawling for each stage of the AI life cycle (in particular, training, fine-tuning, and inference) and white-list specific verified crawlers. Clients can also set a rate for how much it will cost AI bots to crawl their website. In a press release from Cloudflare, media companies like the Associated Press and Time and forums like Quora and Stack Overflow voiced support for the move. "Community platforms that fuel LLMs should be compensated for their contributions so they can invest back in their communities," Stack Overflow CEO Prashanth Chandrasekar said in the release. Crawlers are supposed to obey a given website's directions (provided through a robots.txt file) to determine whether they can crawl there, but some AI companies have been accused of ignoring these instructions. Cloudflare already has a bot verification system where AI web crawlers can tell websites who they work for and what they want to do. 
For these, Cloudflare hopes its system can facilitate good-faith negotiations between AI companies and website owners. For the less honest crawlers, Cloudflare plans to use its experience dealing with coordinated denial-of-service attacks from bots to stop them. "A web crawler that is going across the internet looking for the latest content is just another type of bot -- so all of our work to understand traffic and network patterns for the clearly malicious bots helps us understand what a crawler is doing," wrote Allen. Cloudflare had already developed other ways to deter unwanted crawlers, like allowing websites to send them down a path of AI-generated fake web pages to waste their efforts. While this approach will still apply for the truly bad actors, the company says it hopes its new services can foster better relationships between AI companies and content producers.
[4]
Cloudflare Is Blocking AI Crawlers by Default
The age of the AI scraping free-for-all may be coming to an end. At least if Cloudflare gets its way. Last year, internet infrastructure firm Cloudflare launched tools enabling its customers to block AI scrapers. Today the company has taken its fight against permissionless scraping several steps further. It has switched to blocking AI crawlers by default for its customers and is moving forward with a Pay Per Crawl program that lets customers charge AI companies to scrape their websites. Web crawlers have trawled the internet for information for decades. Without them, people would lose vitally important online tools, from Google Search to the Internet Archive's invaluable digital preservation work. But the AI boom has produced a corresponding boomlet in AI-focused web crawlers, and these bots scrape web pages with a frequency that can mimic a DDoS attack, straining servers and knocking websites offline. Even when websites can handle the heightened activity, many do not want AI crawlers scraping their content, especially news publications that are demanding that AI companies pay to use their work. "We've been feverishly trying to protect ourselves," says Danielle Coffey, the president and CEO of the trade group News Media Alliance, which represents several thousand North American outlets. So far, Cloudflare's head of AI control, privacy, and media products, Will Allen, tells WIRED, over 1 million customer websites have activated its older AI-bot-blocking tools. Now millions more will have the option of keeping bot blocking as their default. Cloudflare also says it can identify even "shadow" scrapers that are not publicized by AI companies. The company noted that it uses a proprietary combination of behavioral analysis, fingerprinting, and machine learning to classify and separate AI bots from "good" bots.
A widely used web standard called the Robots Exclusion Protocol, often implemented through a robots.txt file, helps publishers block bots on a case-by-case basis, but following it is not legally required, and there's plenty of evidence that some AI companies try to evade efforts to block their scrapers. "Robots.txt is ignored," Coffey says. According to a report from the content licensing platform Tollbit, which offers its own marketplace for publishers to negotiate with AI companies over bot access, AI scraping is still on the rise -- including scraping that ignores robots.txt. Tollbit found that over 26 million scrapes ignored the protocol in March 2025 alone. In this context, Cloudflare's shift to blocking by default could prove a significant roadblock to surreptitious scrapers and could give publishers more leverage to negotiate, whether through the Pay Per Crawl program or otherwise. "This could dramatically change the power dynamic. Up to this point, AI companies have not needed to pay to license content, because they've known that they can just take it without consequences," says Atlantic CEO (and former WIRED editor in chief) Nicholas Thompson. "Now they'll have to negotiate, and it will become a competitive advantage for the AI companies that can strike more and better deals with more and better publishers." AI startup ProRata, which operates the AI search engine Gist.AI, has agreed to participate in the Pay Per Crawl program, according to CEO and founder Bill Gross. "We firmly believe that all content creators and publishers should be compensated when their content is used in AI answers," Gross says. Of course, it remains to be seen whether the big players in the AI space will participate in a program like Pay Per Crawl, which is in beta. (Cloudflare declined to name current participants.) 
Companies like OpenAI have struck licensing deals with a variety of publishing partners, including WIRED parent company Condé Nast, but specific details of these agreements have not been disclosed, including whether the agreement covers bot access. Meanwhile, there's an entire online ecosystem of tutorials about how to evade Cloudflare's bot blocking tools aimed at web scrapers. As the blocking default rolls out, it's likely these efforts will continue. Cloudflare emphasizes that customers who do want to let the robots scrape unimpeded will be able to turn off the blocking setting. "All blocking is fully optional and at the discretion of each individual user," Allen says.
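To make the robots.txt point concrete, here is a minimal Python sketch of the check a well-behaved crawler is expected to run before fetching a page, using the standard library's `urllib.robotparser`. The bot names (`ExampleAIBot`, `ExampleSearchBot`), the rules and the URL are hypothetical, chosen only for illustration. Nothing in the protocol enforces the answer, which is why publishers quoted in the article say it is ignored.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/articles/some-story"
print(may_fetch(ROBOTS_TXT, "ExampleAIBot", url))      # the blocked AI crawler
print(may_fetch(ROBOTS_TXT, "ExampleSearchBot", url))  # falls under the * rule
```

The check runs entirely on the crawler's side, so a crawler that skips it faces no technical barrier. That is the gap network-level blocking of the kind Cloudflare describes is meant to close.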
[5]
Cloudflare launches tool to help website owners monetize AI bot crawler access
NEW YORK, July 1 (Reuters) - Cloudflare has launched a tool that blocks bot crawlers from accessing content without permission or compensation to help websites make money from AI firms trying to access and train on their content, the software company said on Tuesday. The tool allows website owners to choose whether artificial intelligence crawlers can access their material and set a price for access through a "pay per crawl" model, which will help them control how their work is used and compensated, Cloudflare said. With AI crawlers increasingly collecting content without sending visitors to the original source, website owners are looking to develop additional revenue sources as search traffic referrals that once generated advertising revenue decline. The initiative is supported by major publishers including Condé Nast and Associated Press, as well as social media companies such as Reddit and Pinterest. Cloudflare's Chief Strategy Officer Stephanie Cohen said the goal of such tools was to give publishers control over their content, and ensure a sustainable ecosystem for online content creators and AI companies. "The change in traffic patterns has been rapid, and something needed to change," Cohen said in an interview. "This is just the beginning of a new model for the internet." Google, for example, has seen its ratio of crawls to visitors referred back to sites drop to 18:1 from 6:1 just six months ago, according to Cloudflare data, suggesting the search giant is maintaining its crawling but decreasing referrals. The decline could be a result of users finding answers directly within Google's search results, such as AI Overviews. Still, Google's ratio is much higher than other AI companies, such as OpenAI's 1,500:1. For decades, search engines have indexed content on the internet directing users back to websites, an approach that rewards creators for producing quality content. 
However, AI companies' crawlers have disrupted this model because they harvest material without sending visitors to the original source and aggregate information through chatbots such as ChatGPT, depriving creators of revenue and recognition. Many AI companies are circumventing a common web standard used by publishers to block the scraping of their content for use in AI systems, and argue they have broken no laws in accessing content for free. In response, some publishers, including the New York Times, have sued AI companies for copyright infringement, while others have struck deals to license their content. Reddit, for example, has sued AI startup Anthropic for allegedly scraping Reddit user comments to train its AI chatbot, while inking a content licensing deal with Google. Reporting by Krystal Hu in New York; Editing by Kate Mayberry
[6]
Web giant Cloudflare to block AI bots from scraping content by default
Internet firm Cloudflare will start blocking artificial intelligence crawlers from accessing content without website owners' permission or compensation by default, in a move that could significantly impact AI developers' ability to train their models. Starting Tuesday, every new web domain that signs up to Cloudflare will be asked if they want to allow AI crawlers, effectively giving them the ability to prevent bots from scraping data from their websites. Cloudflare is what's called a content delivery network, or CDN. It helps businesses deliver online content and applications faster by caching the data closer to end-users. They play a significant role in making sure people can access web content seamlessly every day. Roughly 16% of global internet traffic goes directly through Cloudflare's CDN, the firm estimated in a 2023 report. "AI crawlers have been scraping content without limits. Our goal is to put the power back in the hands of creators, while still helping AI companies innovate," said Matthew Prince, co-founder and CEO of Cloudflare, in a statement Tuesday.
[7]
Cloudflare experiment will block AI bot scrapers unless they pay a fee
New customers will also be asked if they want to block AI crawlers by default. Cloudflare has rolled out a couple of new measures meant to keep AI bot crawlers at bay. To start with, every new domain customer that signs up with the company to manage their website traffic will now be asked if they want to allow AI crawlers or to block them altogether. The company released a free tool in 2024 to block AI bots, but with this change, users can block them by default without having to tinker with their settings. Several big publishers, including Condé Nast, TIME and The Associated Press have already signed up to block crawlers. In addition, Cloudflare has launched a private beta experiment called "pay per crawl," which would only allow crawlers to access a website's content if they pay for it. Matthew Prince, Cloudflare's CEO, recently went on record to say that publishers are facing an existential threat, because people aren't clicking on chatbots' source links. If users don't visit those sources, the websites don't get the ad revenue they need to be able to keep running. "Original content is what makes the Internet one of the greatest inventions in the last century, and it's essential that creators continue making it," Prince said in a statement released with the company's latest updates. "AI crawlers have been scraping content without limits. Our goal is to put the power back in the hands of creators, while still helping AI companies innovate. This is about safeguarding the future of a free and vibrant Internet with a new model that works for everyone." Cloudflare believes publishers should be able to charge AI bots for access if they want to, and pay per crawl is its first experiment for that particular purpose. "Each time an AI crawler requests content, they either present payment intent via request headers for successful access (HTTP response code 200), or receive a 402 Payment Required response with pricing," Cloudflare explained. 
The company records those transactions and provides the underlying technical infrastructure. Publishers will be able to allow certain crawlers to access their content for free if they want to, and they can define a flat, per-request price across their websites for other crawlers. The company says pay per crawl is still in its very early stages, and it expects the tool to evolve in the future. It also says that it supports the development of other marketplaces and ways to charge AI crawlers for content. A marketplace could, for instance, allow dynamic pricing that enables publishers to charge different rates for different types of content.
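The request flow Cloudflare describes (payment intent in a request header leading to a 200, otherwise a 402 Payment Required carrying the price) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Cloudflare's implementation: the header names `x-max-price`, `x-price` and `x-charged`, the `Policy` class, the bot names and the use of 403 for blocked crawlers are all hypothetical, since the article does not give those details.

```python
from dataclasses import dataclass, field
from decimal import Decimal, InvalidOperation


@dataclass
class Policy:
    """A publisher's settings: one flat per-request price plus exceptions."""
    price_per_request: Decimal
    free_crawlers: set[str] = field(default_factory=set)
    blocked_crawlers: set[str] = field(default_factory=set)


def respond(policy: Policy, crawler: str, headers: dict[str, str]) -> tuple[int, dict[str, str]]:
    """Return (HTTP status, response headers) for one crawler request."""
    if crawler in policy.blocked_crawlers:
        return 403, {}  # assumption: blocked crawlers get a plain refusal
    if crawler in policy.free_crawlers:
        return 200, {}  # publisher chose to let this crawler in for free

    # Hypothetical header in which the crawler states the most it will pay.
    offered = headers.get("x-max-price")
    try:
        max_price = Decimal(offered) if offered is not None else None
    except InvalidOperation:
        max_price = None  # treat an unparsable offer as no offer at all

    price = policy.price_per_request
    if max_price is not None and max_price >= price:
        return 200, {"x-charged": str(price)}  # access granted, charge recorded
    return 402, {"x-price": str(price)}  # Payment Required, with the price


policy = Policy(Decimal("0.01"), free_crawlers={"ExampleSearchBot"})
print(respond(policy, "ExampleAIBot", {}))                       # no payment intent
print(respond(policy, "ExampleAIBot", {"x-max-price": "0.05"}))  # willing to pay
print(respond(policy, "ExampleSearchBot", {}))                   # on the free list
```

Prices are held as `Decimal` rather than floats so that small per-request amounts compare and add up exactly.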
[8]
Publishers, website owners to block AI 'crawlers' that don't pay
Huge artificial intelligence companies that train their models on data harvested without permission from commercial websites will face a new roadblock, with global internet infrastructure giant Cloudflare making it easier for publishers and other website owners to lock their virtual doors to "web crawlers". Publishers including TIME, Condé Nast, Sky News, Quora, The Atlantic, Fortune, Associated Press, Pinterest and BuzzFeed all put their names to an announcement of the new "permission-based" model of directing internet traffic to websites being offered by Cloudflare, which makes it the default setting to block AI bots without explicit approval.
[9]
New Cloudflare tool lets sites charge AI crawlers
Cloudflare announced the launch of "Pay per Crawl," a private beta marketplace enabling website owners to charge AI crawlers for content access, fulfilling a vision articulated by CEO Matthew Prince in 2024. Cloudflare has, over the past year, introduced various tools designed to manage AI crawler interactions for publishers. These include a one-click solution for blocking all AI bots and a dashboard providing visibility into how AI crawlers are accessing their sites. Cloudflare CEO Matthew Prince stated in a 2024 interview that these products were foundational steps toward creating a marketplace where publishers could monetize their content by distributing it to AI companies. Cloudflare launched this marketplace, named "Pay per Crawl," as a private beta experiment. Within this experiment, website owners can establish a specific micropayment rate for each "crawl" of their site by individual AI crawlers. Alternatively, website owners retain the option to permit AI crawlers to scrape their content without charge or to block them entirely. Cloudflare asserts that its tools provide website owners with the ability to discern whether crawlers are accessing their site for AI training data, for inclusion in AI search responses, or for other purposes. Cloudflare additionally announced that new websites configured with Cloudflare will, by default, block all AI crawlers. This modification requires site owners to explicitly grant permission for specific AI crawlers to access their content, a change Cloudflare describes as providing "the default of control" to every new domain. Several prominent publishers, including Conde Nast, TIME, The Associated Press, The Atlantic, ADWEEK, and Fortune, have committed to blocking AI crawlers by default in alignment with Cloudflare's objective of promoting a "permission-based approach to crawling." 
Data collected by Cloudflare in June 2025 indicates a disparity in crawling frequency between AI companies and traditional search engines. Cloudflare reported that Google's crawler accessed websites 14 times for every referral it provided. In contrast, OpenAI's crawler accessed websites 17,000 times for each referral, while Anthropic's crawler accessed websites 73,000 times per referral. Cloudflare envisions "Pay per Crawl" potentially evolving into an "agentic paywall." This concept could involve AI agents, acting on behalf of users, being allocated a budget to acquire relevant content. Cloudflare stated in a blog post, "What if an agentic paywall could operate at the network edge, entirely programmatically? Imagine asking your favorite deep research program to help you synthesize the latest cancer research or a legal brief, or just help you find the best restaurant in Soho -- and then giving that agent a budget to spend to acquire the best and most relevant content." Participation in Cloudflare's experimental marketplace necessitates that both AI companies and publishers maintain Cloudflare accounts. Within these accounts, both parties can establish rates for the purchase and sale of a "crawl" of the publisher's content. Cloudflare functions as an intermediary in these transactions, charging the AI company and subsequently distributing the earnings to the publisher. Ripley Park, a Cloudflare spokesperson, confirmed that "Pay per Crawl" does not currently involve stablecoins or cryptocurrency.
[10]
Cloudflare launches tool to help website owners monetise AI bot crawler access - The Economic Times
Cloudflare has launched a tool that blocks bot crawlers from accessing content without permission or compensation to help websites make money from AI firms trying to access and train on their content, the software company said on Tuesday. The tool allows website owners to choose whether artificial intelligence crawlers can access their material and set a price for access through a "pay per crawl" model, which will help them control how their work is used and compensated, Cloudflare said. With AI crawlers increasingly collecting content without sending visitors to the original source, website owners are looking to develop additional revenue sources as search traffic referrals that once generated advertising revenue decline. The initiative is supported by major publishers including Conde Nast and Associated Press, as well as social media companies such as Reddit and Pinterest. Cloudflare's Chief Strategy Officer Stephanie Cohen said the goal of such tools was to give publishers control over their content, and ensure a sustainable ecosystem for online content creators and AI companies. "The change in traffic patterns has been rapid, and something needed to change," Cohen said in an interview. "This is just the beginning of a new model for the internet." Google, for example, has seen its ratio of crawls to visitors referred back to sites drop to 18:1 from 6:1 just six months ago, according to Cloudflare data, suggesting the search giant is maintaining its crawling but decreasing referrals. The decline could be a result of users finding answers directly within Google's search results, such as AI Overviews. Still, Google's ratio is much higher than other AI companies, such as OpenAI's 1,500:1.
For decades, search engines have indexed content on the internet directing users back to websites, an approach that rewards creators for producing quality content. However, AI companies' crawlers have disrupted this model because they harvest material without sending visitors to the original source and aggregate information through chatbots such as ChatGPT, depriving creators of revenue and recognition. Many AI companies are circumventing a common web standard used by publishers to block the scraping of their content for use in AI systems, and argue they have broken no laws in accessing content for free. In response, some publishers, including the New York Times, have sued AI companies for copyright infringement, while others have struck deals to license their content. Reddit, for example, has sued AI startup Anthropic for allegedly scraping Reddit user comments to train its AI chatbot, while inking a content licensing deal with Google.
Cloudflare introduces a new tool allowing website owners to charge AI companies for content scraping, aiming to balance content creation and AI innovation.
Cloudflare, a leading cloud infrastructure provider, has launched a "Pay-Per-Crawl" feature, currently in private beta. This new tool allows content creators to charge AI crawlers for scraping their websites, potentially reshaping the relationship between publishers and AI companies [1][2].
The system enables website owners to set their own prices for AI bot access, block certain crawlers, or allow free scraping. Cloudflare acts as an intermediary, charging AI companies and distributing earnings to publishers [2]. This approach aims to give content creators more control over their data while still fostering AI innovation.
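The three per-crawler options described above (allow for free, charge a price, or block) can be pictured as a simple lookup with a site-wide fallback. The following is only an illustrative sketch: the names `Action`, `Rule`, and `decide`, the crawler names, and the price are all hypothetical and do not reflect Cloudflare's actual interface or configuration format.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"    # crawler may fetch content for free
    CHARGE = "charge"  # crawler may fetch content after paying the set price
    BLOCK = "block"    # crawler is refused


@dataclass(frozen=True)
class Rule:
    action: Action
    price_usd: float = 0.0  # only meaningful when action is CHARGE


def decide(crawler: str, rules: dict[str, Rule], default: Rule) -> Rule:
    """Return the publisher's rule for a crawler, or the site default."""
    return rules.get(crawler, default)


# Hypothetical publisher settings; crawler names and prices are made up.
rules = {
    "licensed-bot": Rule(Action.ALLOW),                  # existing licensing deal
    "paying-bot": Rule(Action.CHARGE, price_usd=0.01),   # pays per crawl
}
default = Rule(Action.BLOCK)  # anything not listed is refused

for name in ("licensed-bot", "paying-bot", "unknown-bot"):
    rule = decide(name, rules, default)
    print(name, rule.action.value, rule.price_usd)
```

Keeping the fallback as an explicit `default` rule makes the publisher's stance toward unlisted crawlers a single, visible setting rather than something implied by the absence of a rule.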
Several major publishers, including Condé Nast, TIME, The Associated Press, and The Atlantic, have joined the initiative [2][5]. On the AI side, companies like ProRata (operating Gist.AI) have agreed to participate, believing in fair compensation for content creators [4].
Cloudflare's data reveals a significant shift in web crawling patterns. For instance, Google's crawler now scrapes websites 14 times for every referral, while OpenAI's crawler does so 17,000 times per referral [2]. This trend highlights the growing disparity between content usage and compensation in the AI era.
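The crawl-to-referral figures are plain ratios: pages crawled divided by visitors sent back. As a rough illustration, the sketch below computes that ratio from raw counts. The function name and the counts are hypothetical, chosen only to reproduce the ratios quoted above; they are not Cloudflare's underlying data.

```python
def crawls_per_referral(crawls: int, referrals: int) -> float:
    """Number of pages crawled for each visitor referred back to the site."""
    if referrals <= 0:
        raise ValueError("referrals must be positive")
    return crawls / referrals


# Hypothetical counts that yield the ratios quoted in the text.
search_ratio = crawls_per_referral(14_000, 1_000)  # 14 crawls per referral
ai_ratio = crawls_per_referral(17_000, 1)          # 17,000 crawls per referral

print(f"search crawler: {search_ratio:.0f}:1")
print(f"AI crawler:     {ai_ratio:.0f}:1")
```

A higher ratio means the crawler takes more content for each visitor it sends back, which is why publishers treat a rising ratio as a worsening deal.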
Cloudflare has also announced that all new accounts will have AI crawlers blocked by default [3][4]. This shift from an opt-out to a permission-based model aims to give website owners more control over their content's usage in AI training and applications.
The initiative could significantly alter the power dynamic between publishers and AI companies. Nicholas Thompson, CEO of The Atlantic, suggests that this could lead to more negotiations and licensing deals, potentially creating a competitive advantage for AI companies that secure better agreements with publishers [4].
While the Pay-Per-Crawl system shows promise, its success depends on widespread adoption by both publishers and AI companies. Some experts predict that this could lead to a new "agentic" future, where AI agents negotiate access to digital resources on behalf of users [1][2].
The move has garnered support from various sectors of the tech and publishing industries. Stephanie Cohen, Cloudflare's Chief Strategy Officer, emphasized that this is "just the beginning of a new model for the internet" [5]. However, it remains to be seen how major AI players will respond to this initiative.
As the AI landscape continues to evolve, Cloudflare's Pay-Per-Crawl feature represents a significant step towards balancing the interests of content creators and AI innovators. Its implementation and adoption could play a crucial role in shaping the future of content monetization and AI development.
Summarized by
Navi
[2]
[3]
MIT Technology Review
Cloudflare will now, by default, block AI bots from crawling its clients' websites