Perplexity AI Accused of Stealthy Web Scraping, Igniting Debate on AI Crawlers' Ethics

Reviewed by Nidhi Govil

23 Sources

Cloudflare alleges that AI search engine Perplexity is using stealth tactics to bypass website crawling restrictions, sparking a debate on the ethics of AI web crawlers and the future of internet content access.

Cloudflare's Allegations Against Perplexity AI

Cloudflare, a network security and performance company, has accused AI search engine Perplexity of employing "stealth tactics" to circumvent websites' no-crawl directives. According to a blog post by Cloudflare researchers, Perplexity continued to access content from sites that had explicitly disallowed its crawlers through robots.txt files and Web Application Firewall (WAF) rules 1.
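To make the robots.txt mechanism concrete, here is a minimal sketch of how a well-behaved crawler might check a site's directives before fetching a page. It uses Python's standard urllib.robotparser module. The crawler name "ExampleBot", the robots.txt content, and the example.com URL are illustrative assumptions, not taken from Cloudflare's report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks an illustrative crawler named "ExampleBot"
# everywhere, while leaving the site open to all other user agents.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A compliant crawler would run this check and skip the fetch when it fails.
blocked = is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/article")
open_to_others = is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/article")
print(blocked, open_to_others)
```

The check is purely voluntary: robots.txt only states the site owner's preference, and nothing in the protocol stops a client from ignoring it, which is why site owners pair it with network-level blocks such as WAF rules.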

The researchers claim that when Perplexity's known crawlers encountered blocks, the company deployed a stealth bot that:

  1. Utilized multiple IPs not listed in Perplexity's official IP range
  2. Rotated through these IPs to evade restrictions
  3. Sent requests from different Autonomous System Numbers (ASNs)
  4. Impersonated Google Chrome on macOS to mask its identity

This activity was reportedly observed across tens of thousands of domains and millions of requests per day 2.
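The signals listed above can be illustrated with a simplified sketch of how a site operator might verify a declared crawler against its published IP range. This is not Cloudflare's detection method, which the report does not describe in code. The crawler token "ExampleBot" and the address ranges (reserved documentation addresses) are hypothetical.

```python
import ipaddress

# Hypothetical published IP range for an illustrative crawler. The addresses
# come from blocks reserved for documentation, not from any real operator.
PUBLISHED_RANGES = [ipaddress.ip_network("192.0.2.0/24")]
DECLARED_TOKEN = "ExampleBot"


def classify_request(source_ip: str, user_agent: str) -> str:
    """Classify a request by its declared identity and source address."""
    addr = ipaddress.ip_address(source_ip)
    in_range = any(addr in net for net in PUBLISHED_RANGES)
    declared = DECLARED_TOKEN.lower() in user_agent.lower()
    if declared and in_range:
        return "declared-crawler"
    if declared:
        # Claims the crawler's name but comes from an unlisted address.
        return "spoofed-or-unlisted"
    # No crawler token at all: indistinguishable from a browser by this check.
    return "undeclared"


print(classify_request("192.0.2.10", "Mozilla/5.0 (compatible; ExampleBot/1.0)"))
print(classify_request("198.51.100.7", "ExampleBot/1.0"))
print(classify_request("198.51.100.7", "Mozilla/5.0 (Macintosh) Chrome/124.0"))
```

The last case shows why the alleged tactics matter: a request that presents a generic browser user agent from an unlisted address passes this kind of check as "undeclared", so catching it requires additional behavioral or network-level signals.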

Perplexity's Response and Defenders

Source: The Verge

Perplexity has dismissed Cloudflare's blog post as a "sales pitch" and denied ownership of the bot named in the report 2. In a subsequent blog post, Perplexity claimed that the behavior came from a third-party service it occasionally uses 3.

Some defenders of Perplexity argue that an AI accessing a website on behalf of its user should be treated like a human making the same request, not as a bot. This perspective highlights the growing difficulty of distinguishing between legitimate AI assistants and potential threats 3.

Implications for Internet Norms and Content Access

If the allegations are true, Perplexity's actions would violate long-standing Internet norms, particularly the Robots Exclusion Protocol, which was proposed in 1994 and formalized as an Internet Engineering Task Force standard (RFC 9309) in 2022 1.

This controversy raises important questions about the future of web crawling and content access in the age of AI:

  1. Should AI agents accessing websites on behalf of users be treated differently from traditional web crawlers?
  2. How can website owners protect their content while still allowing legitimate AI-assisted user access?
  3. What are the implications for online business models, particularly for content creators and publishers?

Industry Response and Measures

Source: Decrypt

In response to these findings, Cloudflare has taken several actions:

  1. De-listed Perplexity as a verified bot
  2. Added heuristics to block the alleged stealth crawling
  3. Launched a marketplace allowing website owners to charge AI scrapers for access
  4. Introduced a free tool to prevent unauthorized AI scraping 4

Cloudflare's post cites OpenAI as an example of an AI firm that respects robots.txt directives and does not attempt to evade network-level blocks 5.

Broader Context and Future Implications

Source: Ars Technica

This incident is part of a larger trend in which AI companies' data collection practices are coming under scrutiny. With bot activity reportedly outstripping human activity online for the first time in internet history, the debate over AI crawlers' ethics and rights is likely to intensify 3.

As AI agents become more prevalent in everyday internet use, the industry will need to grapple with balancing the benefits of AI-assisted browsing against the rights and preferences of content creators and website owners. This controversy may well shape the future of how AI interacts with the open web.

TheOutpost.ai

© 2025 Triveous Technologies Private Limited