4 Sources
[1]
Dead Internet Theory Is 17% of the Way to Becoming Reality, Study Finds
More than a third of new websites on the internet were created by AI, according to a paper published online by researchers from Imperial College London, Stanford University, and the Internet Archive. The study is based on data collected by the Internet Archive's Wayback Machine from late 2022 (when ChatGPT kicked off the AI craze) to mid-2025. As of May 2025, the researchers found that 35.3% of all newly published websites on the internet were created with the assistance of AI, including 17.6% that were completely AI-generated.

That might not totally shock you if you subscribe to the Dead Internet Theory, aka the belief that most of the internet is driven by bot activity, with its more conspiratorial proponents claiming this is being done on purpose to control the public.

The rough figure the researchers have found is also in line with previous findings. Cloudflare reported in September 2025 that nearly one-third of all internet traffic was driven by bots. A few days later, the company's CEO, Matthew Prince, appeared on a podcast to share his "frighteningly likely" forecast that AI will completely change the way information is shared online and concentrate power over this online knowledge in the hands of a few tech giants. An even earlier report from data security company Imperva claimed that automated surfing surpassed human activity on the internet for the first time in 2024, making up roughly half of all web traffic. The report concluded that this was "largely driven by the rapid adoption of AI and large language models."

There is some anecdotal evidence that shows just how pervasive AI-generated websites have become across the internet, too. Scammers are using AI tools to rapidly generate fake websites to trick victims. AI is also being used to plagiarize news organizations and create trash websites for the sole purpose of SEO-farming.
A new report by Model Republic also claims that a website linked to OpenAI-backed super PAC Leading The Future was publishing an onslaught of mostly AI-generated "news" articles to attack critics of artificial intelligence products.

But the Internet Archive study goes even further than previous evidence, exploring whether bots have taken over the internet and whether this is leading to the widely anticipated outcomes. Many people fear that as AI infiltrates the internet, the language and accuracy of the internet will change along with it. In the study, the researchers tested six beliefs that they said the majority of U.S. adults hold about an online future dominated by AI content. But they found only two of those hypotheses to be playing out.

The researchers found that AI-generated online content wasn't as factually incorrect as expected. They also found that it cited its sources through external links, didn't extinguish individual writing styles in favor of a generic voice, and wasn't a long, winding block of text with little meaningful information, despite common belief. What they did find, as expected, was that more AI content meant less "range of unique ideas and diverse viewpoints" and writing that "feels increasingly sanitized and artificially cheerful."

Even OpenAI CEO Sam Altman admitted to this fake positivity. Last year, after the company released its AI coding agent Codex, Altman said that the intense praise for the release on the subreddit r/Claudecode felt somewhat bot-driven.

This study is just the beginning, and it could become a useful tool to help users discern credible information on the internet. The researchers told 404 Media this week that they were working on creating "a continuous tool" to monitor this phenomenon and understand "which kinds of websites are most affected, broken down by category or language, and generally providing more nuance about where these impacts are landing."
[2]
Study Finds A Third of New Websites are AI-Generated
Researchers found the internet is becoming aggressively positive as AI-generated text floods the web.

Researchers working with data from the Internet Archive have discovered that a third of websites created since 2022 are AI-generated. The team of researchers -- which includes people from Stanford, Imperial College London, and the Internet Archive -- published their findings online in a paper titled "The Impact of AI-Generated Text on the Internet." The research also found that all this AI-generated text is making the web more cheery and less verbose.

Inspired by the Dead Internet Theory -- the idea that much of the internet is now just bots talking back and forth -- the team set out to find out how ChatGPT and its competitors had reshaped the internet since 2022. "The proliferation of AI-generated and AI-assisted text on the internet is feared to contribute to a degradation in semantic and stylistic diversity, factual accuracy, and other negative developments," the researchers write in the paper. "We find that by mid-2025, roughly 35% of newly published websites were classified as AI-generated or AI-assisted, up from zero before ChatGPT's launch in late 2022."

"I find the sheer speed of the AI takeover of the web quite staggering," Jonáš Doležal, an AI researcher at Stanford and co-author of the paper, told 404 Media. "After decades of humans shaping it, a significant portion of the internet has become defined by AI in just three years. We're witnessing, in my opinion, a major transformation of the digital landscape in a fraction of the time it took to build in the first place."

The researchers also tested six common critiques of AI-generated text. Does it lead to a shrinking of viewpoints? Does it create more disinformation as hallucinations proliferate? Does online writing feel more sanitized and cheerful? Does it fail to cite its sources? Does it create strings of words with low semantic density?
Has it forced writing into a monoculture where unique voices vanish and a generic, uniform style takes hold?

To answer these questions, the researchers partnered with the Internet Archive to pull samples of websites from the 33 months between August 2022 and May 2025. "For each sampled URL, we retrieve the oldest available archived snapshot via the Wayback Machine's CDX Server API," the paper says. "The raw HTML of each snapshot is downloaded and stored locally for subsequent processing." The researchers took the extracted website text and used the AI-detection software Pangram v3 to find AI-created websites. The team tested several AI-detection tools and found Pangram v3 had the highest detection rate. Once Pangram v3 had identified an AI-generated website, the researchers used that website as a sample to test the six hypotheses.

"For each hypothesis, we define a measurable signal, compute it for each monthly sample of websites, and test whether it correlates with the aggregate AI likelihood score across months," the paper says. To test if AI was creating an internet full of falsehoods, for example, the team extracted fact-based claims from the websites they'd selected and then paid human fact-checkers to verify them. To figure out if AI is citing its sources, the team computed the outbound link density in AI-generated text.

To the surprise of the researchers, only two of the six theories they tested about the effects of AI-generated text seemed true. AI was making the internet less semantically diverse and more positive overall, but it wasn't causing a proliferation of lies or cutting out its sources. "The most surprising result was that our Truth Decay hypothesis wasn't confirmed," Doležal said. "It's worth noting that we were specifically looking for an increase in verifiably untrue statements, which we didn't find.
But it could still be the case that AI is quietly increasing the volume of unverifiable claims, ones that can't be checked against existing fact-checking tools and infrastructure. Or it may simply be that the internet wasn't a particularly truth-adhering place to begin with."

The researchers said they'd continue to study how AI-generated text shapes the internet. "We're now working with the Internet Archive to turn this into a continuous tool that keeps providing this signal going forward, rather than a single fixed snapshot bounded by the static nature of a paper," Maty Bohacek, a student researcher at Stanford and one of the co-authors of the paper, told 404 Media. "We're also interested in adding more granularity: looking at which kinds of websites are most affected, broken down by category or language, and generally providing more nuance about where these impacts are landing."

For Doležal, studies like this are critical for ensuring a useful and productive internet. "As AI-generated content spreads, the challenge is finding a role for these models that doesn't just result in a sanitized, repetitive web," he said. "Rather than forcing models to be perfectly compliant and agreeable, allowing them to have a more distinct personality or 'friction' might help them act as a creative partner rather than a replacement for human voice."
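The snapshot-retrieval step quoted above (fetching the oldest capture of each URL via the Wayback Machine's CDX Server API) can be sketched in a few lines. The endpoint and response fields below are the public CDX API's; the helper names and sample rows are illustrative, not from the paper.

```python
# Sketch of the snapshot-retrieval step: for each sampled URL, query the
# Wayback Machine's CDX Server API for its oldest capture, then build a
# replayable archive URL from the returned timestamp.
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

def oldest_snapshot_query(url: str) -> str:
    """Build a CDX query returning only the oldest successful capture of `url`."""
    params = {
        "url": url,
        "output": "json",              # JSON rows instead of space-separated text
        "limit": 1,                    # captures are returned oldest-first by default
        "filter": "statuscode:200",    # skip redirects and errors
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"

def snapshot_url(cdx_rows):
    """Turn a CDX JSON response (header row + data rows) into a replayable
    Wayback URL of the form /web/<timestamp>/<original>."""
    if len(cdx_rows) < 2:
        return None  # no captures archived for this URL
    row = dict(zip(cdx_rows[0], cdx_rows[1]))
    return f"https://web.archive.org/web/{row['timestamp']}/{row['original']}"

# Example CDX response (shape is the API's; the values are made up):
rows = [
    ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
    ["com,example)/", "20230105123000", "http://example.com/", "text/html", "200", "ABC", "512"],
]
```

The returned archive URL is what the researchers' pipeline would then download as raw HTML for text extraction and detection.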
[3]
Dead Internet? A Third of New Websites Are AI-Generated, Says Stanford - Decrypt
At 35% AI prevalence, model collapse risk shifts from a theoretical concern to an empirical one for the next generation of foundation models.

A new study has a number for how much of the internet is now AI-generated: 35%. That's the share of newly published websites classified as AI-generated or AI-assisted by mid-2025, according to research from Stanford University, Imperial College London, and the Internet Archive. The figure was essentially zero before ChatGPT launched in November 2022.

"I find the sheer speed of the AI takeover of the web quite staggering," Jonáš Doležal, researcher at Imperial College London and co-author of the paper, told 404 Media. "After decades of humans shaping it, a significant portion of the internet has become defined by AI in just three years."

The study, titled "The Impact of AI-Generated Text on the Internet," drew on 33 months of website snapshots from the Internet Archive's Wayback Machine and used an AI text detector called Pangram v3 to classify each page.

The confirmed harms: vibes, not facts

Researchers tested six hypotheses about what AI content does to the web. Only two held up against the data.

The first: we're turning into a horde of dumb NPCs acting in the same way... Or, more scientifically put, the web is becoming less semantically diverse. AI-generated sites showed pairwise semantic similarity scores 33% higher than human-written ones. The same ideas keep getting expressed in nearly the same ways. The paper suggests the online Overton window may be narrowing, not through censorship or coordinated campaigns, but because language models optimize for outputs close to their training distribution.

The second: the web is getting aggressively cheerful. AI content showed positive sentiment scores more than 107% higher than human content.
Researchers tie this to the well-documented sycophantic tendencies of LLMs -- trained on human approval signals, they produce text that feels sanitized, friction-free, and relentlessly upbeat. An internet flooded with cheerful, homogenized content may marginalize human dissent at scale without anyone pulling a lever.

Despite widespread public belief, the study found no statistically significant evidence that AI content is making the internet less factually accurate. Researchers found no meaningful correlation between AI prevalence and factual error rate.

The stylistic monoculture hypothesis -- AI flattening individual voices into a generic uniform register -- was the belief respondents held most strongly (83% agreed). The data didn't confirm it. Character-level analysis found no statistically significant increase in stylistic homogeneity tied to AI prevalence.

The broader stakes go beyond discourse quality. At 35% AI prevalence, the theoretical risk of model collapse -- where future models degrade after training on AI-generated data -- shifts from academic concern to empirical reality. Future foundation models trained on contemporary web crawls will inevitably ingest data that is substantially AI-generated and measurably less semantically diverse.

The team is now working with the Internet Archive to turn the study into a continuous, live monitoring tool, tracking AI's share of the web in real time rather than as a one-off snapshot.

A U.S. survey conducted alongside the study found most Americans already believe all six negative hypotheses, including the ones the data doesn't support. People who use AI infrequently were 12% more likely to believe in the harms than frequent users.

Dead Internet Theory believers, meet the data: the internet isn't dead, but 35% of what's new is probably zombie content in some way.
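The "pairwise semantic similarity" signal behind the semantic-contraction finding can be illustrated with a toy version: embed each page, then average cosine similarity over all page pairs. The paper doesn't publish its embedding model, so a simple bag-of-words vector stands in for a real sentence embedding here; function names are illustrative.

```python
# Toy pairwise-similarity measure: higher mean similarity across a sample
# of pages suggests the same ideas are being expressed in the same ways.
from collections import Counter
from itertools import combinations
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def mean_pairwise_similarity(texts: list) -> float:
    """Average cosine similarity over all pairs of documents in the sample."""
    vecs = [Counter(t.lower().split()) for t in texts]
    pairs = list(combinations(vecs, 2))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)
```

A sample of near-duplicate pages scores near 1.0 while a sample of unrelated pages scores near 0.0, so a 33% gap between AI and human samples would show up directly in this statistic.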
[4]
One-third of new websites are now AI-generated -- here's how to spot them
Summary

A new study has found that about 35% of new websites published by mid-2025 are AI-generated. Over 20% of these sites are fully AI-generated. This shift has resulted in a reduction in unique ideas and an increase in artificial positivity.

It looks like proponents of the Dead Internet Theory may have been onto something -- a new study has found that, since 2022, a third of new websites are AI-generated. The study was performed by a team of researchers from Imperial College London, the Internet Archive, and Stanford University.

The study's findings

AI everywhere

To arrive at these conclusions, the researchers used the Internet Archive's Wayback Machine to examine a sample of websites published from mid-2022 to mid-2025, with an eye on the launch of ChatGPT in late 2022. What they found is eye-opening, if not entirely surprising: the percentage of AI-generated websites has risen dramatically, from zero before ChatGPT's launch to around 35% by mid-2025. Over 20% of these sites are fully AI-generated.

The study also looked at how these AI-generated sites have impacted the quality of content on the web. This research was based on common criticisms of AI writing -- for example, that it can feel sanitized, generic, and lacking in unique viewpoints. The team tested six hypotheses:

Semantic contraction: "As AI text becomes more common on the internet, the range of unique ideas and diverse viewpoints shrinks."

Truth decay: "As AI content becomes more common on the internet, I am encountering factually incorrect information and hallucinations more frequently."

Positivity shift: "As AI content becomes more common on the internet, online writing feels increasingly sanitized and artificially cheerful."
Epistemic islands: "As AI content becomes more common on the internet, articles are increasingly providing answers without including links to external sources."

Entropy dilution: "As AI content becomes more common on the internet, content is becoming significantly longer in word count while having lower semantic density."

Stylistic monoculture: "As AI content becomes more common on the internet, distinct individual writing styles are disappearing in favor of a generic, uniform voice."

The results

Surprisingly, only two of the six hypotheses proved true: #1 and #3. Based on the websites studied, AI seemed to be making the internet less diverse, with fewer unique viewpoints and ideas, and more artificially positive. Thankfully, the researchers did not find that factual accuracy is decreasing.

How to spot AI sites

ChatGPT still isn't a good writer

At the end of the study period (mid-2025), the percentage of AI-generated sites was trending up sharply. The obvious takeaway is that, going forward, readers who prefer human-generated content need to be careful about their sources for news and information. Here are a few ways to ensure you're getting factual content created by actual people:

Look at bylines: Look for articles that list an actual author. You can do a quick search to verify that they seem to be an actual person (it can be hard to tell these days) and that their expertise is genuine.

Pay attention to the feel of the writing: ChatGPT might be good at planning, but it just isn't a very good writer.
AI content tends to have a certain tone to it -- often lacking depth and personal insight. Sentences can feel overly generic and may not always make a ton of sense. Many people consider the presence of em dashes to be a red flag, but this is not a good indicator -- these are very frequently used by real authors (including yours truly).

Be wary of large volumes of content: Sites that seem to push out very large volumes of content might be using AI to write it. Unless it's a very large site that's known for having a large team of authors, this is probably a red flag.

Check more than one source: Whether you're suspicious of AI or not, it's always a good idea to verify claims. Fact-check key information and try to get your news from more than one source whenever possible.

Have you encountered an uptick in obvious AI content? Does it affect your reading habits? Let us know in the comments!
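The "entropy dilution" hypothesis listed earlier in this article pairs longer word counts with lower semantic density. One crude density proxy, purely a stand-in for whatever measure the study actually used, is the Shannon entropy of a page's word distribution divided by its length, so padded, repetitive text scores lower than varied text.

```python
# Toy semantic-density proxy for the "entropy dilution" hypothesis:
# Shannon entropy of the word distribution, normalized by word count.
# Repetitive filler drives this toward zero; varied wording raises it.
from collections import Counter
import math

def density(text: str) -> float:
    """Per-word Shannon entropy of the text's word distribution."""
    words = text.lower().split()
    if not words:
        return 0.0
    n = len(words)
    counts = Counter(words)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / n
```

Under this proxy, a page that repeats the same phrases at length scores lower than a shorter page saying distinct things, which is exactly the pattern the hypothesis predicts for AI-padded content.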
Researchers from Stanford University, Imperial College London, and the Internet Archive analyzed 33 months of web data and found that 35% of newly published websites by mid-2025 are AI-generated or AI-assisted. The study confirms the internet is becoming less diverse in ideas and increasingly cheerful in tone, though concerns about factual accuracy weren't validated.
The Dead Internet Theory, once dismissed as conspiracy, is inching toward reality. A comprehensive study titled "The Impact of AI-Generated Text on the Internet" reveals that 35.3% of all newly published websites by mid-2025 were created with AI assistance, with 17.6% being completely AI-generated [1]. This figure represents a dramatic shift from essentially zero before ChatGPT launched in November 2022 [3].
Researchers from Stanford University, Imperial College London, and the Internet Archive partnered to examine this transformation of the digital landscape. "I find the sheer speed of the AI takeover of the web quite staggering," Jonáš Doležal, an AI researcher at Stanford and co-author of the paper, told 404 Media. "After decades of humans shaping it, a significant portion of the internet has become defined by AI in just three years" [2].

The research team used the Internet Archive's Wayback Machine to pull samples of websites from 33 months between August 2022 and May 2025. For each sampled URL, they retrieved the oldest available archived snapshot and extracted raw HTML for analysis [2]. The team then deployed AI detection tools, ultimately selecting Pangram v3 for its superior detection rate in identifying AI-created content.
This finding aligns with previous reports on bot activity. Cloudflare reported in September 2025 that nearly one-third of all internet traffic was driven by bots, while data security company Imperva claimed that automated surfing surpassed human activity for the first time in 2024, making up roughly half of all web traffic [1].

The researchers tested six common critiques about AI-generated text to understand its impact on the web. Only two hypotheses proved true under scrutiny. The first confirmed concern involves semantic contraction: AI-generated websites showed pairwise semantic similarity scores 33% higher than human-written ones [3]. Language models optimize for outputs close to their training distribution, causing the same ideas to be expressed in nearly identical ways. This represents a measurable reduction in the diversity of ideas and unique viewpoints across the internet.

The second validated hypothesis centers on what researchers call the "positivity shift." AI-generated content showed positive sentiment scores more than 107% higher than human content [3]. This sanitized and artificially cheerful writing stems from the well-documented sycophantic tendencies of language models, which are trained on human approval signals and produce friction-free, relentlessly upbeat text.

Even OpenAI CEO Sam Altman acknowledged this phenomenon. After the company released its AI coding agent, Altman admitted that intense praise on the subreddit r/Claudecode felt somewhat bot-driven [1]. An internet flooded with artificially positive content may marginalize human dissent at scale without deliberate intervention.
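The study's sentiment model isn't specified, so one simple way to illustrate how a "positivity shift" could be quantified is a lexicon-based score: the rate of positive words minus the rate of negative words per page. The tiny hand-rolled lexicon below is purely illustrative.

```python
# Toy lexicon-based sentiment score: positive-minus-negative word rate.
# A corpus-level positivity shift would appear as a rising average score
# across monthly samples of pages. The word lists are illustrative only.
POSITIVE = {"great", "amazing", "delighted", "wonderful", "exciting"}
NEGATIVE = {"bad", "broken", "worse", "terrible", "failed"}

def sentiment_score(text: str) -> float:
    """Net positive-word rate, normalized by text length; 0.0 for empty text."""
    words = text.lower().split()
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)
```

Averaging such a score over each month's sample, and comparing AI-classified pages against human-classified ones, is the kind of aggregate comparison behind the reported 107% gap.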
Surprisingly, the study found no statistically significant evidence that AI content is degrading factual accuracy on the internet. "The most surprising result was that our Truth Decay hypothesis wasn't confirmed," Doležal said. "We were specifically looking for an increase in verifiably untrue statements, which we didn't find" [2]. Researchers paid human fact-checkers to verify claims extracted from AI-generated websites and found no meaningful correlation between AI prevalence and factual error rates.

The team also tested whether AI was creating content without citing sources, whether it produced low semantic density text, and whether it forced writing styles into a monoculture. None of these hypotheses were confirmed by the data [1].
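The methodology behind these verdicts reduces each hypothesis to a monthly signal (for example, the factual-error rate) checked for correlation against the monthly aggregate AI-likelihood score. A minimal version of that test is a Pearson correlation over the 33 monthly data points; the sketch below uses illustrative variable names, not the paper's code.

```python
# Minimal hypothesis test in the study's style: correlate a monthly
# signal (e.g. factual-error rate) with the monthly AI-likelihood score.
# A correlation near zero, as found for Truth Decay, fails to confirm
# the hypothesis; a strong positive correlation would support it.
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0
```

With one (ai_likelihood, error_rate) pair per month, `pearson` near zero across the 33 months is exactly the "no meaningful correlation" result reported for factual accuracy.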
At 35% AI prevalence, the model collapse risk, where future models degrade after training on AI-generated data, shifts from theoretical concern to empirical reality [3]. Future foundation models trained on contemporary web crawls will inevitably ingest data that is substantially AI-generated and measurably less semantically diverse. This creates a feedback loop where language models train on their own outputs, potentially degrading performance over time.

The research team plans to expand their work beyond this initial snapshot. "We're now working with the Internet Archive to turn this into a continuous tool that keeps providing this signal going forward, rather than a single fixed snapshot bounded by the static nature of a paper," Maty Bohacek, a student researcher at Stanford and co-author, told 404 Media [2]. The tool will track which kinds of websites are most affected, broken down by category or language, providing a more nuanced understanding of where impacts are landing.

A U.S. survey conducted alongside the study found most Americans already believe all six negative hypotheses about AI content, including those the data doesn't support. People who use AI infrequently were 12% more likely to believe in the harms than frequent users. As AI detection tools improve and monitoring continues, readers seeking human-generated content will need to verify sources, check bylines for actual authors, and remain skeptical of sites producing large volumes of content [4].

Summarized by Navi