Chatbots obsessed with Elias Thorne reveal hidden flaws in AI training data and model collapse

Reviewed by Nidhi Govil

3 Sources

A mysterious character named Elias Thorne appears in 26.5% of AI-generated stories across ChatGPT, Claude, and Gemini. Cornell researchers analyzed 20,000 stories and found 88% share just 11 recurring words. The phenomenon traces back to alignment training and shared datasets like WildChat, raising concerns about AI inbreeding and model collapse as Elias escapes chatbots to flood Amazon books and YouTube.

Elias Thorne Dominates AI-Generated Stories Across Major Platforms

A peculiar pattern has emerged across large language models: ask ChatGPT, Claude, or Gemini to tell you a story, and there's a strong chance you'll meet Elias Thorne. Software engineer Daniel May first noticed this phenomenon earlier this year, observing that Google Trends showed virtually no searches for "Elias Thorne" until late 2025, with searches spiking dramatically in early 2026 [2]. When May tested chatbots including Grok, DeepSeek, and Gemini with the simple prompt "tell me a story," the models frequently generated narratives about lighthouses, clockmakers, or explorers, often featuring Elias as the protagonist [2].

Source: Gizmodo

Researchers Sil Hamilton and David Mimno at Cornell University decided to investigate this strange fixation systematically. Their paper, "Elias in the Lighthouse, Again?", published on arXiv in late May, analyzed 20,000 stories generated by OpenAI's ChatGPT, Anthropic's Claude, Google's Gemini, and the Allen Institute for AI's chatbot [2]. The findings were striking: the name Elias appeared in 26.5% of all generated stories, while 88% shared the same 11 words, including names like Elias, Mara, and Elara, alongside occupations such as lighthouse keeper, clockmaker, librarian, fisherman, baker, mayor, and conductor [1]. No combination appeared more frequently than Elias the lighthouse keeper, which showed up in two-thirds of all stories [1].
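To make the headline percentages concrete, here is a minimal Python sketch of how such shares could be computed over a set of stories. It is an illustration only, not the Cornell researchers' code: the function name `marker_stats` and the `MARKERS` list are assumptions, and the list contains only the ten words the article happens to name rather than the study's full set of 11.

```python
import re
from collections import Counter

# Assumed marker list: only the words named in the article, not the
# study's actual 11-word set.
MARKERS = [
    "elias", "mara", "elara", "lighthouse keeper", "clockmaker",
    "librarian", "fisherman", "baker", "mayor", "conductor",
]


def marker_stats(stories, markers=MARKERS):
    """Return (share of stories with any marker, share per marker)."""
    # Whole-word, case-insensitive patterns so "mayor" does not match "mayoral".
    patterns = {
        m: re.compile(r"\b" + re.escape(m) + r"\b", re.IGNORECASE)
        for m in markers
    }
    total = len(stories)
    if total == 0:
        return 0.0, {m: 0.0 for m in markers}

    per_marker = Counter()
    stories_with_any = 0
    for text in stories:
        hits = [m for m, pattern in patterns.items() if pattern.search(text)]
        per_marker.update(hits)  # each marker counted once per story
        if hits:
            stories_with_any += 1

    shares = {m: per_marker[m] / total for m in markers}
    return stories_with_any / total, shares


if __name__ == "__main__":
    sample = [
        "Elias kept the lighthouse.",
        "Mara the baker woke early.",
        "A dragon slept.",
        "The clockmaker Elias smiled.",
    ]
    any_share, shares = marker_stats(sample)
    print(f"stories with any marker: {any_share:.1%}")
    print(f"stories naming Elias: {shares['elias']:.1%}")
```

Each marker is counted at most once per story, so the result is the share of stories in which a word appears, which is how the article's figures are phrased, rather than a raw word count.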

AI Training Data and the WildChat Connection

The researchers initially considered whether pre-training data might explain the repetitive AI content, but quickly ruled this out when they found no evidence that "Elias the lighthouse keeper" appears with unusual frequency in literature or pre-training datasets [1]. Instead, they traced the issue to specific datasets that have become widely adopted across AI labs, particularly WildChat, an open-source collection of 1 million real conversations with a GPT-3.5-powered chatbot [2].

Hamilton explained that "model development today is like a big family tree," with OpenAI's GPT-3.5 serving as the root because it generated WildChat, which has since been used to create other training sets [2]. Within WildChat, 166 conversations contain the name Elias written in the familiar "lighthouse" style [2]. The researchers believe alignment training designed to steer models away from copyrighted content and adult material inadvertently elevated "safe" alternatives like Elias the lighthouse keeper to unusual prominence [1]. Hamilton described the spread as viral: "Models trained on WildChat copied this style, and developers unwittingly replicated it when using those models to generate newer datasets. It's like a virus" [2].

From Chatbots to Amazon: Elias Escapes Containment

The implications extend far beyond chatbot conversations. Daniel May discovered that Elias Thorne now appears as an author on Amazon, with bylines on books ranging from alternative cancer treatment handbooks to YouTube algorithm guides, Greek mythology texts, and psychological thrillers [2]. May noted that "no human writes all of those," highlighting how the mode-collapsed name has migrated from chat windows to bylines across genres, including content where "bad advice causes real harm" [2].

Searches on Amazon reveal Elias as a protagonist in fantasy series, described as "a brilliant but cynical archaeologist," and even as a musical artist producing ambient albums of birds and nature sounds [2]. The character has also infiltrated YouTube slop, appearing in videos like "83-year-old Sergeant Major Elias Thorne" and on AI-generated news sites with fabricated stories about snake museum owners and wealthy Ohioans [2]. This proliferation illustrates how AI-generated books have flooded Amazon's self-publishing platform, spreading dangerous misinformation and making librarians' jobs increasingly difficult [2].

Model Collapse and AI Inbreeding Threaten Future Quality

The Elias Thorne phenomenon points to a larger concern about model collapse, also known as AI inbreeding [3]. As more internet content becomes AI-generated, future models trained on this material will learn from increasingly low-quality data, producing even worse output in a degrading cycle [3]. The Cornell researchers' findings align with previous studies showing that AI creativity remains fundamentally limited: a 2024 study found image generation models repeatedly produce images falling into just 12 specific motifs regardless of prompt variety [1].
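The degrading cycle can be illustrated with a toy simulation. The sketch below is not drawn from any of the cited studies, and the function name `simulate_resampling` and all of its parameters are hypothetical. It assumes each "generation" of a model simply learns character-name frequencies from a finite sample of the previous generation's output. Under that assumption, a name that fails to appear in one sample gets zero weight and can never return, so the variety of names can only shrink or hold steady.

```python
import random
from collections import Counter


def simulate_resampling(names, weights, generations=10, sample_size=50, seed=0):
    """Toy model: each generation re-estimates name frequencies from a
    finite sample of the previous generation's output.

    Returns the number of distinct names still in circulation at each
    generation, starting with the initial pool.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    distinct_per_generation = [sum(1 for w in weights if w > 0)]

    for _ in range(generations):
        # "Generate" a corpus from the current model.
        corpus = rng.choices(names, weights=weights, k=sample_size)
        counts = Counter(corpus)
        # "Train" the next model on that corpus: names that were never
        # sampled get weight 0 and are lost for good.
        weights = [counts[name] for name in names]
        distinct_per_generation.append(len(counts))

    return distinct_per_generation


if __name__ == "__main__":
    pool = ["Elias"] + [f"name_{i}" for i in range(19)]
    start_weights = [5] + [1] * 19  # one name starts out over-represented
    history = simulate_resampling(pool, start_weights, generations=30, sample_size=40)
    print(history)
```

Real training pipelines are far more complex, and this sketch ignores fresh human-written data entering the mix, but it shows why rare patterns are the first casualties when models learn mostly from model output.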

The ubiquity of Elias Thorne across AI-generated stories reveals how AI safety measures and shared training datasets can create unexpected consequences. When models learn from each other and pull from limited pools of "safe" content to avoid copyrighted characters, quirks replicate rapidly across the AI ecosystem [3]. This matters because it suggests that large language models lack genuine creativity, instead recycling narrow patterns that spread like a contagion through interconnected training pipelines. Watch for continued degradation in content quality as AI-generated material increasingly trains future models, and scrutinize self-published books and online content for telltale signs of algorithmic authorship spreading misinformation under recycled names.

© 2026 TheOutpost.AI All rights reserved