Encyclopedia Britannica sues OpenAI for massive copyright infringement over ChatGPT training

Reviewed by Nidhi Govil

Encyclopedia Britannica and Merriam-Webster have filed a lawsuit against OpenAI, alleging the AI company scraped nearly 100,000 copyrighted articles without permission to train its large language models. The publishers claim ChatGPT generates verbatim reproductions of their content, cannibalizes web traffic, and falsely attributes AI-generated hallucinations to Britannica, threatening the public's access to trustworthy information.

Encyclopedia Britannica Takes Legal Action Against OpenAI

Encyclopedia Britannica and its subsidiary Merriam-Webster have launched legal action against OpenAI in Manhattan federal court, alleging massive copyright infringement in what marks another significant challenge to AI companies' training practices [1]. The lawsuit claims that OpenAI scraped nearly 100,000 copyrighted articles from the publisher's online encyclopedia and dictionary entries, without authorization or compensation, to train the large language models (LLMs) that power ChatGPT [5]. Britannica, which retains copyright to these extensive reference materials, argues that using copyrighted content in this manner constitutes unlawful appropriation of its intellectual property [1].

Source: MediaNama

ChatGPT Produces Verbatim Reproductions of Protected Content

The complaint includes side-by-side examples demonstrating how ChatGPT generates responses containing "full or partial verbatim reproductions" of Britannica's content, with entire passages appearing to match word for word [3]. According to the lawsuit, "ChatGPT then provides narrative responses to user queries that often contain verbatim or near-verbatim reproductions, summaries or abridgements of original content, including [Britannica's] copyrighted works" [2]. The publishers also accuse OpenAI of violating their copyrights when it uses their articles in ChatGPT's Retrieval-Augmented Generation (RAG) workflow, which scans the web or other databases for newly updated information when responding to queries [1].

Source: CXOToday

Cannibalizing Web Traffic and Revenue Streams

Britannica alleges that ChatGPT is cannibalizing web traffic by generating responses that "substitute, and directly compete with" the publisher's content rather than directing users to its website as traditional search engines would [3]. "ChatGPT starves web publishers like [Britannica] of revenue by generating responses to users' queries that substitute, and directly compete with, the content from publishers like [Britannica]," the lawsuit states [1]. The complaint emphasizes that OpenAI reproduces "web publishers' copyrighted content without authorization or remuneration," thereby limiting the traffic and revenue Britannica generates from its online properties [4].

Trademark Infringement and AI-Generated Hallucinations

Beyond copyright infringement, Britannica accuses OpenAI of trademark infringement under the Lanham Act when ChatGPT produces AI-generated hallucinations and falsely attributes them to the publisher [1]. According to the complaint, ChatGPT often omits portions of Britannica's explanations and wrongly attributes incomplete and inaccurate responses to the publisher [4]. Use of Britannica's trademarks in this manner "deceives users into believing that the hallucinations and/or undisclosed omissions" are approved by the publisher, the complaint adds [4]. Britannica warns that these hallucinations jeopardize "the public's continued access to high-quality and trustworthy online information" [1].

Growing Wave of Publisher Lawsuits Against AI Companies

Britannica joins numerous publishers and writers pursuing legal action against OpenAI over copyright issues. The New York Times, Ziff Davis (owner of Mashable, CNET, IGN, PC Mag, and others), and more than a dozen newspapers across the US and Canada, including the Chicago Tribune, the Denver Post, the Sun-Sentinel, the Toronto Star, and the Canadian Broadcasting Corporation, have sued OpenAI [1]. A similar lawsuit that Britannica filed against Perplexity remains pending [1]. In September, Anthropic settled a class action lawsuit over its use of copyrighted books to train AI models, resulting in a $1.5 billion payout to authors [3].

Fair Use Defense and Uncertain Legal Precedent

OpenAI maintains that using publicly available data to train AI models falls under fair use. An OpenAI spokesperson told CNET: "Our models empower innovation, and are trained on publicly available data and grounded in fair use" [2]. AI companies have consistently argued that their systems make fair use of copyrighted content by transforming it into something new [5]. However, there is still little strong legal precedent on whether using copyrighted content as training data constitutes infringement. In one notable case, Anthropic convinced federal judge William Alsup that using content as training data is transformative enough to be legal. Alsup nevertheless ruled that Anthropic violated the law by illegally downloading millions of books rather than paying for them, which led to the $1.5 billion class action settlement [1]. Britannica seeks damages, restitution of profits, and an injunction blocking OpenAI's alleged unlawful activities [4].

Source: The Next Web

TheOutpost.ai

© 2026 Triveous Technologies Private Limited