20 Sources
[1]
The dictionary sues OpenAI | TechCrunch
Encyclopedia Britannica and Merriam-Webster have filed a lawsuit against OpenAI, alleging in its complaint that the AI giant has committed "massive copyright infringement." Britannica, which owns Merriam-Webster, retains the copyright to nearly 100,000 online articles, which have been scraped and used to train OpenAI's LLMs without permission, the publisher alleges in the lawsuit. Britannica also accuses OpenAI of violating copyright laws when it generates outputs that contain "full or partial verbatim reproductions" of its content and when the AI lab uses its articles in ChatGPT's RAG (retrieval augmented generation) workflow. OpenAI's RAG tool is how the LLM scans the web or other databases for newly updated information when responding to a query. Britannica also alleges that OpenAI violates the Lanham Act, a trademark statute, when it generates made-up hallucinations and attributes them falsely to the publisher. "ChatGPT starves web publishers like [Britannica] of revenue by generating responses to users' queries that substitute, and directly compete with, the content from publishers like [Britannica]," the lawsuit reads. Britannica also alleges ChatGPT's hallucinations jeopardize "the public's continued access to high-quality and trustworthy online information." Britannica joins a number of other publishers and writers in pursuing legal action against OpenAI over copyright issues. The New York Times, Ziff Davis (owner of Mashable, CNET, IGN, PC Mag, and others), and more than a dozen newspapers across the US and Canada, including the Chicago Tribune, the Denver Post, the Sun-Sentinel, the Toronto Star, and the Canadian Broadcasting Corporation have sued OpenAI. A similar Britannica lawsuit against Perplexity is still pending. There is no strong legal precedent establishing whether using copyrighted content to train an LLM is copyright infringement.
But in one particular instance, Anthropic successfully convinced federal judge William Alsup that this use case -- using the content as training data -- is transformative enough to be legal. However, Alsup argued that Anthropic violated the law by illegally downloading millions of books, rather than paying for them, which warranted a $1.5 billion class action settlement for impacted writers. OpenAI did not respond to TechCrunch's request for comment before publication.
[2]
Encyclopedia Britannica and Merriam-Webster Sue OpenAI
Two old-school reference works are going up against a very modern tech product. Encyclopedia Britannica and its subsidiary, Merriam-Webster, have sued OpenAI, alleging the tech company used Britannica's content to train AI models without permission. The lawsuit said OpenAI's chatbot, ChatGPT, has copied Britannica's copyrighted content to train its large language models. "ChatGPT then provides narrative responses to user queries that often contain verbatim or near-verbatim reproductions, summaries or abridgements of original content, including [Britannica's] copyrighted works," the lawsuit alleges. (Disclosure: Ziff Davis, CNET's parent company, in 2025 filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.) The lawsuit said ChatGPT-based products' summaries of Encyclopedia Britannica's content cannibalize traffic, and that OpenAI reproduces "web publishers' copyrighted content without authorization or remuneration." The lawsuit from Encyclopedia Britannica and Merriam-Webster continues a trend of content owners suing AI companies for copyright infringement. Anthropic and Meta last year won lawsuits under the fair use exception that allows them to use copyrighted content without creators' permission. Britannica also filed a lawsuit against Perplexity last year, which is still pending. In relation to the new lawsuit, an OpenAI spokesperson told CNET via email: "Our models empower innovation, and are trained on publicly available data and grounded in fair use." Encyclopedia Britannica did not immediately respond to a request for comment.
[3]
Encyclopedia Britannica is suing OpenAI for allegedly 'memorizing' its content with ChatGPT
The lawsuit goes on to include examples of responses from OpenAI's models side by side with Britannica's text, in which entire passages appear to match word for word. Britannica also claims that OpenAI has been "cannibalizing" its web traffic by generating responses that "substitute, or directly compete" with Britannica's content, rather than directing users to its website the way a traditional search engine would. It's the latest in a growing series of copyright lawsuits from publishers aimed at AI companies over the past several years. The New York Times has made similar claims in its ongoing lawsuit against OpenAI, including accusing the AI company of copying mass amounts of its copyrighted content. In September, Anthropic settled a class action lawsuit for using copyrighted books to train its AI models, resulting in a $1.5 billion payout to the books' authors.
[4]
Encyclopedia Britannica Sues OpenAI Over Alleged Copyright Infringement
Encyclopedia Britannica has become the latest publisher to sue OpenAI for alleged copyright infringement. In a complaint filed last week, Britannica and its subsidiary, Merriam-Webster, have accused the ChatGPT-maker of scraping nearly 100,000 copyrighted articles for AI training. Britannica also claims that ChatGPT provides responses similar to its own content, thereby limiting the web traffic and revenue it generates. The publisher has included screenshots to show how closely ChatGPT's responses resemble its content. "ChatGPT starves web publishers like Plaintiffs of revenue by generating responses to users' queries that substitute, and directly compete with, the content from publishers like Plaintiffs," the lawsuit says. "To build its substitute products, Defendants engage in massive copying of Plaintiffs' and other web publishers' copyrighted content without authorization or remuneration." Similar claims can be found in lawsuits filed by The New York Times and PCMag's parent company, Ziff Davis, against OpenAI. Britannica also accuses OpenAI of trademark infringement. According to the complaint, ChatGPT often omits portions of Britannica's explanations and wrongly attributes its incomplete and inaccurate responses to the publisher. Use of Britannica's trademarks in this manner "deceives users into believing that the hallucinations and/or undisclosed omissions" are approved by the publisher, the complaint adds. With this lawsuit, Britannica seeks damages, restitution of profits, and a ban on OpenAI's alleged unlawful activities. The company has also filed a similar lawsuit against Perplexity. OpenAI, meanwhile, says it hasn't obtained content illegally. "Our models empower innovation, and are trained on publicly available data and grounded in fair use," a company spokesperson said in response to the lawsuit.
Disclosure: Ziff Davis, PCMag's parent company, filed a lawsuit against OpenAI in April 2025, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.
[5]
Encyclopedia Britannica sues OpenAI over AI training
March 16 (Reuters) - Encyclopedia Britannica and its Merriam-Webster subsidiary have sued OpenAI in Manhattan federal court for allegedly misusing their reference materials to train its artificial intelligence models. Britannica said in the complaint filed on Friday that Microsoft-backed OpenAI used its online articles and encyclopedia and dictionary entries to teach its flagship chatbot ChatGPT to respond to human prompts and "cannibalized" Britannica's web traffic with AI-generated summaries of its content. Spokespeople for the companies did not immediately respond to requests for comment on the complaint on Monday. The case is one of many high-stakes lawsuits filed by copyright owners including authors and news outlets against tech companies for using their material to train AI systems without permission. Britannica filed a related lawsuit against artificial intelligence startup Perplexity AI last year that is still ongoing. AI companies have argued that their systems make fair use of copyrighted content by transforming it into something new. Britannica's lawsuit said that OpenAI unlawfully copied nearly 100,000 of its articles to train GPT large language models. The complaint said that ChatGPT produces "near-verbatim" copies of Britannica's encyclopedia entries, dictionary definitions and other content, diverting users who would otherwise visit its websites. Britannica also accused OpenAI of infringing its trademarks by implying that it has permission to reproduce its material and wrongfully citing Britannica in false AI "hallucinations." Britannica requested an unspecified amount of monetary damages and a court order blocking the alleged infringement.
Reporting by Blake Brittain in Washington; Editing by Kirsten Donovan
[6]
Encyclopedia Britannica sues OpenAI for copyright and trademark infringement
OpenAI has been hit with another lawsuit. This time, Encyclopedia Britannica took legal action against OpenAI, accusing the company of copyright and trademark infringements, as first reported by Reuters. More specifically, Britannica alleged that OpenAI illegally used its "copyrighted content at a massive scale" when training its AI models. Beyond training, the encyclopedia company claimed that ChatGPT's responses to user queries sometimes contain "full or partial verbatim reproductions of [Britannica's] copyright articles." Along with claims of copyright violations, Britannica argued that OpenAI was also responsible for trademark infringement. According to the lawsuit, ChatGPT generates "made-up content or 'hallucinations' and falsely attributes them" to Encyclopedia Britannica. The lawsuit doesn't specify an amount for monetary damages, but Britannica is also seeking an injunction to prevent OpenAI from repeating the alleged infringements. When reached for comment, a spokesperson for OpenAI told Engadget that, "ChatGPT helps enhance human creativity, advance scientific discovery and medical research, and enable hundreds of millions of people to improve their daily lives. Our models empower innovation, and are trained on publicly available data and grounded in fair use." It's not the first time that Britannica has filed a lawsuit against an AI company. In September, the company, which owns Merriam-Webster, also sued Perplexity for similar reasons. On the other side, OpenAI is still embroiled in a legal battle with The New York Times, which also sued the AI giant for copyright infringement.
[7]
Encyclopedia Britannica Sues OpenAI Over AI Training Data. Is Grokipedia Next?
Encyclopedia Britannica and Merriam-Webster say ChatGPT is 'cannibalizing' their traffic. Encyclopedia Britannica wants ChatGPT to stop copying its work. The publisher of the world's longest-running English-language encyclopedia, along with its subsidiary Merriam-Webster, is suing OpenAI, accusing the company behind ChatGPT of copyright infringement. Britannica filed the lawsuit in federal court in Manhattan on Friday. The complaint alleges that OpenAI and its AI products are free-riding on the encyclopedia's and dictionary's "trusted, high-quality content" while cannibalizing traffic to their websites. Britannica claims that OpenAI pilfered its content, including nearly 100,000 online articles, to train its AI models and generate answers that "copy or mimic, sometimes verbatim," its material. The company also takes issue with instances where the chatbot attributes AI hallucinations to Britannica. The encyclopedia is seeking damages and restitution of profits, and is asking the court to stop OpenAI from engaging in the alleged unlawful conduct. This is far from the first copyright fight OpenAI has found itself in. The company is already facing similar lawsuits from media outlets, including The New York Times, The Intercept, and U.S. News & World Report. Its rival, Anthropic, also recently settled a case with several authors and publishers over the alleged use of 7 million pirated books, reportedly for $1.5 billion. And Britannica isn't new to squaring off with AI companies in the courtroom either. The publisher also sued Perplexity in September over similar copyright claims. So it's a bit puzzling how Elon Musk's Grok and its online encyclopedia, Grokipedia, have so far avoided the kind of legal scrutiny facing other AI companies. Musk launched Grokipedia last October as an alternative to Wikipedia. In a post in September, Musk said Grokipedia would be "a massive improvement over Wikipedia."
He has also repeatedly mocked the online encyclopedia as "Wokipedia," arguing that there is no major alternative that reflects right-wingers' version of reality. His solution was to create a new platform where articles are generated by AI. Much of Grokipedia's content appears to draw heavily from (or outright clone) Wikipedia, though often with framing that tilts toward Musk's political views. So far, Grok's parent company, xAI, has faced more legal trouble over sexualized deepfakes than copyright claims. But the issue may not be far behind. Some authors, including New York Times reporter John Carreyrou, filed a lawsuit in December against xAI alongside Anthropic, OpenAI, Meta, and Google, alleging the companies used their books without permission to train AI models. For its part, xAI tried to block a California law that would require AI companies to disclose information about the data used to train their models, arguing the requirement violates free speech. A judge recently declined to side with the company. It seems like Musk isn't too keen to reveal Grok's data sources and potentially open the company up to even more lawsuits. The reality is that xAI is a fairly new competitor in the AI landscape, having been founded in 2023. OpenAI has had an extra eight years to earn all that litigation. So, just give Grok some time. It'll catch up.
[8]
Encyclopedia Britannica and Merriam-Webster sue OpenAI
Filed in New York on 13 March 2026, the complaint accuses OpenAI of using the reference publishers' content as AI training data without permission, then generating responses that reproduce it verbatim, six months after the same companies sued Perplexity on nearly identical grounds. Encyclopedia Britannica and Merriam-Webster have filed a copyright and trademark lawsuit against OpenAI in the US District Court for the Southern District of New York, alleging that ChatGPT has been trained on and continues to reproduce their copyrighted content without authorisation, to the material detriment of both publishers. The complaint, filed on 13 March 2026 (case no. 1:2026cv02097), accuses OpenAI of using nearly 100,000 of Britannica's online articles as training inputs for its AI language models. The full extent of the copying, the complaint acknowledges, is only known to OpenAI itself. Britannica, which owns Merriam-Webster as a subsidiary, argues that the law does not permit OpenAI's systematic disregard for its intellectual property rights and calls on it to account for the substantial harm it is causing and the profits it is reaping through that infringement. "ChatGPT then provides narrative responses to user queries that often contain verbatim or near-verbatim reproductions, summaries, or abridgements of original content, including plaintiffs' copyrighted works." - from the complaint. The lawsuit is structured around two legal pillars, both of which mirror the framework the same plaintiffs used when they sued AI search engine Perplexity in September 2025. The first is copyright infringement under the Copyright Act of 1976, which gives authors the exclusive right to reproduce and distribute their works. 
Britannica argues OpenAI violated those rights at multiple stages: by scraping its websites to create training inputs, by feeding that content into its models during training, and then by generating outputs that reproduce or closely summarise the originals when users query ChatGPT on topics covered by Britannica's editorial catalogue. The second pillar is trademark law under the Lanham Act. By presenting AI-generated responses, which may contain inaccuracies or hallucinations, alongside Britannica's and Merriam-Webster's famous trademarks and brand identities, the complaint argues that OpenAI misleads users into believing that Britannica or Merriam-Webster has endorsed or is the source of those responses. Britannica's reputation rests on accuracy built over more than 250 years; associating that brand with AI-fabricated information, the complaint contends, causes direct reputational harm that goes beyond copyright loss alone. The underlying commercial argument follows the logic established in the broader wave of publisher lawsuits against AI companies. Britannica's business today is primarily digital, built on subscriptions and advertising revenue that depend on web traffic. When ChatGPT answers a user's question about, say, the causes of the French Revolution or the properties of a chemical element using content sourced from Britannica's articles, those users have less reason to visit Britannica's website directly. The complaint describes ChatGPT as taking a free ride on Britannica's trusted, high-quality content, shifting the value of that content to OpenAI without compensation. Britannica discontinued its 32-volume print edition in 2012, fully pivoting to digital. That transition made the quality and exclusivity of its editorial content, produced by researchers, writers, and editors working to standards the brand has maintained since its founding in Edinburgh in 1768, the central asset of the business. 
The complaint frames OpenAI's alleged copying not merely as a legal violation but as an existential threat to a model that cannot survive if the economic returns from that content flow to AI platforms rather than to its creators. This is not Britannica's first such case. In September 2025, the same plaintiffs, Encyclopedia Britannica and Merriam-Webster, filed an essentially parallel complaint against Perplexity, the AI-powered answer engine. That complaint, also lodged in the SDNY, alleged that Perplexity's system scraped Britannica's content to build its responses in real time, bypassing robots.txt protections and presenting verbatim or near-verbatim reproductions under the guise of AI-generated summaries. The Perplexity case is still proceeding. The OpenAI case is structurally similar, but arrives in a significantly more complex legal landscape. OpenAI is already the subject of a large multidistrict litigation (MDL) in the SDNY, currently overseen by Judge Sidney Stein, that consolidates more than a dozen copyright lawsuits brought by news publishers including the New York Times. That MDL is approaching the close of fact discovery, with no fair use ruling expected before summer 2026 at the earliest. Analysts tracking the litigation landscape, including the ChatGPT Is Eating the World legal tracker, have noted that the Britannica-OpenAI case will very likely be transferred to that MDL and then stayed pending its outcome, meaning a resolution on the merits could be years away. For now, the complaint brings the total number of copyright lawsuits filed against AI companies in the United States to 91, according to the same tracker. OpenAI had not publicly responded to the complaint at the time of writing. The Britannica filing lands in a market where the legal and licensing approaches to AI content are diverging sharply. 
At the same time as publishers are suing, a growing number of media organisations have signed licensing deals with AI companies: News Corp signed a deal with Meta worth up to $50 million per year in March 2026, and UK publisher Reach agreed a usage-based deal with Amazon for its Nova AI model the same month. The Anthropic copyright case (Bartz v. Anthropic), involving the use of pirated books for AI training, reached a $1.5 billion class action settlement in 2025, the largest in the AI copyright litigation wave so far, establishing that these cases can produce significant financial consequences.
[9]
Encyclopedia Britannica just sued OpenAI over ChatGPT -- here's why AI training is under fire (again)
A new lawsuit filed in federal court could have major implications for the future of AI
Encyclopaedia Britannica -- one of the oldest and most respected reference publishers in the world -- has filed a lawsuit against OpenAI, accusing the company of using its copyrighted material to train AI systems like ChatGPT without permission. According to a report from Reuters, the lawsuit was filed in Manhattan federal court and also includes dictionary publisher Merriam-Webster. The companies claim OpenAI used tens of thousands of copyrighted articles while training its models and that AI systems can sometimes reproduce passages that closely resemble the original content. The case adds another major chapter to the growing legal battle over how AI is trained.
Why Britannica is suing OpenAI
In the complaint, Britannica alleges that OpenAI used nearly 100,000 articles from its encyclopedia without licensing the material. Those articles are part of the high-quality reference database that Britannica has built over decades with historians, researchers and subject-matter experts. The lawsuit argues that training AI systems on this material without permission amounts to copyright infringement. Britannica also claims that AI tools like ChatGPT can sometimes generate answers that resemble passages from the encyclopedia, which it says could undermine its business by giving users information without sending them to the original source. In the filing, the companies reportedly asked the court for financial damages and an order preventing OpenAI from using their content in future training.
The bigger battle over AI training data
The case is part of a much larger wave of lawsuits targeting AI companies over the data used to train large language models. Publishers, authors and media organizations have increasingly argued that their work has been used to train AI systems without consent.
One of the most closely watched cases was filed by The New York Times Company, which sued OpenAI over claims that its articles were used to train AI models. Authors including George R. R. Martin and John Grisham have also been part of legal actions related to AI training data. At the same time, the debate has expanded beyond publishers to include everyday AI users. Many AI companies allow people to opt out of having their conversations used to improve future models, reflecting growing concerns about how user data may contribute to training systems. At the heart of these lawsuits is a fundamental question that courts have yet to settle: Is training AI on copyrighted material considered fair use -- or copyright infringement? The answer could determine how future AI models are developed and whether companies will need to license massive amounts of training data.
How to opt out of having your ChatGPT conversations used for training
If you're concerned about how your conversations might be used to improve AI models, OpenAI allows users to turn off training based on their chats. To do this:
* Log in to ChatGPT
* Click your profile icon
* Go to Settings
* Tap Data controls
* Turn off "Improve the model for everyone"
Once disabled, OpenAI says your conversations will not be used to train or improve future versions of the model, though they may still be stored temporarily for safety and abuse monitoring. Many AI companies have introduced similar controls as concerns grow over how user data, published articles and other online content are used to train generative AI systems.
Final thoughts
Britannica's lawsuit stands out because of the type of content involved. Unlike social media posts or scraped web pages, Britannica articles are carefully researched reference material that has long been considered one of the most reliable sources of factual information. That could make the case particularly important as courts try to define the boundaries of AI training.
If courts ultimately rule that training AI on copyrighted reference material requires permission or licensing, it could reshape the economics of building AI systems. It could also influence how AI companies handle training data transparency, licensing deals and user control over how their data is used to improve AI systems. For now, the case is just beginning. But it adds just another high-profile legal challenge to OpenAI's struggle to win the AI race.
[10]
Encyclopedia Britannica Hits OpenAI With Scary Lawsuit
Writers are often cautioned not to hit readers over the head with the dictionary. But what if it's the dictionary that's throwing blows? On Friday, Encyclopedia Britannica and its subsidiary Merriam-Webster struck back against the large language models plundering its tomes of knowledge and sued OpenAI for allegedly using its copyrighted reference materials to train its AI models "at massive scale," after filing a similar suit against Perplexity.AI last year. In the complaint, which Reuters reported on Monday, Britannica claimed that OpenAI unlawfully copied nearly 100,000 of its online articles and encyclopedia and dictionary entries to teach its GPT family of models. ChatGPT will even produce "near-verbatim" copies of its entries and dictionary definitions, it alleged, providing several examples, something that is commonly observed across many chatbots. But more than that, OpenAI "cannibalized" Britannica's web traffic by showing ChatGPT users an AI-generated summary of its content, Britannica said, hurting its bottom line. This argument echoes those raised by journalism outlets and other online sites, which find their traffic being suffocated as more people use AI chatbots instead of a traditional search engine. "ChatGPT starves web publishers like [Britannica] of revenue by generating responses to users' queries that substitute, and directly compete with, the content from publishers like [Britannica]," the encyclopedia maker said in the complaint. Citing a key piece of US trademark law called the Lanham Act, Britannica further accuses OpenAI of violating its trademarks when ChatGPT hallucinates made-up answers and wrongly attributes them to Britannica, which it also says gives the false impression that the usage of its content is approved or sponsored by the encyclopedia.
The complaint joins several other major lawsuits that authors, publishers, and news agencies have filed against AI companies, most of which are still ongoing. Depending on the outcome, they could have seismic implications for how generative AI companies operate. But as it stands, whether it constitutes infringement to use copyrighted content to train AI models, even without permission or compensation, is an open-ended question -- and one frustrated by the fact that AI developers are rarely transparent about where they're sourcing training material for their models. One of the biggest suits that's come to a conclusion so far was by a group of authors against Anthropic. Anthropic, it was revealed, pirated millions of digital books to train its Claude chatbot and scanned and shredded millions more physical ones. The judge ruled that Anthropic's use of the texts to train its AI was "transformative," but said its use of pirated copies was illegal. Anthropic agreed to settle with the authors for $1.5 billion.
[11]
The dictionaries are suing OpenAI for 'massive' copyright infringement, and say ChatGPT is starving publishers of revenue | Fortune
Britannica and Merriam-Webster have filed a lawsuit against OpenAI, alleging that the AI giant has built its $730 billion company on the back of their researched content. In a filing submitted to the Southern District of New York, the companies accuse OpenAI of cannibalizing the traffic and ad revenue that publishers depend on to survive. "ChatGPT starves web publishers, like [the] Plaintiffs, of revenue," the complaint reads. Where a traditional search engine sends users to a publisher's website, Britannica and Merriam-Webster allege ChatGPT instead absorbs the content and delivers a polished answer. It also alleges the AI company fed its LLM with researched and fact-checked work of the companies' hundreds of human writers and editors. The case is the latest in a series accusing AI firms of data theft, raising questions about what counts as public knowledge and what information online should be off-limits for AI use. A group of anonymous individuals sued OpenAI in 2023, alleging that the AI giant stole "vast amounts" of personal information to train its AI models. And in 2024, two writers sued the company, representing writers whose copyrighted work they allege had been "pilfered by" OpenAI and partner Microsoft. But these lawsuits aren't solely confined to the ChatGPT maker. Anthropic, Perplexity, and nearly every other major AI company have all faced lawsuits alleging some form of copyright infringement. The lawsuit argues that OpenAI's use of their content could produce a positive feedback loop in which declining advertising and subscription revenue leads to lower-quality content, which in turn further reduces revenue. "Less content of poorer quality will further result in reduced revenue, and thus less spending on content creation," the complaint alleges, "spawning even less content of even poorer quality and even less revenue, and so on in a downward spiral for content creators like Plaintiffs." 
The lawsuit comes after the plaintiffs reached out to OpenAI in November 2024 to discuss a potential licensing agreement, which OpenAI rebuffed, according to the complaint. The plaintiffs seek to hold OpenAI accountable for the substantial harm and "illicit profits" it is generating from allegedly infringing on their copyrighted material. The lawsuit alleges OpenAI is also in violation of the Lanham Act (which covers trademark registration) when ChatGPT makes up or hallucinates content and falsely attributes information to the plaintiffs. They're asking the court for a permanent injunction to stop OpenAI from continuing to use their material. "ChatGPT helps enhance human creativity, advance scientific discovery and medical research, and enable hundreds of millions of people to improve their daily lives," a spokesperson for OpenAI said in a statement to Fortune. They added that their AI models "empower innovation and are trained on publicly available data and grounded in fair use."
The alleged plagiarism of "plagiarize" and the Hamilton-Burr duel
But Merriam-Webster and Encyclopedia Britannica allege ChatGPT plagiarizes the information their human researchers, writers, and editors produce. In an apt example, the complaint describes a prompt asking "How does Merriam-Webster define plagiarize?" to which the model reportedly responded with a definition identical to the one found in the Merriam-Webster dictionary. The complaint adds that the dictionary has been registered with the U.S. Copyright Office. That alleged plagiarism extends beyond copying dictionary definitions. The complaint outlines questions about specific historical events, for example, to show how the AI mimics the publishers' unique selection and curation of content.
When a user asked ChatGPT for "10 Things You Need to Know About the Hamilton-Burr Duel, According to Hamilton's Burr," ChatGPT reportedly reproduced an identical specific selection and ordering of quotes found in a copyrighted Britannica article, including the exact snippets curated by Britannica's editors. The model also noted that Britannica had fact-checked the article. The plaintiffs ultimately argue that these practices threaten to seriously undermine their longstanding business models. "OpenAI imperils the very market for the high-quality content that it copies and reproduces."
[12]
OpenAI sued by Merriam-Webster for copyright infringement
Encyclopedia Britannica and its subsidiary Merriam-Webster have filed suit against OpenAI, alleging that the ChatGPT maker copied their copyrighted content without authorization to train its large language models. The lawsuit, filed in Manhattan federal court last week, alleges that OpenAI used close to 100,000 Britannica articles to train its models, and that ChatGPT responses frequently reproduce or closely paraphrase Britannica's reference content, including encyclopedia articles and dictionary entries. The complaint also alleges OpenAI uses a retrieval-augmented generation system to pull from Britannica's content in real time when generating responses. The plaintiffs argue that ChatGPT substitutes for visits to their websites, depriving them of the subscription and advertising revenue that funds their content creation. The trademark claims center on two alleged harms: ChatGPT presenting invented content under Britannica's name, and displaying incomplete reproductions of Britannica material in ways that suggest the company's endorsement. The plaintiffs are seeking monetary damages in an amount to be determined, along with injunctive relief to halt the alleged violations, according to Reuters. OpenAI disputed the claims. "Our models empower innovation, and are trained on publicly available data and grounded in fair use," a company spokesperson said, according to Reuters. A separate Britannica case against AI search company Perplexity AI, filed last year, is also moving through the courts. The Britannica and Merriam-Webster lawsuit is among a growing number of copyright suits brought by publishers, authors, and news organizations against AI companies over the use of their material in model training.
[13]
ChatGPT's latest enemy is the world's best dictionary and encyclopedia
The company that taught us everything wants OpenAI to pay for the lesson. It is widely acknowledged that AI companies use web articles to train their models without compensating creators or obtaining permission. Publishers such as The New York Times, the Chicago Tribune, and the Toronto Star have already filed lawsuits against this practice. Now, another prominent organization has joined the legal proceedings. TechCrunch has reported that Encyclopedia Britannica and its subsidiary Merriam-Webster have filed a lawsuit against OpenAI, alleging that the AI giant committed "massive copyright infringement" by scraping and using nearly 100,000 of its online articles to train its LLMs without permission.
What's this lawsuit about?
Britannica claims that ChatGPT generates responses that substitute for its content, reducing web traffic and potential revenue. If users can ask ChatGPT a question and receive an answer based on Britannica's articles, there may be less incentive to visit the website directly. The complaint also targets OpenAI's use of Britannica content in ChatGPT's RAG workflow, a process where the AI scans the web for updated information when answering questions, alleging that the AI reproduces its content, in full or in part, when answering questions. Additionally, Britannica alleges that OpenAI is violating trademark law. The company has argued that ChatGPT hallucinates information and then falsely attributes it to the publisher. According to Britannica, ChatGPT's hallucinations jeopardize "the public's continued access to high-quality and trustworthy online information."
What's going to happen next?
That's the big question. There is no strong legal precedent establishing whether training an AI on copyrighted content constitutes copyright infringement. Anyone can tell you that it's not right to use someone else's work to train your model, but the law around it is murky at best.
In a recent case involving Anthropic, a federal judge ruled that using copyrighted content as training data was transformative enough to be legal. However, the same judge found that Anthropic had illegally downloaded millions of books, resulting in a $1.5 billion settlement with affected writers. As this issue continues to evolve, lawmakers have significant ground to cover. The outcome of these cases will likely shape how AI companies can legally use web content in the future.
[14]
Encyclopedia Britannica is the latest giant to sue OpenAI
Encyclopedia Britannica is suing OpenAI for allegedly misusing its reference materials to train its artificial intelligence (AI) models. The Chicago-based Britannica Group runs Britannica.com and Merriam-webster.com, the online version of the Merriam-Webster dictionary. Creator of the 250-year-old Encyclopaedia Britannica, the company ended its print edition in 2012, survived Wikipedia, and has since focused on educational software and digital growth, including selling artificial intelligence agent software, according to The New York Times. Britannica had acquired Melingo AI in 2000, which offers "AI-powered solutions and natural-language processing" in multiple languages by leveraging artificial intelligence and computational linguistics, according to Britannica's website. The Britannica Group alleges OpenAI -- which is backed by Microsoft -- used information from its encyclopedia and dictionary to train its AI chatbot ChatGPT. The problem is, OpenAI now automatically generates AI summaries of that content on its own platform, which is resulting in Encyclopedia Britannica and Merriam-Webster's own web traffic plummeting.
[15]
Encyclopedia Britannica Sues OpenAI Over AI Training
March 16 (Reuters) - Encyclopedia Britannica and its Merriam-Webster subsidiary have sued OpenAI in Manhattan federal court for allegedly misusing their reference materials to train its artificial intelligence models. Britannica said in the complaint filed on Friday that Microsoft-backed OpenAI used its online articles and encyclopedia and dictionary entries to teach its flagship chatbot ChatGPT to respond to human prompts and "cannibalized" Britannica's web traffic with AI-generated summaries of its content. Spokespeople for the companies did not immediately respond to requests for comment on the complaint on Monday. The case is one of many high-stakes lawsuits filed by copyright owners including authors and news outlets against tech companies for using their material to train AI systems without permission. Britannica filed a related lawsuit against artificial intelligence startup Perplexity AI last year that is still ongoing. AI companies have argued that their systems make fair use of copyrighted content by transforming it into something new. Britannica's lawsuit said that OpenAI unlawfully copied nearly 100,000 of its articles to train GPT large language models. The complaint said that ChatGPT produces "near-verbatim" copies of Britannica's encyclopedia entries, dictionary definitions and other content, diverting users who would otherwise visit its websites. Britannica also accused OpenAI of infringing its trademarks by implying that it has permission to reproduce its material and wrongfully citing Britannica in false AI "hallucinations." Britannica requested an unspecified amount of monetary damages and a court order blocking the alleged infringement. (Reporting by Blake Brittain in Washington; Editing by Kirsten Donovan)
[16]
Encyclopedia Britannica sues OpenAI over AI training
Encyclopedia Britannica and Merriam-Webster have sued OpenAI. They allege OpenAI used their reference materials without permission to train AI models like ChatGPT. Britannica claims OpenAI copied nearly 100,000 articles. The lawsuit states AI-generated summaries are diverting users from Britannica's websites. Britannica seeks monetary damages and a court order to stop the alleged infringement. Encyclopedia Britannica and its Merriam-Webster subsidiary have sued OpenAI in Manhattan federal court for allegedly misusing their reference materials to train its artificial intelligence models. Britannica said in the complaint filed on Friday that Microsoft-backed OpenAI used its online articles and encyclopedia and dictionary entries to teach its flagship chatbot ChatGPT to respond to human prompts and "cannibalized" Britannica's web traffic with AI-generated summaries of its content. Spokespeople for the companies did not immediately respond to requests for comment on the complaint on Monday. The case is one of many high-stakes lawsuits filed by copyright owners including authors and news outlets against tech companies for using their material to train AI systems without permission. Britannica filed a related lawsuit against artificial intelligence startup Perplexity AI last year that is still ongoing. AI companies have argued that their systems make fair use of copyrighted content by transforming it into something new. Britannica's lawsuit said that OpenAI unlawfully copied nearly 100,000 of its articles to train GPT large language models. The complaint said that ChatGPT produces "near-verbatim" copies of Britannica's encyclopedia entries, dictionary definitions and other content, diverting users who would otherwise visit its websites. Britannica also accused OpenAI of infringing its trademarks by implying that it has permission to reproduce its material and wrongfully citing Britannica in false AI "hallucinations." 
Britannica requested an unspecified amount of monetary damages and a court order blocking the alleged infringement.
[17]
Encyclopedia Britannica Sues OpenAI Over Alleged Copyright Infringement | PYMNTS.com
Britannica also alleges that OpenAI violated trademark law by making up information and falsely attributing it to the publisher, according to the report. The company alleges in its lawsuit that with these practices, OpenAI "starves web publishers like [Britannica] of revenue" and jeopardizes "the public's continued access to high-quality and trustworthy online information," per the report. Reuters also reported on the lawsuit Monday, saying Britannica seeks monetary damages and a court order blocking the alleged infringement. OpenAI did not immediately reply to PYMNTS' request for comment. In a statement provided to Reuters, an OpenAI spokesperson said, per the report: "Our models empower innovation, and are trained on publicly available data and grounded in fair use." It was reported in September that Encyclopedia Britannica and Merriam-Webster sued Perplexity, alleging that it misused their copyrighted content in its AI-driven "answer engine." That complaint alleges that Perplexity copied articles from Britannica and Merriam-Webster and reduced their online traffic by directing users to AI-generated summaries instead of original sources. In a September press release announcing the lawsuit against Perplexity, Jorge Cauz, who was CEO of Britannica Group at the time, said: "As demonstrated by today's action, we will take all steps necessary to protect our data and intellectual capital so we can continue to offer innovative digital instructional and informational solutions that enhance student learning outcomes, assist educators in their teaching journeys, and inform and delight learners of all ages." There have been several other lawsuits alleging that AI companies committed copyright infringement.
It was reported in August that Anthropic settled a copyright infringement lawsuit brought by a group of U.S. authors. In that case, the authors alleged that the AI company used pirated books without permission to train its AI assistant. In February 2025, a group of news publishers sued Cohere for copyright infringement, alleging that the company improperly used copyrighted works to train its large language model and displayed large portions of articles while bypassing visits to the publishers' websites.
[18]
Encyclopaedia Britannica Sues OpenAI: Says Its Hallucinations is Depriving People of High-Quality Content
What is more noteworthy than merely calling out illegal scraping is that the plaintiff called out OpenAI's hallucinations as a cause for people not trusting high-quality content. It has barely been three months since the owners of the non-profit Wikipedia first called out big tech AI companies for illegally scraping its content and then announced partnerships with many of them. Now, it is the turn of the Encyclopaedia Britannica and Merriam-Webster to bring a lawsuit against OpenAI alleging "massive copyright infringement." The problem appears to be some 100,000 online articles to which Britannica holds the copyright and which have been scraped by OpenAI for training its LLMs without due permission. This is one of many lawsuits that AI companies have been battling since ChatGPT first arrived on the scene in January 2023. Several publishers have since struck financial deals with these AI giants. Just so that we are clear, Wikipedia did deals on January 15 this year with everyone in the AI circuit including Google, Amazon, Meta, Microsoft, Mistral AI and Perplexity. What stood out was that it didn't announce any deal with OpenAI. Now Britannica is also accusing OpenAI of violating copyright laws when it generates outputs containing either full or partial verbatim reproduction of its content as well as when the AI lab uses its articles in ChatGPT's retrieval augmented generation (RAG) workflow. The plaintiffs are upset that the AI startup is directly competing with them by starving them of users. "ChatGPT starves web publishers like [Britannica] of revenue by generating responses to users' queries that substitute, and directly compete with, the content from publishers like [Britannica]," says the lawsuit. However, what is more noteworthy is that Britannica says ChatGPT's hallucinations jeopardize "the public's continued access to high-quality and trustworthy online information." Britannica has also filed a similar lawsuit against Perplexity, which is still pending.
In its filing, Britannica alleges that Perplexity has engaged in the widespread, knowing, and illegal use of Britannica's trusted and human-verified content, as well as its historic brands, oftentimes verbatim, without Britannica's consent or reference. Just so that readers are aware, OpenAI's RAG tool is used to scan the web or other databases for newly updated information when it responds to a query. In fact, Britannica alleges that the AI giant also violates the Lanham Act (a trademark statute) when it delivers made-up hallucinations and attributes them falsely to one publisher or the other. While several publications including The New York Times have taken these AI companies to court over unauthorised scraping from their websites, their counterparts in India have kept themselves away from getting into legal tangles for a variety of reasons, some of which we had explained in one of our earlier stories.
[19]
OpenAI Faces Lawsuit Over ChatGPT Content Use Claims
Encyclopædia Britannica and Merriam-Webster have sued OpenAI in a federal court in New York, accusing the company of copying their content without permission to build and run ChatGPT. The publishers say OpenAI used their copyrighted material to train its AI models, including ChatGPT's retrieval-augmented generation (RAG) system, which pulls in external information when generating responses. The lawsuit, filed in the Southern District of New York on March 13, 2026, says ChatGPT sometimes produces answers that reproduce the Britannica and Merriam-Webster content verbatim, nearly verbatim, or in summarized form. The publishers also allege that ChatGPT copies the selection and ordering of material from Britannica articles and lists. Impact on publishers' business: Britannica and Merriam-Webster argue that these AI-generated answers act as substitutes for their own websites. According to the complaint, this can reduce traffic to their platforms and hurt their subscription and advertising revenue. The publishers say their business depends on users visiting their sites for trusted, fact-checked information created by human writers, editors, and researchers. The complaint also says OpenAI's systems may use the Britannica material twice: first in training large language models, and again through RAG systems that retrieve outside content to improve answers. The publishers argue that this is not just a training-data dispute, but also a case about how AI systems generate responses for users in real time. Trademark and attribution concerns: The lawsuit also includes trademark claims. Britannica and Merriam-Webster accuse OpenAI of violating the Lanham Act by attaching their names to false, incomplete, or misleading AI-generated answers. The complaint cites cases where ChatGPT produces hallucinations and attributes them to Britannica or leaves out parts of Britannica content without clearly telling users that the text is incomplete. 
According to the complaint, this could confuse users and damage the publishers' reputation. Licensing and wider legal context: The publishers say OpenAI ignored a potential licensing agreement. The complaint states that Britannica contacted OpenAI in November 2024 to discuss licensing, but OpenAI did not pursue the deal seriously. It also notes that OpenAI has signed licensing agreements with other publishers. Britannica and Merriam-Webster are asking the court for damages, OpenAI's profits linked to the alleged misconduct, legal costs, and an order permanently stopping the company from the conduct described in the suit. They have also asked for a jury trial. The case adds to the growing list of lawsuits against OpenAI from publishers and authors over how AI systems are trained and how they reproduce copyrighted material. Context: The case comes amid ongoing legal debates in the US on whether using copyrighted material to train AI models counts as "fair use." In June 2025, a US court held that training AI models on legally purchased books could qualify as fair use, but said using pirated content was infringing. That case did not address whether AI outputs reproduce copyrighted material or compete with original publishers, issues central to the current lawsuit.
[20]
Encyclopedia Britannica sues OpenAI over AI training
March 16 (Reuters) - Encyclopedia Britannica and its Merriam-Webster subsidiary have sued OpenAI in Manhattan federal court for allegedly misusing their reference materials to train its artificial intelligence models. Britannica said in the complaint filed on Friday that Microsoft-backed OpenAI used its online articles and encyclopedia and dictionary entries to teach its flagship chatbot ChatGPT to respond to human prompts and "cannibalized" Britannica's web traffic with AI-generated summaries of its content. Spokespeople for the companies did not immediately respond to requests for comment on the complaint on Monday. The case is one of many high-stakes lawsuits filed by copyright owners including authors and news outlets against tech companies for using their material to train AI systems without permission. Britannica filed a related lawsuit against artificial intelligence startup Perplexity AI last year that is still ongoing. AI companies have argued that their systems make fair use of copyrighted content by transforming it into something new. Britannica's lawsuit said that OpenAI unlawfully copied nearly 100,000 of its articles to train GPT large language models. The complaint said that ChatGPT produces "near-verbatim" copies of Britannica's encyclopedia entries, dictionary definitions and other content, diverting users who would otherwise visit its websites. Britannica also accused OpenAI of infringing its trademarks by implying that it has permission to reproduce its material and wrongfully citing Britannica in false AI "hallucinations." Britannica requested an unspecified amount of monetary damages and a court order blocking the alleged infringement. (Reporting by Blake Brittain in Washington; Editing by Kirsten Donovan)
Encyclopedia Britannica and Merriam-Webster have filed a lawsuit against OpenAI, alleging the AI company scraped nearly 100,000 copyrighted articles without permission to train its large language models. The publishers claim ChatGPT generates verbatim reproductions of their content, cannibalizes web traffic, and falsely attributes AI-generated hallucinations to Britannica, threatening the public's access to trustworthy information.
Encyclopedia Britannica and its subsidiary Merriam-Webster have launched legal action against OpenAI in Manhattan federal court, alleging massive copyright infringement in what marks another significant challenge to AI companies' training practices [1]. The lawsuit claims that OpenAI scraped nearly 100,000 copyrighted articles from the publisher's online encyclopedia and dictionary entries to train AI models, specifically its large language models (LLMs) powering ChatGPT, without authorization or compensation [5]. Britannica, which retains copyright to these extensive reference materials, argues that using copyrighted content in this manner constitutes unlawful appropriation of its intellectual property [1].
The complaint includes side-by-side examples demonstrating how ChatGPT generates responses containing "full or partial verbatim reproductions" of Britannica's content, with entire passages appearing to match word for word [3]. According to the lawsuit, "ChatGPT then provides narrative responses to user queries that often contain verbatim or near-verbatim reproductions, summaries or abridgements of original content, including [Britannica's] copyrighted works" [2]. The publishers also accuse OpenAI of copyright violations when it uses their articles in ChatGPT's Retrieval Augmented Generation (RAG) workflow, which scans the web or other databases for newly updated information when responding to queries [1].
Britannica alleges that ChatGPT is cannibalizing web traffic by generating responses that "substitute, and directly compete with" the publisher's content rather than directing users to its website as traditional search engines would [3]. "ChatGPT starves web publishers like [Britannica] of revenue by generating responses to users' queries that substitute, and directly compete with, the content from publishers like [Britannica]," the lawsuit states [1]. The complaint emphasizes that OpenAI reproduces "web publishers' copyrighted content without authorization or remuneration," thereby limiting the traffic and revenue Britannica generates from its online properties [4].
Beyond copyright infringement, Britannica accuses OpenAI of trademark infringement under the Lanham Act when ChatGPT produces AI-generated hallucinations and falsely attributes them to the publisher [1]. According to the complaint, ChatGPT often omits portions of Britannica's explanations and wrongly attributes incomplete and inaccurate responses to the publisher [4]. Use of Britannica's trademarks in this manner "deceives users into believing that the hallucinations and/or undisclosed omissions" are approved by the publisher, the complaint adds [4]. Britannica warns that these hallucinations jeopardize "the public's continued access to high-quality and trustworthy online information" [1].
Britannica joins numerous publishers and writers pursuing legal action against OpenAI over copyright issues. The New York Times, Ziff Davis (owner of Mashable, CNET, IGN, PC Mag, and others), and more than a dozen newspapers across the US and Canada, including the Chicago Tribune, the Denver Post, the Sun-Sentinel, the Toronto Star, and the Canadian Broadcasting Corporation have sued OpenAI [1]. A similar lawsuit Britannica filed against Perplexity remains pending [1]. In September, Anthropic settled a class action lawsuit for using copyrighted books to train AI models, resulting in a $1.5 billion payout to authors [3].
OpenAI maintains its position that using publicly available data to train AI models falls under fair use. An OpenAI spokesperson told CNET: "Our models empower innovation, and are trained on publicly available data and grounded in fair use" [2]. AI companies have consistently argued that their systems make fair use of copyrighted content by transforming it into something new [5]. However, strong legal precedent establishing whether using copyrighted content as training data constitutes infringement remains limited. In one notable case, Anthropic convinced federal judge William Alsup that using content as training data is transformative enough to be legal, though Alsup ruled Anthropic violated the law by illegally downloading millions of books rather than paying for them, warranting the $1.5 billion class action settlement [1]. Britannica seeks damages, restitution of profits, and an injunction blocking OpenAI's alleged unlawful activities [4].
Summarized by
Navi