Google AI Lawsuit: Publishers Join Copyright Fight

Publishers Join Lawsuit Against Google Over AI Training Practices

Hachette Book Group and Cengage Group filed a motion Thursday in California federal court seeking permission to intervene in an existing class-action lawsuit against Google, alleging the tech company "engaged in one of the most prolific infringements of copyrighted materials in history" to build its AI capabilities1

. The publishers claim Google copied content from Hachette books and Cengage textbooks without permission to train its Gemini large language model, escalating a legal battle that could significantly increase potential damages at stake3

Source: Decrypt

Maria Pallante, CEO of the Association of American Publishers, stated, "We believe our participation will bolster the case, especially because publishers are uniquely positioned to address many of the legal, factual, and evidentiary questions before the Court"1

. The motion to intervene in this Google AI lawsuit marks a critical expansion of copyright infringement claims originally filed in 2023 by visual artists and individual authors.

Evidence of Training AI with Copyrighted Books from Piracy Sites

The complaint alleges Google deliberately chose to steal content rather than obtain proper licenses, engaging in copyright infringement "at every stage" of development2

. According to the publishers, Google downloaded books from pirate sites and repeatedly copied them during AI training—first into computer memory, then into formats AI systems could read, and again into training sets for each new model version.

Source: ET

Google's C4 training dataset allegedly contains copyrighted works scraped from Z-Library, a pirate collection from which authorities have seized more than 350 websites and web domains2

. The dataset pulls from at least 28 piracy-linked websites identified by the U.S. government as markets for piracy and counterfeits. Books were copied from b-ok.org, a Z-Library domain now displaying a federal seizure notice, along with OceanofPDF and WeLib, "another prolific site with access to troves of unauthorized copyrighted content"2

The copyright symbol (©) appears more than 200 million times in the C4 dataset, the complaint notes, while Google allegedly excluded "policy notices" and "terms of use" warnings but included "vast categories of copyrighted works, pirated works, and works taken from behind paywalls"2

. The publishers also allege Google copied works from subscription-based libraries like Scribd.com, circumventing legitimate licensing agreements.

Specific Examples and Damages Sought in Historic Copyright Infringement Case

The publishers cited 10 examples of their textbooks and other books that Google allegedly misused from authors, including Scott Turow and N.K. Jemisin, to train its Gemini platform1

. They seek statutory damages, injunctions to halt further infringement, and an order requiring Google to destroy all unauthorized copies of their works and disclose which books were used to train Gemini2

. The publishers requested an unspecified amount of monetary damages on behalf of themselves and a larger class of authors and publishers4

The lawsuit alleges Gemini now produces outputs that "substitute for copyrighted works," including verbatim reproductions, detailed summaries, and "knockoffs that copy creative elements of original works"2

. U.S. District Judge Eumi Lee will decide whether to approve the publishers' request to join the case3

Broader Implications for Intellectual Property Law and AI Development

This case represents one of many high-stakes lawsuits brought by artists, authors, music labels, and other copyright owners against tech companies over their AI training practices1

. Anthropic settled a lawsuit for $1.5 billion last year with a group of authors over its use of their work to train its AI chatbot Claude, signaling the substantial financial exposure tech companies face in these intellectual property law disputes3

Recent federal court rulings have delivered partial victories to Meta and Anthropic, with judges ruling that their use of copyrighted books to train models constituted fair use under copyright law, though courts criticized the companies for maintaining permanent libraries of pirated books2

. The outcome of this expanded class-action lawsuit against Google could establish critical precedents for how AI companies must handle copyrighted material, potentially forcing fundamental changes to AI training methodologies and licensing practices across the industry. Google spokespeople did not immediately respond to requests for comment on the publishers' bid1

Major publishers join Google AI lawsuit alleging massive copyright infringement in Gemini training

Publishers Join Lawsuit Against Google Over AI Training Practices

Evidence of Training AI with Copyrighted Books from Piracy Sites

Specific Examples and Damages Sought in Historic Copyright Infringement Case

Broader Implications for Intellectual Property Law and AI Development

References

Publishers seek to join lawsuit against Google over AI training

Book Publishers Seek Entry Into Google AI Copyright Fight - Decrypt

Publishers seek to join lawsuit against Google over AI training - The Economic Times

Publishers seek to join lawsuit against Google over AI training

Related Stories

John Carreyrou Sues Six AI Companies Over Unauthorized Use of Copyrighted Books for Training

Anthropic Faces Class-Action Lawsuit Over Alleged Copyright Infringement in AI Training

Penske Media Sues Google Over AI-Generated Summaries in Search Results

Recent Highlights

Google releases Gemma 4 with Apache 2.0 license, enabling unrestricted local AI on devices

AI Models Lie, Cheat, and Defy Human Instructions to Protect Other AI Models From Deletion

Anthropic discovers emotion-like patterns in Claude that actively shape AI behavior and decisions

Recent Highlights

Today's Top Stories

OpenAI insiders compiled secret memos questioning Sam Altman's trustworthiness to lead AI

Google quietly launches free offline AI dictation app that polishes speech without subscriptions

OpenAI urges California and Delaware to investigate Elon Musk for anti-competitive behavior

Anthropic secures massive Google TPU deal as revenue run rate soars to $30 billion