Salesforce Sued by Authors Over Alleged Copyright Infringement in AI Training Data

Authors File Class Action Against Salesforce for Copyright Infringement in AI Training

Cloud software giant Salesforce (CRM.N) is facing a proposed class action lawsuit in which authors Molly Tanzer and Jennifer Gilmore allege that the company used thousands of copyrighted books without authorization to train its artificial intelligence software [1]. Filed in San Francisco federal court, the lawsuit asserts that Salesforce infringed the authors' rights under the Copyright Act, intensifying the ongoing debate over intellectual property rights and AI development [2].

Source: Reuters

Allegations of Unauthorized Content Use

Tanzer and Gilmore claim that Salesforce used pirated books, including their own works, to train its xGen AI models, which are designed for advanced language processing. The complaint specifically points to the use of datasets such as RedPajama and The Pile, which reportedly include a book corpus known as Books3. This collection, comprising over 196,000 books, is alleged to have been illicitly copied from the private tracker Bibliotik [2].

Salesforce's Shifting Stance on Training Data

According to the lawsuit, Salesforce acknowledged using "RedPajama-Books" as a training source when it launched xGen in June 2023. By September, however, the company had purportedly removed these explicit references from its website, replacing them with vaguer descriptions of "natural language data" drawn from "publicly available sources" [2].

Source: Seeking Alpha

CEO's Past Remarks Surface in Complaint

Adding an ironic twist, the lawsuit cites earlier statements by Salesforce CEO Marc Benioff. In a January 2024 Bloomberg interview, Benioff criticized AI companies' data practices, remarking that "all the training data has been stolen" and suggesting that compensating content creators would be "very easy to do" [2]. These comments are now being used against the company in the legal proceedings.

Wider Industry Implications and Precedent

This legal action is not isolated; it is part of a growing wave of lawsuits targeting technology companies over their use of copyrighted material for AI training. Similar cases have been brought against industry giants like OpenAI, Microsoft, and Meta Platforms [1]. In August, Anthropic reached a $1.5 billion settlement with a group of authors in a comparable copyright infringement dispute, setting a potential precedent for future cases [3].

Legal Experts Weigh In

Ishita Sharma, managing partner at Fathom Legal, commented on the challenges authors face, stating that they must "prove real financial harm, not just that their books were used for training." While some recent rulings have favored AI companies because authors could not demonstrate market harm, Sharma emphasized that "using public datasets like RedPajama or The Pile doesn't automatically erase willful infringement" [2].

Potential Outcomes and Demands

If the class action succeeds, Salesforce could face substantial financial penalties or be ordered to compensate authors. The plaintiffs are seeking class certification for all U.S. copyright holders whose works have been used since October 2022. Their demands include statutory damages, the destruction of infringing copies, profit disgorgement, a declaration of willful infringement, and attorneys' fees [2].

Salesforce has so far declined to comment on the lawsuit [5]. The case further highlights the need for clear legal frameworks and ethical guidelines to navigate the intersection of AI advancement and intellectual property rights.

References

[1] Salesforce sued by authors over artificial intelligence software

[2] Salesforce Faces Class Action Over Alleged Illegal AI Training Data - Decrypt

[3] Salesforce sued by authors over artificial intelligence software

[4] Salesforce faces copyright lawsuit from authors over training AI models

[5] Salesforce sued by authors over artificial intelligence software
