Meta Faces Legal Challenges Over Alleged Use of Pirated Books for AI Training

11 Sources

Share

Meta is embroiled in a lawsuit alleging the company used pirated books to train its AI models, including Llama. Internal communications reveal ethical concerns and attempts to conceal the practice.

Meta's AI Training Practices Under Scrutiny

Meta, the parent company of Facebook, is facing a class-action lawsuit over allegations that it used pirated books to train its artificial intelligence models, including the popular Llama series. Court filings and internal communications have revealed that Meta allegedly downloaded and used vast amounts of copyrighted material without proper authorization

1

.

Alleged Copyright Infringement

According to the lawsuit, Meta is accused of downloading nearly 82 terabytes of pirated books from shadow libraries such as Anna's Archive, Z-Library, and LibGen

2

. The plaintiffs, including bestselling authors Sarah Silverman and Ta-Nehisi Coates, allege that Meta infringed upon their copyrights and potentially harmed their livelihoods

3

.

Internal Concerns and Concealment Attempts

Unsealed court documents reveal that some Meta employees raised ethical concerns as early as 2022. One researcher explicitly stated, "I don't think we should use pirated material," while another employee commented that "torrenting from a corporate laptop doesn't feel right"

4

.

Despite these internal warnings, Meta allegedly took steps to conceal its activities. Employees discussed ways to prevent Meta's infrastructure from being directly linked to the downloads, including using servers outside of Facebook's main network in what was referred to as "stealth mode"

5

.

Meta's Response and Legal Defense

Meta has defended its practices by invoking the "fair use" doctrine, asserting that using publicly available materials to train AI tools is legal in certain cases. The company argues that it uses text to statistically model language and generate original expression

3

.

Broader Implications for AI Industry

This case is part of a larger trend of legal challenges against tech companies developing AI technologies. OpenAI and Nvidia have also faced similar accusations regarding their use of copyrighted materials for AI training

2

.

Ongoing Legal Proceedings

U.S. District Judge Vince Chhabria has dismissed some claims but allowed the authors to amend their complaint to include new allegations, including those related to the removal of copyright management information

3

. The outcome of this lawsuit could have significant implications for the tech industry, particularly concerning the use of copyrighted materials in AI training.

TheOutpost.ai

Your Daily Dose of Curated AI News

Don’t drown in AI news. We cut through the noise - filtering, ranking and summarizing the most important AI news, breakthroughs and research daily. Spend less time searching for the latest in AI and get straight to action.

© 2025 Triveous Technologies Private Limited
Instagram logo
LinkedIn logo