Meta Defends Torrenting Practices in AI Training Dataset Lawsuit

Curated by THEOUTPOST

On Fri, 21 Feb, 4:04 PM UTC

2 Sources

Meta claims it didn't seed pirated books used for AI training, sparking debate on copyright infringement and data acquisition methods in AI development.

Meta's Defense in Copyright Infringement Lawsuit

Meta, the social media giant, is embroiled in a legal battle over its use of pirated books to train its AI models. In a recent court filing, Meta defended its actions by claiming that while it did torrent a dataset of pirated books, it took precautions not to "seed" any downloaded files 1.

The Torrenting Controversy

Meta admitted to torrenting an 82 TB dataset of pirated, copyrighted material from shadow libraries to train its Llama AI models. However, the company insists that there is no evidence of "seeding," the act of sharing a torrented file with other users after the download completes 2.

Legal Implications and Meta's Defense

The lawsuit, filed by authors including Richard Kadrey, Sarah Silverman, and Ta-Nehisi Coates, alleges that Meta unlawfully copied and distributed their works through AI outputs. Meta's defense hinges on the lack of evidence of seeding, arguing that downloading copyrighted content isn't illegal, but distribution is 1.

Contradictory Evidence

Despite Meta's claims, two pieces of evidence could undermine its defense:

  1. Michael Clark, a Meta executive, testified that configuration settings were modified "so that the smallest amount of seeding possible could occur" 2.
  2. An internal message from Meta researcher Frank Zhang suggested attempts to conceal potential seeding from Meta's servers 1.

Legal Complexities

Meta is attempting to dismiss the authors' claim under California's Comprehensive Computer Data Access and Fraud Act (CDAFA), arguing it's preempted by copyright law. The authors contend that Meta's "decision to bypass lawful acquisition methods" constitutes a separate CDAFA violation 1.

Broader Implications for AI and Copyright

This case highlights the ongoing tension between AI development and copyright law. Similar lawsuits have been filed against other AI companies, including OpenAI and Microsoft, over the use of copyrighted material for training large language models 2.

Industry Impact

The outcome of this case could have far-reaching implications for the AI industry, potentially setting precedents for how companies can legally acquire and use data for AI training. It also raises questions about the ethics of using pirated material for technological advancement 1 2.

Next Steps

The court has not yet issued a final decision. Meta is expected to fight the seeding claims at summary judgment, and any decision is likely to face appeals, suggesting a long legal process ahead 1 2.

© 2025 TheOutpost.AI All rights reserved