Meta Defends Torrenting Practices in AI Training Dataset Lawsuit

Meta's Defense in Copyright Infringement Lawsuit

Meta, the social media giant, is embroiled in a legal battle over its use of pirated books to train its AI models. In a recent court filing, Meta defended its actions, claiming that although it torrented a dataset of pirated books, it took precautions not to "seed" any of the downloaded files [1].

The Torrenting Controversy

Meta admitted to torrenting an 82 TB dataset of pirated, copyrighted material from shadow libraries to train its LLaMA AI models. However, the company insists that there is no evidence of "seeding," the act of sharing a torrented file with other users after the download completes [2].

Legal Implications and Meta's Defense

The lawsuit, filed by authors including Richard Kadrey, Sarah Silverman, and Ta-Nehisi Coates, alleges that Meta unlawfully copied and distributed their works through AI outputs. Meta's defense hinges on the lack of evidence of seeding: the company argues that downloading copyrighted content is not illegal, whereas distributing it is [1].

Contradictory Evidence

Despite Meta's claims, some testimony and internal communications may undercut its defense:

Michael Clark, a Meta executive, testified that configuration settings were modified "so that the smallest amount of seeding possible could occur" [2].
An internal message from Meta researcher Frank Zhang suggested attempts to conceal potential seeding from Meta's servers [1].

Legal Complexities

Meta is seeking to dismiss the authors' claim under California's Comprehensive Computer Data Access and Fraud Act (CDAFA), arguing that it is preempted by copyright law. The authors contend that Meta's "decision to bypass lawful acquisition methods" constitutes a separate CDAFA violation [1].

Broader Implications for AI and Copyright

This case highlights the ongoing tension between AI development and copyright law. Similar lawsuits have been filed against other AI companies, including OpenAI and Microsoft, over the use of copyrighted material for training large language models [2].

Industry Impact

The outcome of this case could have far-reaching implications for the AI industry, potentially setting precedents for how companies can legally acquire and use data for AI training. It also raises questions about the ethics of using pirated material for technological advancement [1].

Next Steps

The court battle continues, and no final decision has been made. Meta is expected to fight the seeding claims at summary judgment, and any ruling is likely to be appealed, which points to a long legal process ahead [1].

References

[1] Meta defends its vast book torrenting: We're just a leech, no proof of seeding

[2] Meta defends using pirated material, claims it's legal if you don't seed content
