Zuckerberg's YouTube Defense in Meta's AI Copyright Battle Sparks Debate

Meta CEO Mark Zuckerberg defends the use of copyrighted e-books to train AI models, comparing it to YouTube's content moderation challenges. The case raises questions about fair use in AI development.

Meta's AI Copyright Controversy

In a high-profile lawsuit, Meta faces allegations of using copyrighted materials to train its AI models without authorization. The case, Kadrey v. Meta, includes bestselling authors Sarah Silverman and Ta-Nehisi Coates among the plaintiffs challenging the tech giant's AI development practices 1.

Zuckerberg's YouTube Defense

During a deposition, Meta CEO Mark Zuckerberg drew a controversial parallel between Meta's use of copyrighted e-books and YouTube's content moderation challenges. He argued that, just as YouTube may temporarily host pirated content before removing it, it is not always reasonable to avoid an entire dataset for AI training simply because some of its contents may be copyrighted 2.

Zuckerberg stated, "So would I want to have a policy against people using YouTube because some of the content may be copyrighted? No. There are cases where having such a blanket ban might not be the right thing to do" 2.

The LibGen Controversy

At the heart of the lawsuit is Meta's alleged use of LibGen, a controversial "links aggregator" providing access to copyrighted works. Court filings suggest that Zuckerberg approved the use of LibGen for training Meta's Llama AI models, despite internal concerns about legal implications 3.

Allegations of Concealment

Plaintiffs' counsel alleges that Meta attempted to conceal its use of copyrighted materials. According to the filings, Meta engineer Nikolay Bashlykov wrote a script to remove copyright information from e-books in LibGen. The company is also accused of stripping copyright markers and source metadata from science journal articles in the training data 4.

Torrenting and Further Copyright Concerns

The lawsuit also claims that Meta torrented the LibGen dataset, potentially engaging in a further form of copyright infringement, since BitTorrent users typically upload (seed) files to others while downloading them. This decision allegedly raised concerns among some Meta research engineers 4.

Meta's Defense and Fair Use Argument

Meta's primary defense rests on the fair use doctrine, arguing that using text to statistically model language and generate original expression is a permissible, transformative use of copyrighted material. However, the recently unsealed documents appear to complicate this argument 5.

Broader Implications for AI Development

This case is part of a larger debate surrounding AI companies' use of copyrighted works for training. The outcome could set a precedent for how fair use is interpreted in the context of AI development, potentially affecting the entire tech industry's approach to AI training data 1.

As the AI industry continues to grapple with these legal and ethical challenges, the resolution of this case may have far-reaching implications for the future of AI development and copyright law in the digital age.
