YouTube Creators Sue Apple and Amazon Over Alleged AI Training Video Scraping

YouTube content creators have filed class action lawsuits against Apple and Amazon, accusing both tech giants of illegally scraping millions of videos to train their generative AI models. The creators claim the companies circumvented YouTube's protections using sophisticated techniques, violating the Digital Millennium Copyright Act and copyright law.

YouTubers Launch Legal Battle Against Tech Giants Over AI Training

YouTube content creators are taking Apple and Amazon to court in separate class action lawsuits, alleging both tech giants engaged in illegal scraping of videos to train their generative AI models without permission or compensation. The lawsuits, filed by creators including Ted Entertainment (the company behind h3h3 Productions and H3 Podcast), Matt Fisher of MrShortGame Golf, and Golfholics Inc., represent channels with more than 2.6 million combined subscribers and approximately four billion views [3].

Source: MacRumors

The creators argue that although their YouTube videos are publicly available to watch, both companies went beyond simple viewing by circumventing protections designed to prevent bulk downloading. The cases are a critical test of how copyright law applies to data acquisition for AI training as tech companies race to develop competitive video generation capabilities.

Amazon Accused of Bypassing Safeguards for Nova Reel Development

The lawsuit against Amazon targets the company's Nova Reel, a text-to-video generative AI model launched in December 2024 and made available through Amazon Bedrock [3]. According to the complaint filed in federal court in Seattle, Amazon used automated tools combined with virtual machines and rotating IP addresses to evade YouTube's detection and blocking systems [1].

The plaintiffs claim Amazon relied on two academic datasets: HD-VILA-100M, produced by Microsoft Research Asia in 2021, and HD-VG-130M, developed by researchers from Peking University and Microsoft [3]. While these datasets contain URL identifiers rather than actual video files, the lawsuit alleges Amazon downloaded the videos directly from YouTube to train Nova Reel. This approach allegedly allowed Amazon to compete with rivals like OpenAI's Sora and Google Veo in the enterprise AI market.

Apple Faces Similar Allegations Over Panda-70M Dataset Usage

Apple confronts parallel copyright infringement allegations centered on the Panda-70M dataset, which indexes millions of YouTube clips for AI training [4]. The same group of YouTubers filed a separate class action lawsuit claiming Apple used their content to train video generation models described in a research paper titled "STIV: Scalable Text and Image Conditioned Video Generation" published in late 2024 [5].

Source: Engadget

The complaint states that extracting clips from Panda-70M requires independently accessing source videos on YouTube and isolating designated segments, with each clip representing a separate act of circumvention [5]. The plaintiffs claim their content appears more than 500 times in the dataset and that Apple circumvented YouTube's anti-scraping protections to download and use the underlying content.

Digital Millennium Copyright Act at Center of Legal Strategy

Both lawsuits invoke Section 1201 of the Digital Millennium Copyright Act, the anti-circumvention provision that prohibits bypassing technological protection measures copyright holders use to restrict access to their works [3]. The creators argue that YouTube's systems for protecting video files are technological protection measures, and that deliberately circumventing them is a violation regardless of whether the content is publicly viewable.

Source: CNET

This legal theory, if successful in court, would establish that downloading YouTube videos for AI training purposes violates the Digital Millennium Copyright Act even when content is accessible to regular viewers, because circumventing the technical mechanisms that enforce terms of service crosses statutory boundaries. The plaintiffs emphasize what they describe as permanent harm: "Once AI ingests content, that content is stored in its neural network and not capable of deletion or retraction" [3].

Broader Pattern of AI Copyright Litigation Emerges

These cases represent just two fronts in a wider legal campaign. The same YouTubers have filed similar lawsuits against Meta, Nvidia, ByteDance, Snap, and OpenAI [2]. The creators are seeking both damages and an injunction, the latter potentially forcing companies to stop distributing models trained on disputed material or to retrain them without the contested content [3].

These lawsuits land at a pivotal moment for the AI industry, as courts weigh whether training on copyrighted material qualifies as fair use and how much control copyright holders retain once their work is used to build these systems [1]. While earlier disputes centered on written material, AI video generators have emerged more recently, creating new legal battlegrounds. The outcome could reshape how tech companies collect training data and force greater transparency about AI training practices, potentially slowing development timelines or increasing costs as companies seek licensed content rather than scraped material.

TheOutpost.ai

© 2026 Triveous Technologies Private Limited