6 Sources
[1]
YouTubers Sue Amazon, Claim AI Tool Was Trained on Scraped Videos
A group of YouTube creators is suing Amazon, accusing the tech giant of secretly scraping their videos to train its AI video model without permission. The proposed class action lawsuit, filed in federal court in Seattle, alleges Amazon used automated tools to download and extract data from millions of YouTube videos to build and improve its Nova Reel generative AI system, a model that can create short videos from text prompts and images. At the center of the complaint is how that data was obtained. The plaintiffs claim that Amazon bypassed YouTube's protections using virtual machines and rotating IP addresses to avoid detection, effectively sidestepping the platform's safeguards against bulk downloading. The lawsuit was brought by several creators, including Ted Entertainment (the company behind the H3 Podcast and h3h3 Productions), along with individual YouTubers and channel operators. They argue that the alleged scraping violated copyright law and the Digital Millennium Copyright Act, and are seeking damages as well as an injunction to stop the practice. Amazon didn't respond to a request for comment. The case lands at a pivotal moment for generative AI, as courts weigh whether training on copyrighted material qualifies as fair use and how much control creators retain once their work is used to build these systems. The disputes have often centered on written material, which has been at the center of the AI revolution for several years, while AI video generators such as OpenAI's Sora and Google's Veo have emerged more recently. 
The lawsuit is one of dozens testing the boundaries of AI training practices, alongside high-profile cases from authors, artists and news organizations, including lawsuits against OpenAI and Meta, all circling the same unresolved question: Where does fair use end and infringement begin?
[2]
Three YouTubers accuse Apple of illegal scraping to train its AI models
Three YouTube channels have banded together and filed a class action lawsuit against Apple, as first spotted by MacRumors. According to the lawsuit, the creators behind h3h3 Productions, MrShortGame Golf and Golfholics have accused Apple of violating the Digital Millennium Copyright Act by scraping copyrighted videos on YouTube to train its AI models. While the YouTubers' videos are available to watch on the platform, the lawsuit alleged that Apple illegally circumvented the "controlled streaming architecture" that regular users are limited to. The creators claimed that Apple's video scraping was used to train its generative AI products, adding that the tech giant's "massive financial success would not have been possible without the video content created" by the YouTubers. MacRumors noted that these YouTube channels have also filed similar lawsuits against other tech companies, including Meta, Nvidia, ByteDance and Snap. It's not the first time a company's alleged AI training methods have gotten it into legal trouble. OpenAI and Microsoft were both accused of using copyrighted articles from the NYTimes to train their AI chatbots. Similarly, Perplexity was recently sued by Reddit and Encyclopedia Britannica for alleged copyright and trademark infringements. Last year, Apple was also named in a separate class action lawsuit from two neuroscience professors who claimed their copyrighted works were used without permission. We reached out to Apple for comment and will update the story when we hear back.
[3]
YouTubers sue Amazon for allegedly scraping their videos to train Nova Reel
In short: Three YouTube content creators, specifically the company behind H3H3 Productions, a solo golf presenter, and a golf channel, have filed a proposed class action lawsuit in Seattle alleging that Amazon bypassed YouTube's technical protections using virtual machines and rotating IP addresses to scrape their videos without consent, feeding the footage into training datasets for Nova Reel, its generative video AI model available through Amazon Bedrock. The suit invokes the anti-circumvention provisions of the Digital Millennium Copyright Act and is the latest in a series of similar cases the same group has filed against Nvidia, Meta, ByteDance, Snap, OpenAI, and Apple. Ted Entertainment Inc., the company behind H3H3 Productions and H3 Podcast Highlights, the YouTube channels run by Ethan and Hila Klein, filed the complaint in the US District Court for the Western District of Washington alongside Matt Fisher, who runs the MrShortGame Golf channel, and Golfholics Inc. The three plaintiffs collectively account for more than 2.6 million YouTube subscribers, approximately four billion combined views, and more than 5,800 original videos. The suit names Amazon as the defendant and targets Nova Reel specifically as the product built, in part, on their content. The lawsuit rests on Section 1201 of the Digital Millennium Copyright Act, the anti-circumvention provision that prohibits bypassing technological protection measures put in place by copyright holders to restrict access to their works. The plaintiffs argue that YouTube's systems for protecting its video files constitute such technological protection measures and that Amazon circumvented them deliberately and at scale to extract training data. 
If the theory holds in court, it would establish that the act of downloading YouTube videos for AI training purposes constitutes a DMCA violation regardless of whether the content is publicly viewable, because circumventing the technical mechanisms that enforce terms of service crosses the statutory line. The complaint draws attention to what the plaintiffs describe as the permanent nature of the harm: "Once AI ingests content, that content is stored in its neural network and not capable of deletion or retraction." The plaintiffs are seeking both damages and injunctive relief, the latter potentially forcing Amazon to stop distributing a model trained in part on their content or to retrain it without the disputed material. The complaint centres on two academic datasets: HD-VILA-100M, produced by Microsoft Research Asia in 2021, and HD-VG-130M, produced by researchers from Peking University and Microsoft. Both were published for academic purposes and consist of URL identifiers pointing to YouTube videos rather than the video files themselves. That distinction is legally significant: to use either dataset for AI model training, a company must download the actual video files from YouTube, and the plaintiffs allege Amazon did exactly that. According to the complaint, Amazon did not simply download the videos. It deployed automated programmes combined with virtual machines that rotated IP addresses continuously to evade YouTube's detection and blocking systems. The combination of these technical measures, namely automated mass downloading, virtual machine infrastructure, and IP rotation, is characterised in the complaint as a deliberate circumvention of the technological protection measures YouTube maintains over its video library. The same evasion pattern was alleged in this group of plaintiffs' earlier suit against Nvidia, which the complaint in that case said had downloaded 38.5 million video URLs using comparable infrastructure. 
Nova Reel is Amazon's text-to-video generative AI model, launched in December 2024 and made available through Amazon Bedrock. The model accepts text prompts and images as inputs and generates video clips ranging from six seconds to two minutes in length, with a watermarking feature that Amazon positions as a content authenticity measure. It sits within the broader Nova model family, which Amazon has been expanding across text, image, and video modalities as competition in enterprise AI accelerates. The competitive pressure on Amazon to build capable video AI is substantial. Nova Reel represents the company's attempt to compete with Sora, Google Veo, and other text-to-video systems for enterprise workloads. Amazon's wider AI infrastructure investment, including its partnership with Uber to deploy custom Trainium chips for large-scale model training via AWS, reflects the breadth of the company's ambitions across the AI stack, from cloud compute to generative media. The capital available to frontier AI developers has intensified the competitive pressure to acquire training data at speed and at scale, with SoftBank's $40 billion bridge loan to OpenAI illustrating the resources flowing into the race for generative AI supremacy. The three plaintiffs arrived at this complaint with prior litigation experience. The year 2025 was one in which AI training data practices moved from an industry footnote to the subject of co-ordinated legal action. In December 2025, Ted Entertainment, Fisher, and Golfholics filed a proposed class action against Nvidia in California federal court, alleging that Nvidia scraped their YouTube content using the same HD-VILA-100M and HD-VG-130M datasets and the same IP-rotation and virtual machine infrastructure to train its Cosmos video model. In January 2026 the group extended the strategy, filing suits against Meta, ByteDance, and Snap. 
In the first week of April, parallel complaints against OpenAI and Apple were filed in the Northern District of California. The Amazon suit, filed in Seattle, is the most recent entry in the sequence. The suits arrive as the broader wave of copyright litigation against AI developers continues to grow. The number of US copyright cases filed against AI companies has now surpassed 100, a figure that includes a March 2026 complaint from Encyclopaedia Britannica and Merriam-Webster against OpenAI, alleging that nearly 100,000 of Britannica's articles were used as training inputs without consent. That case, like the YouTuber suits, relies on the argument that AI developers have systematically extracted content from publishers and creators whose work underpins the capabilities that those developers are now commercialising. The academic dataset mechanism sits at the centre of what the plaintiffs' legal theory is attempting to challenge. By alleging that downloading video files pointed to by an academic URL index constitutes a DMCA violation, the suits target the gap between the published URL list -- which carries a veneer of academic legitimacy -- and the actual extraction activity required to use it. Questions about how frontier AI models source and handle their training data have come into sharper focus in 2026, as scrutiny of the industry's data supply chain has intensified. If courts accept the plaintiffs' reading of Section 1201, the practical consequence would be that AI developers using academic video URL datasets as a path to training footage face the same exposure as developers who downloaded that footage directly. Amazon, like the other defendants in this series of suits, has not commented publicly on the filing.
[4]
Apple sued for allegedly scraping 70 million YouTube videos
A dataset called Panda-70M is at the center, indexing millions of YouTube clips for AI training. Apple's AI plans are facing a familiar challenge as creators push back, and now they are taking the issue to court. Three YouTube channels have sued Apple, claiming the company secretly collected videos from the platform to train its AI models (via MacRumors). The case involves well-known channels like h3h3Productions and golf creators such as MrShortGame Golf and Golfholics. They say Apple used their videos without permission, payment, or even basic credit. The lawsuit says Apple did more than just link to content. It claims Apple got around YouTube's protections to download and use videos directly. The creators argue this breaks the Digital Millennium Copyright Act, which bans getting past systems meant to protect copyrighted material. The lawsuit also says Apple made significant profits by using creators' content to build its AI system, without giving anything back to the people who made the videos. The lawsuit highlights a dataset called Panda-70M, which Apple researchers mentioned in a 2025 paper on video-generation AI. Panda-70M is a large index of YouTube clips, segmented and organized by URLs, timestamps, and identifiers. To use these clips, someone would have to access and extract each one from YouTube. The plaintiffs say that accessing these clips means getting around YouTube's safeguards, making each clip a separate act of scraping. They also claim their own videos appear hundreds of times in the dataset. Apple has not explained in detail how it handled the data, but its research papers confirm that YouTube videos were used in its AI training process. At this point, the lawsuit is asking for damages and possibly an injunction.
[5]
Apple accused of scraping millions of YouTube videos for AI training - 9to5Mac
Lawsuit says Apple used a dataset comprising millions of YouTube videos to train an AI model, as described in a study published in late 2024. As spotted by MacRumors, a proposed class action lawsuit filed by Ted Entertainment, Matt Fisher, and Golfholics alleges that Apple circumvented YouTube's anti-scraping protections to download millions of videos. According to the document, a team of Apple researchers published a study called STIV: Scalable Text and Image Conditioned Video Generation, in which they say they used a dataset called Panda-70M to help train the video generation model described in the paper. From the lawsuit: "The Panda 70M dataset functions as a map or index file identifying specific YouTube videos and clips by URL, video identifier, and timestamp. A single YouTube video may be divided into numerous clips, each treated as a separate training sample. Extracting any clip requires independently accessing the source video on YouTube and isolating the designated segment, a process that constitutes a separate act of circumvention for each clip retrieved." The plaintiffs claim their content appears more than 500 times in the dataset, and are seeking to represent "all others similarly situated" as part of a proposed class action. In essence, they argue that while the dataset only provides links to the videos, Apple allegedly circumvented YouTube's anti-scraping protections to download and use the underlying content to train its AI models. For this reason, they demand a jury trial on all claims. In addition to Apple, the plaintiffs have filed proposed class action suits against Amazon and OpenAI, alleging that both companies also used the Panda-70M dataset in their own AI model training processes.
[6]
Apple Sued by Three YouTube Channels
Three established YouTube channels have sued Apple, alleging that the company violated the U.S. Digital Millennium Copyright Act (DMCA) by unlawfully accessing and scraping millions of copyrighted videos from YouTube to train its AI models. In a class action lawsuit filed in California federal court last week, the owners of the YouTube channels h3h3Productions (plus H3 Podcast and H3 Podcast Highlights), MrShortGame Golf, and Golfholics alleged that Apple "deliberately circumvented" YouTube's protections against video scraping and "profited substantially" by doing so. Apple's research papers indicate that some of the YouTube videos uploaded by the plaintiffs were used to train its AI models, the complaint alleged. Apple's actions were "not only unlawful, but an unconscionable attack on the community of content creators whose content is used to fuel the multi-trillion-dollar generative AI industry without any compensation," the lawsuit adds. The plaintiffs are seeking an injunction and damages individually and on behalf of all others similarly situated in the U.S., per the complaint. In recent months, the same three YouTube channels have filed similar lawsuits against other tech giants, including Meta, Nvidia, ByteDance, and Snap. h3h3Productions is a well-known YouTube channel created by Ethan Klein and Hila Klein, and they later created the H3 Podcast. Their channels have millions of followers, while MrShortGame Golf and Golfholics have hundreds of thousands of followers.
YouTube content creators have filed class action lawsuits against Apple and Amazon, accusing both tech giants of illegally scraping millions of videos to train their generative AI models. The creators claim the companies circumvented YouTube's protections using sophisticated techniques, violating the Digital Millennium Copyright Act and copyright law.
YouTube content creators are taking Apple and Amazon to court in separate class action lawsuits, alleging both tech giants engaged in illegal scraping of videos to train their generative AI models without permission or compensation. The lawsuits, filed by creators including Ted Entertainment (the company behind h3h3 Productions and H3 Podcast), Matt Fisher of MrShortGame Golf, and Golfholics Inc., represent channels with more than 2.6 million combined subscribers and approximately four billion views [3].
The creators argue that while their YouTube videos are publicly available to watch, both companies allegedly went beyond simple viewing by circumventing protections designed to prevent bulk downloading. This represents a critical test of how copyright law applies to data acquisition for AI training in an era where tech companies race to develop competitive video generation capabilities.
The lawsuit against Amazon targets the company's Nova Reel, a text-to-video generative AI model launched in December 2024 and made available through Amazon Bedrock [3]. According to the complaint filed in federal court in Seattle, Amazon used automated tools combined with virtual machines and rotating IP addresses to evade YouTube's detection and blocking systems [1].
The plaintiffs claim Amazon relied on two academic datasets: HD-VILA-100M, produced by Microsoft Research Asia in 2021, and HD-VG-130M, developed by researchers from Peking University and Microsoft [3]. While these datasets contain URL identifiers rather than actual video files, the lawsuit alleges Amazon downloaded the videos directly from YouTube to train Nova Reel. This approach allegedly allowed Amazon to compete with rivals like OpenAI's Sora and Google Veo in the enterprise AI market.
Apple confronts parallel copyright infringement allegations centered on the Panda-70M dataset, which indexes millions of YouTube clips for AI training [4]. The same group of YouTubers filed a separate class action lawsuit claiming Apple used their content to train video generation models described in a research paper titled "STIV: Scalable Text and Image Conditioned Video Generation" published in late 2024 [5].
The complaint states that extracting clips from Panda-70M requires independently accessing source videos on YouTube and isolating designated segments, with each clip representing a separate act of circumvention [5]. The plaintiffs claim their content appears more than 500 times in the dataset and that Apple circumvented YouTube's anti-scraping protections to download and use the underlying content.
Both lawsuits invoke Section 1201 of the Digital Millennium Copyright Act, the anti-circumvention provision that prohibits bypassing technological protection measures copyright holders use to restrict access to their works [3]. The creators argue that YouTube's systems for protecting video files constitute technological protection measures, and that circumventing them deliberately constitutes a violation regardless of whether content is publicly viewable.
This legal theory, if successful in court, would establish that downloading YouTube videos for AI training purposes violates the Digital Millennium Copyright Act even when the content is accessible to regular viewers, because circumventing the technical mechanisms that enforce terms of service crosses the statutory line. The plaintiffs emphasize what they describe as permanent harm: "Once AI ingests content, that content is stored in its neural network and not capable of deletion or retraction" [3].
These cases represent just two fronts in a wider legal campaign. The same YouTubers have filed similar lawsuits against Meta, Nvidia, ByteDance, Snap, and OpenAI [2][3]. The creators are seeking both damages and an injunction, the latter potentially forcing companies to stop distributing models trained on disputed material or to retrain them without the contested content [3].
These lawsuits land at a pivotal moment for the AI industry, as courts weigh whether training on copyrighted material qualifies as fair use and how much control copyright holders retain once their work is used to build these systems [1]. While earlier disputes centered on written material, AI video generators have emerged more recently, creating new legal battlegrounds. The outcome could reshape how tech companies approach training data collection and force greater transparency about AI training practices, potentially slowing development timelines or increasing costs as companies seek licensed content rather than scraped material.
Summarized by Navi