3 Sources
[1]
Amazon Found 'High Volume' of Child Sex Abuse Material in AI Training Data
Amazon.com Inc. reported hundreds of thousands of pieces of content last year that it believed included child sexual abuse, which it found in data gathered to improve its artificial intelligence models. Though Amazon removed the content before training its models, child safety officials said the company has not provided information about its source, potentially hindering law enforcement from finding perpetrators and protecting victims.

Throughout last year, Amazon detected the material in its AI training data and reported it to the National Center for Missing and Exploited Children, or NCMEC. The organization, which was established by Congress to field tips about child sexual abuse and share them with law enforcement, recently started tracking the number of reports specifically tied to AI products and their development. In 2025, NCMEC saw at least a fifteen-fold increase in these AI-related reports, with "the vast majority" coming from Amazon. The findings haven't been previously reported.

An Amazon spokesperson said the training data was obtained from external sources, and the company doesn't have the details about its origin that could aid investigators. It's common for companies to use data scraped from publicly available sources, such as the open web, to train their AI models. Other large tech companies have also scanned their training data and reported potentially exploitative material to NCMEC. However, the clearinghouse pointed to "glaring differences" between Amazon and its peers. The other companies collectively made just "a handful of reports," and provided more detail on the origin of the material, a top NCMEC official said.

In an emailed statement, the Amazon spokesperson said that the company is committed to preventing child sexual abuse material across all of its businesses. "We take a deliberately cautious approach to scanning foundation model training data, including data from the public web, to identify and remove known [child sexual abuse material] and protect our customers," the spokesperson said.

The spike in Amazon's reports coincides with a fast-moving AI race that has left companies large and small scrambling to acquire and ingest huge volumes of data to improve their models. But that race has also complicated the work of child safety officials -- who are struggling to keep up with the changing technology -- and challenged regulators tasked with safeguarding AI from abuse. AI safety experts warn that quickly amassing large datasets without proper safeguards comes with grave risks.

Amazon accounted for most of the more than 1 million AI-related reports of child sexual abuse material submitted to NCMEC in 2025, the organization said. It marks a jump from the 67,000 AI-related reports that came from across the tech and media industry a year prior, and just 4,700 in 2023. This category of AI-related reports can include AI-generated photos and videos, or sexually explicit conversations with AI chatbots. It can also include photos of real victims of sexual abuse that were collected, even unintentionally, in an effort to improve AI models.

Training AI on illegal and exploitative content raises newfound concerns. It could risk shaping a model's underlying behaviors, potentially improving its ability to digitally alter and sexualize photos of real children or create entirely new images of sexualized children that never existed. It also raises the threat of continuing the circulation of the images that models were trained on -- re-victimizing children who have suffered abuse.
The Amazon spokesperson said that, as of January, the company is "not aware of any instances" of its models generating child sexual abuse material. None of its reports submitted to NCMEC were of AI-generated material, the spokesperson added. Instead, the content was flagged by an automatic detection tool that compared it against a database of known child abuse material involving real victims, a process called "hashing." Approximately 99.97% of the reports resulted from scanning "non-proprietary training data," the spokesperson said. Amazon believes it over-reported these cases to NCMEC to avoid accidentally missing something. "We intentionally use an over-inclusive threshold for scanning, which yields a high percentage of false positives," the spokesperson added.

The AI-related reports received last year are just a fraction of the total number submitted to NCMEC. The larger category of reports also includes suspected child sexual abuse material sent in private messages or uploaded to social media feeds and the cloud. In 2024, for example, NCMEC received more than 20 million reports from across industry, with most coming from Meta Platforms Inc. subsidiaries Facebook, Instagram and WhatsApp. Not all reports are ultimately confirmed as containing child sexual abuse material, referred to with the acronym CSAM.

Still, the volume of suspected CSAM that Amazon detected across its AI pipeline in 2025 stunned child safety experts interviewed by Bloomberg News. The hundreds of thousands of reports made to NCMEC marked a drastic surge for the company. In 2024, Amazon and all of its subsidiaries made a total of 64,195 reports.

"This is really an outlier," said Fallon McNulty, the executive director of NCMEC's CyberTipline, the entity to which US-based social media platforms, cloud providers and other companies are legally required to report suspected CSAM. "Having such a high volume come in throughout the year begs a lot of questions about where the data is coming from, and what safeguards have been put in place."

McNulty, speaking in an interview, said she has little visibility into what's driving the surge of sexually exploitative material in Amazon's initial training data sets. Amazon has provided "very little to almost no information" in its reports about where the illicit material originally came from, who had shared it, or if it remains actively available on the internet, she said. While Amazon is not required to share this level of detail, the lack of information makes it impossible for NCMEC to track down the material's origin and work to get it removed, McNulty said. It also limits relevant law enforcement agencies tasked with searching for sex offenders and children in active danger. "There's nothing then that can be done with those reports," she said. "Our team has been really clear with [Amazon] that those reports are inactionable."

When asked why the company didn't disclose information about the possible origin of the material, or other key details, the Amazon spokesperson replied, "because of how this data is sourced, we don't have the data that comprises an actionable report." The spokesperson did not explain how the third-party data was sourced or why the company did not have sufficient information to create actionable reports. "While our proactive safeguards cannot provide the same detail in NCMEC reports as consumer-facing tools, we stand by our commitment to responsible AI and will continue our work to prevent CSAM," the spokesperson said.
NCMEC, a nonprofit, receives funding both from the US government and private industry. Amazon is among its funders and holds a corporate seat on its board.

"There should be more transparency on how companies are gathering and analyzing the data to train their models -- and how they're training them," said David Thiel, the former chief technologist at the Stanford Internet Observatory, who has researched the prevalence of child sexual abuse material in AI training data. Such data can be licensed, purchased or scraped from the internet, or could be so-called synthetic data, which is text or images created by other AI tools. As AI companies seek to release new models quickly, "the rapid gathering of data is a much higher priority than doing safety analyses," Thiel said. He warned that there are "always some errors" when it comes to sifting out CSAM from training data, and believes the industry needs to be more open about where its data is coming from.

Amazon's Bedrock offering, which gives customers access to various AI models so they can build their own AI products, includes automated detection for known CSAM and rejects and reports positive matches. The company's consumer-facing generative AI products also allow users to report content that escapes its controls. The Seattle-based tech giant scans for CSAM across its other businesses, too, including its consumer photo storage service. Amazon's cloud computing division, Amazon Web Services, also removes CSAM when it's discovered on the web services it hosts. McNulty said AWS submitted far fewer reports than came from Amazon's AI efforts. Amazon declined to break out specific reporting data across its various business units, but noted it would share broad data in March.

Only recently have technology companies really begun to scrutinize their AI models and training data for CSAM, said David Rust-Smith, a data scientist at Thorn, a nonprofit organization that provides tools to companies, including Amazon, to detect the exploitative material. "There's definitely been a big shift in the last year of people coming to us asking for help cleaning data sets," said Rust-Smith. He noted that "some of the biggest players" have sought to apply Thorn's detection tools to their training data, but declined to speak about any individual company. Amazon did not use Thorn's technology to scan its training data, the spokesperson confirmed. Rust-Smith said AI-focused companies are approaching Thorn with a newfound urgency. "People are learning what we already knew, which is, if you hoover up a ton of the internet, you're going to get [child sexual abuse material]," he said.

Amazon was not the only company to spot and report potential CSAM from its AI workflows last year. Alphabet Inc.'s Google and OpenAI told Bloomberg News that they scan AI training data for exploitative material -- a process that has surfaced potential CSAM, which the companies then reported to NCMEC. Meta and Anthropic PBC said they, too, search training data for CSAM. Meta did not comment on whether it had identified the material, but said it would report to NCMEC if it did. Anthropic said it has not reported such material out of its training data. Meta and Google said that they've taken efforts to ensure that reports related to their AI workflows are distinguishable from those generated by other parts of their business.
McNulty said that, with the exception of Amazon, the AI-related reports it received last year came in "really, really small volumes," and included key details that allowed the clearinghouse to pass on actionable information to law enforcement. "Simply flagging that you came across something but not providing any type of actionable detail doesn't help the larger child safety space," McNulty said.
[2]
Amazon discovered a 'high volume' of CSAM in its AI training data but isn't saying where it came from
The National Center for Missing and Exploited Children said it received more than 1 million reports of AI-related child sexual abuse material (CSAM) in 2025. The "vast majority" of that content was reported by Amazon, which found the material in its training data, according to an investigation by Bloomberg. Amazon said only that it obtained the inappropriate content from external sources used to train its AI services and claimed it could not provide any further details about where the CSAM came from.

"This is really an outlier," Fallon McNulty, executive director of NCMEC's CyberTipline, told Bloomberg. The CyberTipline is where many types of US-based companies are legally required to report suspected CSAM. "Having such a high volume come in throughout the year begs a lot of questions about where the data is coming from, and what safeguards have been put in place." She added that the AI-related reports the organization received from other companies last year included actionable data that it could pass along to law enforcement for next steps. Since Amazon isn't disclosing sources, McNulty said its reports have proved "inactionable."

"We take a deliberately cautious approach to scanning foundation model training data, including data from the public web, to identify and remove known [child sexual abuse material] and protect our customers," an Amazon representative said in a statement to Bloomberg. The spokesperson also said that Amazon aimed to over-report its figures to NCMEC in order to avoid missing any cases. The company said that it removed the suspected CSAM content before feeding training data into its AI models.

Safety questions for minors have emerged as a critical concern for the artificial intelligence industry in recent months. AI-related CSAM reports have skyrocketed in NCMEC's records: the organization received more than 1 million last year, up from 67,000 in 2024 and just 4,700 in 2023. In addition to issues such as abusive content being used to train models, AI chatbots have also been implicated in several dangerous or tragic cases involving young users. OpenAI and Character.AI have both been sued after teenagers planned their suicides with those companies' platforms. Meta is also being sued for alleged failures to protect teen users from sexually explicit conversations with chatbots.
[3]
Amazon found 'high volume' of child sex abuse material in AI training data, center says
Amazon reported hundreds of thousands of pieces of content last year that it believed included child sexual abuse, which it found in data gathered to improve its artificial intelligence models. Though Amazon removed the content before training its models, child safety officials said the company has not provided information about its source, potentially hindering law enforcement from finding perpetrators and protecting victims. Throughout last year, Amazon detected the material in its AI training data and reported it to the National Center for Missing and Exploited Children. The organization, which was established by Congress to field tips about child sexual abuse and share them with law enforcement, recently started tracking the number of reports specifically tied to AI products and their development. In 2025, NCMEC saw at least a 15-fold increase in these AI-related reports, with "the vast majority" coming from Amazon. The findings haven't been previously reported. An Amazon spokesperson said the training data was obtained from external sources, and the company doesn't have the details about its origin that could aid investigators. It's common for companies to use data scraped from publicly available sources, such as the open web, to train their AI models. Other large tech companies have also scanned their training data and reported potentially exploitative material to NCMEC. However, the clearinghouse pointed to "glaring differences" between Amazon and its peers. The other companies collectively made just "a handful of reports," and provided more detail on the origin of the material, a top NCMEC official said. In an emailed statement, the Amazon spokesperson said that the company is committed to preventing child sexual abuse material across all of its businesses. "We take a deliberately cautious approach to scanning foundation model training data, including data from the public web, to identify and remove known (child sexual abuse material) and protect our customers," the spokesperson said. The spike in Amazon's reports coincides with a fast-moving AI race that has left companies large and small scrambling to acquire and ingest huge volumes of data to improve their models. But that race has also complicated the work of child safety officials -- who are struggling to keep up with the changing technology -- and challenged regulators tasked with safeguarding AI from abuse. AI safety experts warn that quickly amassing large data sets without proper safeguards comes with grave risks. Amazon accounted for most of the more than 1 million AI-related reports of child sexual abuse material submitted to NCMEC in 2025, the organization said. It marks a jump from the 67,000 AI-related reports that came from across the tech and media industry a year prior, and just 4,700 in 2023. This category of AI-related reports can include AI-generated photos and videos, or sexually explicit conversations with AI chatbots. It can also include photos of real victims of sexual abuse that were collected, even unintentionally, in an effort to improve AI models. Training AI on illegal and exploitative content raises newfound concerns. It could risk shaping a model's underlying behaviors, potentially improving its ability to digitally alter and sexualize photos of real children or create entirely new images of sexualized children that never existed. It also raises the threat of continuing the circulation of the images that models were trained on -- revictimizing children who have suffered abuse. 
The Amazon spokesperson said that, as of January, the company is "not aware of any instances" of its models generating child sexual abuse material. None of its reports submitted to NCMEC were of AI-generated material, the spokesperson added. Instead, the content was flagged by an automatic detection tool that compared it against a database of known child abuse material involving real victims, a process called "hashing." Approximately 99.97% of the reports resulted from scanning "nonproprietary training data," the spokesperson said. Amazon believes it overreported these cases to NCMEC to avoid accidentally missing something. "We intentionally use an over-inclusive threshold for scanning, which yields a high percentage of false positives," the spokesperson added. The AI-related reports received last year are just a fraction of the total number submitted to NCMEC. The larger category of reports also includes suspected child sexual abuse material sent in private messages or uploaded to social media feeds and the cloud. In 2024, for example, NCMEC received more than 20 million reports from across the industry, with most coming from Meta Platforms subsidiaries Facebook, Instagram and WhatsApp. Not all reports are ultimately confirmed as containing child sexual abuse material, referred to with the acronym CSAM. Still, the volume of suspected CSAM that Amazon detected across its AI pipeline in 2025 stunned child safety experts interviewed by Bloomberg News. The hundreds of thousands of reports made to NCMEC marked a drastic surge for the company. In 2024, Amazon and all of its subsidiaries made a total of 64,195 reports. "This is really an outlier," said Fallon McNulty, the executive director of NCMEC's CyberTipline, the entity to which U.S.-based social media platforms, cloud providers and other companies are legally required to report suspected CSAM. "Having such a high volume come in throughout the year begs a lot of questions about where the data is coming from, and what safeguards have been put in place." McNulty, speaking in an interview, said she has little visibility into what's driving the surge of sexually exploitative material in Amazon's initial training data sets. Amazon has provided "very little to almost no information" in its reports about where the illicit material originally came from, who had shared it or if it remains actively available on the internet, she said. While Amazon is not required to share this level of detail, the lack of information makes it impossible for NCMEC to track down the material's origin and work to get it removed, McNulty said. It also limits relevant law enforcement agencies tasked with searching for sex offenders and children in active danger. "There's nothing then that can be done with those reports," she said. "Our team has been really clear with (Amazon) that those reports are inactionable." When asked why the company didn't disclose information about the possible origin of the material, or other key details, the Amazon spokesperson replied, "because of how this data is sourced, we don't have the data that comprises an actionable report." The spokesperson did not explain how the third-party data was sourced or why the company did not have sufficient information to create actionable reports. "While our proactive safeguards cannot provide the same detail in NCMEC reports as consumer-facing tools, we stand by our commitment to responsible AI and will continue our work to prevent CSAM," the spokesperson said. 
NCMEC, a nonprofit, receives funding both from the U.S. government and private industry. Amazon is among its funders and holds a corporate seat on its board. "There should be more transparency on how companies are gathering and analyzing the data to train their models -- and how they're training them," said David Thiel, the former chief technologist at the Stanford Internet Observatory, who has researched the prevalence of child sexual abuse material in AI training data. Such data can be licensed, purchased or scraped from the internet, or could be so-called synthetic data, which is text or images created by other AI tools. As AI companies seek to release new models quickly, "the rapid gathering of data is a much higher priority than doing safety analyses," Thiel said. He warned that there are "always some errors" when it comes to sifting out CSAM from training data, and believes the industry needs to be more open about where its data is coming from. Amazon's Bedrock offering, which gives customers access to various AI models so they can build their own AI products, includes automated detection for known CSAM and rejects and reports positive matches. The company's consumer-facing generative AI products also allow users to report content that escapes its controls. The Seattle-based tech giant scans for CSAM across its other businesses, too, including its consumer photo storage service. Amazon's cloud computing division, Amazon Web Services, also removes CSAM when it's discovered on the web services it hosts. McNulty said AWS submitted far fewer reports than came from Amazon's AI efforts. Amazon declined to break out specific reporting data across its various business units, but noted it would share broad data in March. Only recently have technology companies really begun to scrutinize their AI models and training data for CSAM, said David Rust-Smith, a data scientist at Thorn, a nonprofit organization that provides tools to companies, including Amazon, to detect the exploitative material. "There's definitely been a big shift in the last year of people coming to us asking for help cleaning data sets," said Rust-Smith. He noted that "some of the biggest players" have sought to apply Thorn's detection tools to their training data, but declined to speak about any individual company. Amazon did not use Thorn's technology to scan its training data, the spokesperson confirmed. Rust-Smith said AI-focused companies are approaching Thorn with a newfound urgency. "People are learning what we already knew, which is, if you hoover up a ton of the internet, you're going to get (child sexual abuse material)," he said. Amazon was not the only company to spot and report potential CSAM from its AI workflows last year. Alphabet's Google and OpenAI told Bloomberg News that they scan AI training data for exploitative material -- a process that has surfaced potential CSAM, which the companies then reported to NCMEC. Meta and Anthropic said they, too, search training data for CSAM. Meta did not comment on whether it had identified the material, but said it would report to NCMEC if it did. Anthropic said it has not reported such material out of its training data. Meta and Google said that they've taken efforts to ensure that reports related to their AI workflows are distinguishable from those generated by other parts of their business. 
McNulty said that, with the exception of Amazon, the AI-related reports it received last year came in "really, really small volumes," and included key details that allowed the clearinghouse to pass on actionable information to law enforcement. "Simply flagging that you came across something but not providing any type of actionable detail doesn't help the larger child safety space," McNulty said. Bloomberg's Alexandra S. Levine contributed.
Amazon reported hundreds of thousands of suspected child sexual abuse material cases found in its AI training data to NCMEC in 2025, accounting for the vast majority of over 1 million AI-related reports. Child safety officials have expressed concern over Amazon's lack of detail about the material's origin, which hinders law enforcement efforts to identify perpetrators and protect victims.
Amazon detected and reported hundreds of thousands of suspected child sexual abuse material cases in its AI training data throughout 2025, according to a Bloomberg investigation [1]. The National Center for Missing and Exploited Children received more than 1 million AI-related CSAM reports last year, with the vast majority coming from Amazon [2]. This marks a dramatic 15-fold increase from the 67,000 AI-related reports NCMEC received in 2024 and just 4,700 in 2023 [3]. The scale of Amazon's reporting stands in stark contrast to other major tech companies, which collectively submitted only "a handful of reports" during the same period.
Source: Engadget
Child safety officials have identified "glaring differences" between Amazon and its peers in how they handle these discoveries. While Amazon removed the content before training its AI models, the company has not provided information about the source of the material, potentially hindering law enforcement from finding perpetrators and protecting victims [1]. An Amazon spokesperson stated that the training data was obtained from external sources and the company doesn't have details about its origin that could aid investigators. "This is really an outlier," said Fallon McNulty, executive director of NCMEC's CyberTipline. "Having such a high volume come in throughout the year begs a lot of questions about where the data is coming from, and what safeguards have been put in place." [2]

McNulty noted that reports from other companies included actionable data for law enforcement, while Amazon's reports have proved "inactionable." Amazon employs a detection tool that uses "hashing" to compare content against a database of known child abuse material involving real victims. The company stated that approximately 99.97% of the reports resulted from scanning "non-proprietary training data" [3]. In an emailed statement, an Amazon spokesperson said the company "takes a deliberately cautious approach to scanning foundation model training data, including data from the public web, to identify and remove known child sexual abuse material and protect our customers." The company indicated it intentionally uses an over-inclusive threshold for scanning, which yields a high percentage of false positives, believing it over-reported cases to avoid accidentally missing something. As of January, Amazon stated it is "not aware of any instances" of its AI models generating child sexual abuse material, and none of its reports to NCMEC were of AI-generated content.
Source: Bloomberg
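Neither Bloomberg nor Amazon describes the pipeline beyond the word "hashing," but the general technique is straightforward to sketch: compute a fingerprint for each file in a corpus and test it against a list of fingerprints of known, verified material supplied by a clearinghouse. The Python sketch below is a minimal illustration of that idea using exact SHA-256 matching; it is not Amazon's implementation, and the hash value, directory name, and function names are hypothetical. Production systems typically rely on perceptual hashes (for example, Microsoft's PhotoDNA or the matching in Thorn's Safer), which still match images after resizing or re-encoding.

```python
import hashlib
from pathlib import Path

# Hypothetical hash list: hex digests of known, verified abuse imagery.
# Real lists are maintained by clearinghouses such as NCMEC and partners
# like Thorn, and are distributed as hashes, never as the images themselves.
KNOWN_HASHES = {
    "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b",  # placeholder
}


def sha256_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def scan_training_files(paths):
    """Yield paths whose digest matches the hash list, so they can be
    excluded from the training set and reported."""
    for path in paths:
        if sha256_digest(path) in KNOWN_HASHES:
            yield path


if __name__ == "__main__":
    # Hypothetical directory of crawled images awaiting ingestion.
    flagged = list(scan_training_files(Path("crawl_shard_000").glob("*.jpg")))
    print(f"{len(flagged)} files flagged for removal and reporting")
```

An exact cryptographic hash only matches byte-identical copies, which is one reason real deployments favor perceptual hashing with a similarity threshold; tuning that threshold to be "over-inclusive," as Amazon describes, trades a higher false-positive rate for fewer missed matches.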
The spike in Amazon's reports coincides with a fast-moving AI race that has companies scrambling to acquire and ingest huge volumes of data to improve their models [3]. This rush has complicated the work of child safety officials who are struggling to keep up with changing technology and challenged regulators tasked with safeguarding AI from abuse. AI safety experts warn that quickly amassing large datasets without proper safeguards comes with grave risks. Training AI models on illegal and exploitative content raises concerns about shaping a model's underlying behaviors, potentially improving its ability to digitally alter and sexualize photos of real children or create entirely new images that never existed. It also raises the threat of continuing the circulation of images that models were trained on, re-victimizing children who have suffered abuse.

The AI-related reports represent a fraction of the total reports submitted to NCMEC. In 2024, NCMEC received more than 20 million reports from across the industry, with most coming from Meta Platforms subsidiaries including Facebook, Instagram, and WhatsApp [3]. Not all reports are ultimately confirmed as containing CSAM. The category of AI-related CSAM reports can include AI-generated photos and videos, sexually explicit conversations with AI chatbots, or photos of real victims collected unintentionally during efforts to improve AI models. Recent months have seen safety questions for minors emerge as a critical concern for the artificial intelligence industry. OpenAI and Character.AI have both faced lawsuits after teenagers planned their suicides using those companies' platforms, while Meta is being sued for alleged failures to protect teen users from sexually explicit conversations with chatbots [2].

Summarized by Navi