OpenAI ordered to hand over 20 million ChatGPT logs in New York Times copyright lawsuit

Reviewed by Nidhi Govil


A federal judge ruled that OpenAI must turn over 20 million de-identified ChatGPT logs to The New York Times and other publishers in a copyright infringement lawsuit. The decision, issued by U.S. Magistrate Judge Ona T. Wang, rejected OpenAI's privacy objections and could reshape how AI companies handle training data and user information. OpenAI has appealed the ruling, warning it undermines user trust.

OpenAI Court Ruling Forces Disclosure of Millions of Chat Records

U.S. Magistrate Judge Ona T. Wang delivered a decisive blow to OpenAI on December 3, ordering the company to produce 20 million de-identified ChatGPT user logs as evidence in The New York Times lawsuit [1]. The federal judge rejected OpenAI's motion to reconsider an earlier discovery order, finding that the company failed to demonstrate any change in law, new evidence, or clear error that would justify blocking the production [5]. The copyright lawsuit, filed in December 2023, accuses OpenAI of using Times content without permission to train its AI systems [1]. The anonymized chat logs represent a statistically valid monthly sample spanning December 2022 through November 2024 [1].

Source: Decrypt

The New York Times Seeks Evidence to Prove Copyright Infringement

News publishers have been seeking access to ChatGPT output logs since May 2024 to determine whether the conversational AI reproduced their copyrighted works [5]. The New York Times and other plaintiffs, including newspapers owned by Alden Global Capital's MediaNews Group, argued that the logs are necessary to rebut OpenAI's assertion that the publishers "hacked" ChatGPT's responses to manufacture evidence [1]. OpenAI countered that turning over the logs would disclose confidential user information and that "99.99%" of the transcripts have nothing to do with the infringement allegations [3]. MediaNews Group executive editor Frank Pine stated that OpenAI's leadership was "hallucinating when they thought they could get away with withholding evidence about how their business model relies on stealing from hardworking journalists" [3].

User Privacy Concerns Clash With Evidentiary Needs

Judge Wang acknowledged that "privacy considerations of OpenAI's users are sincere" but ruled that such considerations "cannot predominate where there is clear relevance and minimal burden" [2]. The court emphasized that "there are multiple layers of protection in this case precisely because of the highly sensitive and private nature of much of the discovery" [3]. OpenAI must complete its "exhaustive de-identification" process within seven days before handing over the logs [3]. Despite these safeguards, OpenAI COO Brad Lightcap warned in an October 22 statement that the demand "fundamentally conflicts with the privacy commitments we have made to our users" and "abandons long-standing privacy norms" [1]. The company has separately appealed the ruling to U.S. District Judge Sidney Stein, arguing the disclosure could undermine user trust [5].

Source: CXOToday

Implications for Training AI Models and Industry Practices

This case joins a growing wave of copyright challenges aimed at how AI labs source and use training data [2]. The outcome could shape how tech firms such as OpenAI, Anthropic, and Perplexity license content and build guardrails around what their AI systems can output [2]. As many as 60 copyright suits have been filed in the US alone, with competitors including Meta and Microsoft facing similar scrutiny over unauthorized data scraping [4]. Legal experts say AI startups will now need to justify their data practices, given that the court dismissed OpenAI's burden arguments and prioritized evidentiary needs [4]. The ruling could accelerate paid data partnerships and licensing agreements, potentially costing AI companies billions as publishers seek compensation for their intellectual property [4].

Source: ET

What This Means for ChatGPT Users and AI Accountability

For current ChatGPT users, OpenAI is under no obligation to preserve new consumer ChatGPT or API data indefinitely [1]. Deleted ChatGPT conversations and Temporary Chats are automatically removed from OpenAI's systems within 30 days [1]. However, the historical data covered by this ruling may include some older chats, though OpenAI says it has already stripped all identifying information [1]. Privacy advocates warn that de-identified data could still be reverse-engineered to reveal sensitive information [4]. This marks the first time OpenAI has been forced to disclose chat data in litigation, signaling that once courts become involved in copyright claims, previous assurances around data privacy become less certain [1]. The case provides a preview of the transparency and AI accountability battles to come, as regulators worldwide grapple with drawing guardrails around innovation while protecting user trust and intellectual property [4].
