4 Sources
[1]
OpenAI fights order to hand over 20 million private ChatGPT...
There are very few things OpenAI has said or done over the last couple years that I agree with or am comfortable with, but there are exceptions to every rule and this is one of them. Sorry NYT, if I were a subscriber to that or other AI platform, you don't get to just rifle through all of my (and other people's) chats to see if I'm one of the tiny fraction of users trying to get around your paywall. I'm not unsympathetic to authors and book publishers who want to hammer these assholes for copyright infringement for example, but media outlets will have to find a way to work with these companies, cooperating on how news info gets "regurgitated" by LLMs going forward because going through user prompts as a discovery method is bullshit.
[2]
OpenAI fights order to hand over 20 million private ChatGPT conversations
OpenAI wants a court to reverse a ruling forcing the ChatGPT maker to give 20 million user chats to The New York Times and other news plaintiffs that sued it over alleged copyright infringement. Although OpenAI previously offered 20 million user chats as a counter to the NYT's demand for 120 million, the AI company says a court order requiring production of the chats is too broad. "The logs at issue here are complete conversations: each log in the 20 million sample represents a complete exchange of multiple prompt-output pairs between a user and ChatGPT," OpenAI said today in a filing in US District Court for the Southern District of New York. "Disclosure of those logs is thus much more likely to expose private information [than individual prompt-output pairs], in the same way that eavesdropping on an entire conversation reveals more private information than a 5-second conversation fragment." OpenAI's filing said that "more than 99.99%" of the chats "have nothing to do with this case." It asked the district court to "vacate the order and order News Plaintiffs to respond to OpenAI's proposal for identifying relevant logs." OpenAI could also seek review in a federal court of appeals. OpenAI posted a message on its website to users today saying that "The New York Times is demanding that we turn over 20 million of your private ChatGPT conversations" in order to "find examples of you using ChatGPT to try to get around their paywall." ChatGPT users concerned about privacy have more to worry about than the NYT case. For example, ChatGPT conversations have been found in Google search results and the Google Search Console tool that developers can use to monitor search traffic. OpenAI today said it plans to develop "advanced security features designed to keep your data private, including client-side encryption for your messages with ChatGPT."
OpenAI: AI chats should be treated like private emails
OpenAI's court filing argues that the chat log production should be narrowed based on the relevance of chats to the case. "OpenAI is unaware of any court ordering wholesale production of personal information at this scale," the filing said. "This sets a dangerous precedent: it suggests that anyone who files a lawsuit against an AI company can demand production of tens of millions of conversations without first narrowing for relevance. This is not how discovery works in other cases: courts do not allow plaintiffs suing Google to dig through the private emails of tens of millions of Gmail users irrespective of their relevance. And it is not how discovery should work for generative AI tools either." A November 7 order by US Magistrate Judge Ona Wang sided with the NYT, saying that OpenAI must "produce the 20 million de-identified Consumer ChatGPT Logs to News Plaintiffs by November 14, 2025, or within 7 days of completing the de-identification process." Wang ruled that the production must go forward even though the parties don't agree on whether the logs must be produced in full: "Whether or not the parties had reached agreement to produce the 20 million Consumer ChatGPT Logs in whole -- which the parties vehemently dispute -- such production here is appropriate. OpenAI has failed to explain how its consumers' privacy rights are not adequately protected by: (1) the existing protective order in this multidistrict litigation or (2) OpenAI's exhaustive de-identification of all of the 20 million Consumer ChatGPT Logs." OpenAI's filing today said the court order "did not acknowledge OpenAI's sworn witness declaration explaining that the de-identification process is not intended to remove information that is non-identifying but may nonetheless be private, like a Washington Post reporter's hypothetical use of ChatGPT to assist in the preparation of a news article."
Chats stored under legal hold
The 20 million chats consist of a random sampling of ChatGPT conversations from December 2022 to November 2024 and do not include chats of business customers, OpenAI said in the message on its website. "We presented several privacy-preserving options to The Times, including targeted searches over the sample (e.g., to search for chats that might include text from a New York Times article so they only receive the conversations relevant to their claims), as well as high-level data classifying how ChatGPT was used in the sample. These were rejected by The Times," OpenAI said. The chats are stored in a secure system that is "protected under legal hold, meaning it can't be accessed or used for purposes other than meeting legal obligations," OpenAI said. The NYT "would be legally obligated at this time to not make any data public outside the court process," and OpenAI said it will fight any attempts to make the user conversations public. An NYT filing on October 30 accused OpenAI of defying prior agreements "by refusing to produce even a small sample of the billions of model outputs that its conduct has put in issue in this case." The filing continued: "Immediate production of the output log sample is essential to stay on track for the February 26, 2026, discovery deadline. OpenAI's proposal to run searches on this small subset of its model outputs on Plaintiffs' behalf is as inefficient as it is inadequate to allow Plaintiffs to fairly analyze how 'real world' users interact with a core product at the center of this litigation. Plaintiffs cannot reasonably conduct expert analyses about how OpenAI's models function in its core consumer-facing product, how retrieval augmented generation ('RAG') functions to deliver news content, how consumers interact with that product, and the frequency of hallucinations without access to the model outputs themselves."
OpenAI said the NYT's discovery requests were initially limited to logs "related to Times content" and that it has "been working to satisfy those requests by sampling conversation logs. Towards the end of that process, News Plaintiffs filed a motion with a new demand: that instead of finding and producing logs that are 'related to Times content,' OpenAI should hand over the entire 20 million-log sample 'via hard drive.'"
OpenAI disputes judge's reasoning
The November 7 order cited a California case, Concord Music Group, Inc. v. Anthropic PBC, in which US District Magistrate Judge Susan van Keulen ordered the production of 5 million records. OpenAI consistently relied on van Keulen's use of a sample-size formula "in support of its previous proposed methodology for conversation data sampling, but fails to explain why Judge [van] Keulen's subsequent order directing production of the entire 5-million record sample to the plaintiff in that case is not similarly instructive here," Wang wrote. OpenAI's filing today said the company was never given an opportunity to explain why Concord shouldn't apply in this case because the news plaintiffs did not reference it in their motion. "The cited Concord order was not about whether wholesale production of the sample was appropriate; it was about the mechanism through which Anthropic would effectuate an already agreed-upon production," OpenAI wrote. "Nothing about that order suggests that Judge van Keulen would have ordered wholesale production had Anthropic raised the privacy concerns that OpenAI has raised throughout this case." The Concord logs were just prompt-output pairs, "i.e., a single user prompt followed by a single model output," OpenAI wrote. "The logs at issue here are complete conversations: each log in the 20 million sample represents a complete exchange of multiple prompt-output pairs between a user and ChatGPT." That could result in "up to 80 million prompt-output pairs," OpenAI said.
We contacted The New York Times about OpenAI's filing and will update this article if it provides any comment.
[3]
NYT lawsuit forces OpenAI to keep 20M ChatGPT chats
OpenAI argued that the order presents significant engineering challenges and conflicts with international data protection regulations like GDPR. OpenAI is engaged in a legal battle over a U.S. court order compelling it to indefinitely retain 20 million randomly sampled ChatGPT conversations as part of its copyright infringement lawsuit with The New York Times. The preservation order, issued on May 13 and affirmed by District Judge Sidney Stein on June 26, forces OpenAI to hold user data indefinitely, a move that directly conflicts with the company's standard 30-day deletion policy for unsaved chats. The order affects data from December 2022 through November 2024 for ChatGPT Free, Plus, Pro, and Team subscribers, as well as API customers without Zero Data Retention agreements. Enterprise, Edu, and ZDR customers are excluded. OpenAI has stated it implemented restricted access protocols for the preserved data, limiting access to a small legal and security team, and affirmed the data will not be used for training or turned over to external parties at this time. The New York Times filed the lawsuit in December 2023, alleging OpenAI illegally used millions of its articles for training models. The lawsuit seeks the destruction of all models trained on its works and potential billions in damages. OpenAI argued that the order presents significant engineering challenges and conflicts with international data protection regulations like GDPR. Judge Stein rejected these arguments, emphasizing that OpenAI's terms of service allow data preservation for legal requirements. A recent modification to the order on September 26, 2025, provided limited relief, ending the requirement for OpenAI to preserve all new chat logs moving forward. However, the company must retain the data already saved and any information from ChatGPT accounts flagged by The New York Times. Security practitioners warn the case shatters assumptions about data deletion in AI interactions. 
OpenAI CEO Sam Altman suggested the situation accelerates the need for an "AI privilege" concept, similar to attorney-client privilege. The litigation also raises concerns for enterprise users regarding compliance with regulations like HIPAA and GDPR. In a statement regarding security, OpenAI CISO Dane Stuckey said, "Only serious misuse and critical risks -- such as threats to someone's life, plans to harm others, or cybersecurity threats -- may ever be escalated to a small, highly vetted team of human reviewers."
[4]
OpenAI fights order to turn over millions of ChatGPT conversations
OpenAI asked a New York judge to reverse an order to hand over 20 million anonymised ChatGPT logs in a copyright suit by the New York Times, arguing it would expose private user data and is irrelevant to the case. The Times says the logs are needed to assess potential misuse of its content.
OpenAI asked a federal judge in New York on Wednesday to reverse an order that required it to turn over 20 million anonymised ChatGPT chat logs amid a copyright infringement lawsuit by the New York Times and other news outlets, saying it would expose users' private conversations. The artificial intelligence company argued that turning over the logs would disclose confidential user information and that "99.99%" of the transcripts have nothing to do with the copyright infringement allegations in the case. "To be clear: anyone in the world who has used ChatGPT in the past three years must now face the possibility that their personal conversations will be handed over to The Times to sift through at will in a speculative fishing expedition," the company said in a court filing.
Lawsuit claims OpenAI misused Times' stories
The news outlets argued that the logs were necessary to determine whether ChatGPT reproduced their copyrighted content and to rebut OpenAI's assertion that they "hacked" the chatbot's responses to manufacture evidence. The lawsuit claims OpenAI misused their articles to train ChatGPT to respond to user prompts. Magistrate Judge Ona Wang said in her order to produce the chats that users' privacy would be protected by the company's "exhaustive de-identification" and other safeguards. OpenAI has a Friday deadline to produce the transcripts. Spokespeople and attorneys for the New York Times did not immediately respond to requests for comment.
OpenAI chief information security officer Dane Stuckey said in a blog post on Wednesday that sharing the logs would violate privacy and security protections and "force us to turn over tens of millions of highly personal conversations from people who have no connection to the Times' baseless lawsuit." The case is one of many pending lawsuits against tech companies over the alleged misuse of copyrighted work to train AI systems.
OpenAI is fighting a federal court order requiring it to turn over 20 million anonymized ChatGPT conversations to The New York Times in a copyright infringement lawsuit, arguing the disclosure would violate user privacy despite de-identification measures.
OpenAI is mounting a vigorous legal challenge against a federal court order requiring the company to turn over 20 million anonymized ChatGPT conversations to The New York Times and other news outlets in an ongoing copyright infringement lawsuit [2]. The AI company filed a motion on Wednesday asking the US District Court for the Southern District of New York to reverse the November 7 ruling by Magistrate Judge Ona Wang, arguing that the disclosure would expose private user information despite de-identification measures [4].
The disputed chat logs represent a random sampling of ChatGPT conversations spanning from December 2022 to November 2024, covering ChatGPT Free, Plus, Pro, and Team subscribers, as well as API customers without Zero Data Retention agreements [3]. OpenAI emphasizes that these are complete conversations rather than individual prompt-output pairs, making them "much more likely to expose private information, in the same way that eavesdropping on an entire conversation reveals more private information than a 5-second conversation fragment" [2].
The New York Times filed the lawsuit in December 2023, alleging that OpenAI illegally used millions of its articles to train ChatGPT and other AI models [3]. The news outlets argue that the chat logs are necessary to determine whether ChatGPT reproduced their copyrighted content and to counter OpenAI's assertion that they "hacked" the chatbot's responses to manufacture evidence [4]. The lawsuit seeks the destruction of all models trained on NYT works and potentially billions in damages.
OpenAI contends that "more than 99.99%" of the requested chats have no relevance to the copyright case and warns of setting a dangerous precedent for AI privacy [2]. The company argues that courts don't allow plaintiffs suing Google to "dig through the private emails of tens of millions of Gmail users irrespective of their relevance," and the same principle should apply to AI tools [2].
The preservation order, originally issued on May 13 and affirmed by District Judge Sidney Stein on June 26, conflicts with OpenAI's standard 30-day deletion policy for unsaved chats [3]. OpenAI has implemented restricted access protocols, limiting access to a small legal and security team, and has committed that the data will not be used for training purposes. The company is also developing advanced security features, including client-side encryption for ChatGPT messages [2].
Summarized by Navi