5 Sources
[1]
Your ChatGPT chats could be less private than you thought - here's what a new OpenAI court ruling means for you
A new ruling from U.S. Magistrate Judge Ona Wang, made public on December 3, requires OpenAI to hand over 20 million ChatGPT logs so investigators can check for copyright breaches against The New York Times as part of its ongoing lawsuit. According to Wang, releasing the logs won't risk violating users' privacy because "there are multiple layers of protection in this case precisely because of the highly sensitive and private nature of much of the discovery." Despite those assurances, OpenAI says the move still puts user privacy at risk. OpenAI has not yet issued a response to the latest ruling, but in its October 22 statement, Brad Lightcap, COO of OpenAI, said, "We strongly believe this is an overreach by The New York Times. We're continuing to appeal this order so we can keep putting your trust and privacy first."

OpenAI is dealing with several legal disputes, but its fight with The New York Times so far has the biggest implications for your chat privacy. The lawsuit, filed in 2023, accuses OpenAI of using Times content without permission to train its AI systems. The Times has requested user logs to rebut OpenAI's claim that the publication "hacked" ChatGPT's responses to manufacture evidence. "We will continue to fight these overreaches by The New York Times and defend long-standing privacy norms," OpenAI said in its October 22 statement. Lightcap added: "The New York Times and other plaintiffs have made a sweeping and unnecessary demand in their baseless lawsuit against us: retain consumer ChatGPT and API customer data indefinitely. This fundamentally conflicts with the privacy commitments we have made to our users. It abandons long-standing privacy norms and weakens privacy protections."

If you're wondering whether your own conversations are now at risk, you shouldn't need to worry going forward. OpenAI is under no obligation to preserve new consumer ChatGPT or API data indefinitely.
Deleted ChatGPT conversations and Temporary Chats are currently automatically removed from OpenAI systems within 30 days. However, the historical data covered by this ruling (described as a "statistically valid monthly sample of OpenAI's ChatGPT output logs from December 2022 through November 2024") may include some of your old chats. OpenAI says it has already stripped all identifying information, and Judge Wang has given the company seven days to complete that process before handing over the logs.

On the one hand, the court insists that privacy is protected. On the other hand, OpenAI says the move undermines user trust. What's clear is that once these logs are in the hands of legal teams, experts will be dissecting them to see what can be inferred from real conversations. For the millions of ChatGPT users, this case is a preview of the privacy battles to come regarding AI. Even if your name isn't attached to your chats, much can be gleaned from the way we talk to chatbots, or more importantly, the way they talk back. This is the first time that OpenAI has been forced to hand over your chat data. Once the courts get involved in copyright claims, the previous assurances around privacy start to become less certain. It's definitely something to think about the next time you choose to disclose something personal to an AI chatbot.
[2]
OpenAI Ordered to Hand Over 20M ChatGPT Logs in NYT Copyright Case - Decrypt
The case joins a growing wave of copyright challenges aimed at how AI labs source and use training data. A federal magistrate judge has ordered OpenAI to turn over roughly 20 million de-identified ChatGPT logs to The New York Times and other plaintiffs, deepening the AI development company's exposure to an array of copyright and data governance disputes. Issued on Wednesday in New York, the order denies OpenAI's bid to block the production of user-chat records and directs the company to hand over the logs under a protective framework. The outcome could shape how tech firms such as OpenAI, Anthropic, and Perplexity source training data, license content, and build guardrails around what their systems can output. While the court "recognizes that the privacy considerations of OpenAI's users are sincere," such considerations "are only one factor in the proportionality analysis, and cannot predominate where there is clear relevance and minimal burden," U.S. Magistrate Judge Ona T. Wang wrote. Decrypt has reached out to both parties for comment.

The order stems from the Times' ongoing lawsuit, which alleges that OpenAI's models were trained on copyrighted news content without permission. The suit was first brought in December 2023. In January last year, OpenAI challenged the NYT's claims and filed a countersuit, claiming that the publication was not "telling the full story." The court later found that the 20 million chat log samples in question are "proportional to the needs of the case" to assess whether ChatGPT outputs copied the NYT's material. Over the past year, the dispute has intensified, with plaintiffs pressing for broad access to output data, and OpenAI warning that expansive production of these materials would raise privacy and operational burdens. In June, OpenAI faced another setback when the court ordered the company to keep a wide range of ChatGPT user data for the lawsuit, including chats users may have already deleted.
Months later, in October, the dispute resurfaced, with the court flagging OpenAI's October 20 filing (ECF 679) that challenged the production of the 20 million log sample, and ordered both sides to submit clarifications on why they disagree. At the time, the judge pressed the parties to explain how the fight related to earlier concerns over deleted logs and whether OpenAI had backed away from prior agreements on what it previously claimed it would turn over. Late last month, OpenAI filed a formal objection asking the district judge to overturn the magistrate judge's discovery order. The company argued that the ruling was "clearly erroneous" and "disproportionate," in that it would force the company to disclose millions of private user conversations, according to a court document shared with Decrypt by an OpenAI representative. The dispute arises as part of a broader offensive against AI labs, with authors, news organizations, music publishers, and code repositories seeking to test how far existing copyright law extends when models ingest and reproduce protected material.
[3]
OpenAI loses fight to keep ChatGPT logs secret in copyright case - The Economic Times
OpenAI must produce millions of anonymized chat logs from ChatGPT users in its high-stakes copyright dispute with The New York Times and other news outlets, a federal judge in Manhattan ruled. U.S. Magistrate Judge Ona Wang, in a decision made public on Wednesday, said that the 20 million logs were relevant to the outlets' claims and that handing them over would not risk violating users' privacy. The judge rejected OpenAI's privacy-related objections to an earlier order requiring the artificial intelligence startup to submit the records as evidence. "There are multiple layers of protection in this case precisely because of the highly sensitive and private nature of much of the discovery," Wang said. An OpenAI spokesperson on Wednesday cited an earlier blog post from the company's Chief Information Security Officer Dane Stuckey, which said the Times' demand for the chat logs "disregards long-standing privacy protections" and "breaks with common-sense security practices." OpenAI has separately appealed Wang's order to the case's presiding judge, U.S. District Judge Sidney Stein. Spokespeople for The New York Times did not immediately respond to a request for comment. A group of newspapers owned by Alden Global Capital's MediaNews Group is also involved in the lawsuit. MediaNews Group executive editor Frank Pine said in a statement on Wednesday that OpenAI's leadership was "hallucinating when they thought they could get away with withholding evidence about how their business model relies on stealing from hardworking journalists."
The case, originally brought by the Times in 2023, is one of many brought by copyright owners against tech companies including OpenAI, Microsoft and Meta Platforms for using their material without permission to train their AI systems. The news outlets argued in their case against OpenAI that the logs were necessary to determine whether ChatGPT reproduced their copyrighted content, and to rebut OpenAI's assertion that they "hacked" the chatbot's responses to manufacture evidence. OpenAI countered that turning over the logs would disclose confidential user information and that "99.99%" of the transcripts have nothing to do with the infringement allegations. Wang had said in her initial order to produce the chats that OpenAI users' privacy would be protected by the company's "exhaustive de-identification" and other safeguards. Wang reiterated on Wednesday that the company's measures would "reasonably mitigate associated privacy concerns." Wang ordered OpenAI to produce the logs within seven days of removing users' identifying information.
[4]
OpenAI Loses Battle to Keep ChatGPT Logs a Secret: Implications for the Future of AI
In what could be a pivotal point that potentially reshapes the boundaries around AI and intellectual property, a US federal judge has ordered OpenAI to disclose millions of anonymised chat logs from ChatGPT. While privacy advocates are concerned about what the data could reveal, the publishing industry is cock-a-hoop. Not surprisingly, given that Judge Ona Wang in the Manhattan district court gave this ruling in a lawsuit brought by The New York Times and others over copyright infringement. In some ways, this was a battle between content creators and AI freeloaders over how developers were training their models. The judge rejected OpenAI's privacy-related objections to an earlier order requiring OpenAI to submit the records as evidence, which the ChatGPT maker had criticised as "breaking with common-sense security practices." "There are multiple layers of protection in this case precisely because of the highly sensitive and private nature of much of the discovery," Wang ruled while rejecting the plea. The lawsuit was initiated in 2023 against OpenAI and Microsoft, accusing them of illegally procuring copyrighted material to train AI models, with the NYT and others claiming that ChatGPT reproduced and distorted their articles without permission. This, they said, was tantamount to stealing. They had sought chat logs to prove this claim. Experts view this judgment as opening a window into the "opaque world of AI" and expect the 20 million chat logs to provide information on how ChatGPT regurgitates existing content. The court asked OpenAI to provide the logs in a de-identified format that strips out personal identifiers while retaining the substance of the chats. While fair use of content for training AI models is at the heart of the lawsuit, the ruling could also assist lawyers suing OpenAI over teen suicides. The company is currently facing as many as seven lawsuits on this front.
The Sam Altman-led company had raised a furore by claiming that the victim had overridden "safety features" on ChatGPT. The latest ruling from the Manhattan court builds on similar decisions in which the law has compelled technology companies to open up their black boxes. A recent instance of a German court holding OpenAI liable for reproducing song lyrics without permission is a case in point. Are there any user privacy implications? Since lawsuits across the world are used to set precedents for future rulings, the question now revolves around privacy in the AI age. Experts are concerned that de-identified data could still be reverse-engineered to reveal sensitive information. "Anything you say to AI may be used against you in court," says a user on X (formerly Twitter). OpenAI and others in the business have assured users that chat logs aren't stored indefinitely, but this ruling raises questions. OpenAI's response to the latest ruling hinted at potential appeals, claiming that broad disclosure could stifle innovation. However, to date, courts have prioritised corporate accountability over secrecy. For daily users like this author, the ruling provides better awareness of how these AI systems work. It has been noted in the past that while OpenAI logs data to enhance its capabilities, its use of deleted and sensitive chats could stymie adoption. Of course, accessing personal information isn't new, as Google has done it for years. What could be the repercussions for the industry? From the perspective of the media industry, the case could result in publishing houses getting their hands into the deep pockets of these AI startups, given that as many as 60 copyright suits have been filed in the US alone. Competitors such as Meta and Microsoft face similar scrutiny, as does Google over unauthorised data scraping. These cases are seen as a global pushback against AI's unbridled growth, which has already caused revenue losses to publishers.
In fact, People Inc. has produced data showing that since its launch, Google's summaries feature has cost publishers a big chunk of their revenues from search. Without doubt, the immediate impact of this ruling could be more publishing companies targeting AI companies for licensing deals that could shave billions off the kitty of these massively funded startups. Paid data partnerships could become the name of the game, as seen from OpenAI's recent moves with publishers. How would it impact the regulatory environment? It could accelerate calls for transparency across countries that are grappling with the right way to draw guardrails around AI so that innovation continues without aggravating data privacy and cybersecurity fears. The US itself has no comprehensive AI law, but the state of California has passed a law requiring transparency from AI businesses. Legal experts believe AI startups will need to justify their data practices, given that Judge Wang's ruling dismissed OpenAI's burden arguments and prioritised evidentiary needs. Of course, this may not curtail ethical debates and actions by whistleblowers who claim that AI profits result from uncompensated labour. Could this bring a shift in innovations and future use-cases? Tough to tell for now, but suffice to say that AI giants might seek ways to advance the level of anonymisation in order to strengthen privacy and data security. Which way this exercise could go is still a mystery, though recent experiments by French AI startup Mistral do suggest that training smaller AI models could be an option. Even for a company as large as OpenAI, such a move may curtail infringement risks. Given the recent spate of lawsuits around teen suicides, the new-age tech giants may feel more committed to analysing chat logs to ascertain the behaviour patterns of their models and provide fixes at launch.
OpenAI had reportedly fixed some of the issues around open-ended chats with teens in its latest model, GPT-5. Would such litigation actually kill innovation? Opinion is sharply divided on this question, with proponents claiming that this could well sound the death knell for AI chatbots and others countering that opening up a black box would merely reduce the mystique. They point to Europe's stricter rules under the AI Act and how they haven't yet stymied innovation in any way. Experts have earlier noted that sharing data collected from multiple sources does not compel any AI company to share the secret sauce (algorithms) that it uses to deliver customised results. When it comes to agentic AI use cases, the argument does not hold water, as these data sets come from within an enterprise and aren't publicly available. The lawyers taking a swipe at OpenAI and others for causing irreversible psychological harm to users believe that the order could actually help them dig into the issues that prompt these chatbots to hallucinate or respond in a human way. Recently, some users sought help from the Federal Trade Commission after noting that their complaints about mental-wellness issues were going unaddressed by OpenAI and others.
[5]
OpenAI Fails to Block 20M ChatGPT Logs Release Order in the US
A federal magistrate judge in the Southern District of New York has rejected OpenAI's request to reconsider an earlier order requiring it to hand over 20 million de-identified ChatGPT output logs to news publishers suing the company for copyright infringement. In a detailed opinion issued by Magistrate Judge Ona T. Wang in the sprawling In re OpenAI Copyright Litigation, the court ruled that OpenAI failed to show any change in law, new evidence, or clear error that would justify revisiting the November 7 order. As a result, OpenAI must produce the log dataset, which the plaintiffs say is necessary to test the company's models for potential misuse of their copyrighted material. News organisations have been seeking access to ChatGPT output logs since May 2024 to determine whether the conversational AI system reproduced their copyrighted works and to test OpenAI's fair-use and substantial non-infringing use defences. OpenAI retains "tens of billions" of consumer ChatGPT logs. After the plaintiffs learnt earlier this year that the company had been deleting certain API, enterprise, and user-deleted consumer logs, Judge Wang ordered OpenAI to preserve all logs going forward. Over the past year, both sides negotiated over what size of a "statistically valid" monthly sample OpenAI should produce for the merits phase. While publishers sought roughly 120 million logs, OpenAI repeatedly argued that a smaller 20-million-log sample was sufficient, citing the time and cost needed to de-identify the data. Plaintiffs eventually agreed to accept the 20-million-log sample. But in October 2025, OpenAI told the plaintiffs it would not produce the entire sample and instead proposed using additional keyword filters to narrow the dataset. The publishers moved to compel, and Judge Wang ordered production of all 20 million ChatGPT logs on November 7. OpenAI then sought reconsideration, claiming the sample was not proportional under Rule 26 and would expose user privacy. 
The lawsuit, originally filed by The New York Times in 2023, is among the most closely watched copyright fights involving AI. It is part of a broader wave of cases targeting OpenAI, Microsoft and Meta for allegedly using copyrighted journalism to train AI systems without permission. Newspapers owned by Alden Global Capital's MediaNews Group are also part of the suit, arguing that OpenAI's business model unfairly exploits decades of reporting. Judge Wang denied the motion. She noted that OpenAI itself had argued earlier in the case that the 20-million sample was manageable and could be safely de-identified using its "significantly more effective" internal tools. The company "failed to explain," she wrote, why those same representations no longer hold. OpenAI has separately appealed Judge Wang's order to U.S. District Judge Sidney Stein, insisting the disclosure could undermine user trust and violate long-standing privacy norms. The company has publicly defended its position, with Chief Information Security Officer Dane Stuckey arguing that the publishers' request "breaks with common-sense security practices." MediaNews Group executive editor Frank Pine was far more blunt, saying OpenAI leadership was "hallucinating when they thought they could get away with withholding evidence about how their business model relies on stealing from hardworking journalists." OpenAI leaned heavily on a Northern District of California ruling in Concord Music Group v. Anthropic to argue against full production. But Judge Wang said OpenAI had invoked that case for nearly a year when arguing for a smaller sample and only now, when it no longer favoured its position, claimed that Concord was irrelevant. The court also dismissed OpenAI's reliance on the Nichols v. Noom case, noting that (1) the logs in this case serve multiple purposes and (2) the privacy protections here exceed those applied in Noom.
The plaintiffs further argue that the logs are essential to rebut OpenAI's claim that they "hacked" ChatGPT to manufacture infringing outputs. OpenAI, meanwhile, insists that "99.99%" of the logs are irrelevant and consist of private, non-news-related conversations. Under the new order, OpenAI must turn over the full 20 million de-identified ChatGPT logs within seven days. Judge Wang reiterated that the company's "exhaustive de-identification" tools and strict protective orders "reasonably mitigate associated privacy concerns."
A federal judge ruled that OpenAI must turn over 20 million de-identified ChatGPT logs to The New York Times and other publishers in a copyright infringement lawsuit. The decision, issued by U.S. Magistrate Judge Ona T. Wang, rejected OpenAI's privacy objections and could reshape how AI companies handle training data and user information. OpenAI has appealed the ruling, warning it undermines user trust.
U.S. Magistrate Judge Ona T. Wang delivered a decisive blow to OpenAI on December 3, ordering the company to produce 20 million de-identified ChatGPT user logs as evidence in The New York Times lawsuit [1]. The federal judge rejected OpenAI's motion to reconsider an earlier discovery order, finding that the company failed to demonstrate any change in law, new evidence, or clear error that would justify blocking the production [5]. This copyright lawsuit, filed in December 2023, accuses OpenAI of using Times content without permission to train its AI systems [1]. The anonymized chat logs represent a statistically valid monthly sample spanning December 2022 through November 2024 [1].
Source: Decrypt
News publishers have been seeking access to ChatGPT output logs since May 2024 to determine whether the conversational AI reproduced their copyrighted works [5]. The New York Times and other plaintiffs, including newspapers owned by Alden Global Capital's MediaNews Group, argued that the logs were necessary to rebut OpenAI's assertion that they "hacked" ChatGPT's responses to manufacture evidence [1]. OpenAI countered that turning over the logs would disclose confidential user information and that "99.99%" of the transcripts have nothing to do with the infringement allegations [3]. MediaNews Group executive editor Frank Pine stated that OpenAI's leadership was "hallucinating when they thought they could get away with withholding evidence about how their business model relies on stealing from hardworking journalists" [3].

Judge Wang acknowledged that the "privacy considerations of OpenAI's users are sincere" but ruled that such considerations "cannot predominate where there is clear relevance and minimal burden" [2]. The court emphasized that "there are multiple layers of protection in this case precisely because of the highly sensitive and private nature of much of the discovery" [3]. OpenAI must complete its "exhaustive de-identification" process within seven days before handing over the logs [3]. Despite these safeguards, OpenAI COO Brad Lightcap warned in an October 22 statement that the demand "fundamentally conflicts with the privacy commitments we have made to our users" and "abandons long-standing privacy norms" [1]. The company has separately appealed the ruling to U.S. District Judge Sidney Stein, arguing the disclosure could undermine user trust [5].
Source: CXOToday
This case joins a growing wave of copyright challenges aimed at how AI labs source and use training data [2]. The outcome could shape how tech firms such as OpenAI, Anthropic, and Perplexity license content and build guardrails around what their AI systems can output [2]. As many as 60 copyright suits have been filed in the US alone, with competitors including Meta and Microsoft facing similar scrutiny over unauthorized data scraping [4]. Legal experts believe AI startups will need to justify their data practices, given the court order that dismissed OpenAI's burden arguments and prioritized evidentiary needs [4]. The ruling could accelerate paid data partnerships and licensing agreements, potentially costing AI companies billions as publishers seek compensation for their intellectual property [4].
Source: ET
For current ChatGPT users, OpenAI is under no obligation to preserve new consumer ChatGPT or API data indefinitely [1]. Deleted ChatGPT conversations and Temporary Chats are automatically removed from OpenAI systems within 30 days [1]. However, the historical data covered by this ruling may include some old chats, though OpenAI says it has already stripped all identifying information [1]. Privacy advocates warn that de-identified data could still be reverse-engineered to reveal sensitive information [4]. This marks the first time that OpenAI has been forced to disclose chat data in litigation, signaling that once courts get involved in copyright claims, previous assurances around data privacy become less certain [1]. The case provides a preview of the transparency and AI accountability battles to come, as regulators worldwide grapple with drawing guardrails around innovation while protecting user trust and intellectual property [4].