Curated by THEOUTPOST
On Fri, 8 Nov, 8:06 AM UTC
10 Sources
[1]
OpenAI Has, For Now, Escaped Copyright Lawsuit Filed Against It Regarding Its Use Of News Articles Without Consent To Train ChatGPT
OpenAI has been in the news since it launched ChatGPT and has been actively evolving its technology, developing new models, and working aggressively to bring AGI forward. While the company's progress is widely praised, it has faced legal pressure over allegedly misusing articles from news outlets to train its large language models. However, the artificial intelligence giant has dodged one such lawsuit for now, as a federal judge in New York has dismissed the case. OpenAI ran into legal trouble when news outlets Raw Story and AlterNet accused the company of misusing their articles to train its large language models. The suit claimed that OpenAI used the content without consent and in violation of copyright laws. However, as per a Reuters report, things seem to favor the company for now, as a New York federal judge dismissed the lawsuit on Thursday due to a lack of evidence of harm to support the claim. U.S. District Judge Colleen McMahon dismissed the lawsuit against OpenAI and suggested that the company did not cause noticeable harm that could back up the claim. While the company dodged the case for now, Judge McMahon did leave open the option for the plaintiffs to file a new complaint, but she also said she was skeptical that they could allege a "cognizable injury." Both media outlets, however, intend to continue the case and address the court's concerns in an amended complaint. An OpenAI spokesperson presented the company's side by claiming that it engaged in fair practices, which involved using publicly available data and relying on widely accepted legal precedents that support AI model training practices. This is not the first time a lawsuit has been filed against the company, as the New York Times was the first media outlet to have legally pursued OpenAI, back in December 2023.
We are seeing an increasing number of lawsuits being filed against not just OpenAI but also other big tech companies for using the content of artists, YouTubers, and publishers without their consent to train their AI models. As AI progresses, we see massive improvements in how we interact and gain information, but the technology's evolution has also invited legal complications.
[2]
OpenAI dodges copyright lawsuit over use of news articles in AI training
OpenAI, the artificial intelligence company behind ChatGPT, avoided a copyright lawsuit claiming it misused news articles in training its large language model (LLM). On Nov. 7 a federal judge in New York dismissed the lawsuit brought by news outlets Raw Story and AlterNet, according to a Reuters report.
OpenAI in the clear -- for now
US District Judge Colleen McMahon ruled that the outlets did not present enough evidence of harm to support their claims but allowed for the possibility of refiling with additional evidence. McMahon noted that what is "really at stake" is that "the alleged injury for which Plaintiffs truly seek redress is not the exclusion of CMI" (copyright management information) but the "use of Plaintiffs' articles to develop ChatGPT without compensation." As the case stands, McMahon said the type of harm claimed did not justify the lawsuit. However, OpenAI may not be entirely off the hook. Lawyers for the news outlets said they were "certain" they could address the court's concerns with an amended complaint. In February, the news outlets filed a lawsuit claiming thousands of articles were unlawfully used by OpenAI to train its chatbot, which reproduces copyrighted material when correctly prompted. In December 2023, The New York Times filed a lawsuit against OpenAI, with allegations of unauthorized use of "millions" of NYT articles for chatbot training. Time and The Associated Press have also challenged the AI developer's use of copyrighted content via lawsuits.
AI and the news
However, since then the company has made major strides in partnering with news organizations to legally acquire rights to content in order to provide accurate, real-time news information as answers to queries. It has partnered with major news providers globally, including the Financial Times, French publisher Le Monde, Spanish Prisa Media and German media giant Axel Springer. As a result, on Nov. 1, the company announced its latest feature, ChatGPT Search, which allows users to search and receive answers from the web based on their questions. OpenAI isn't the only AI company seeking out partnerships with news organizations. On Oct. 25 Meta, the parent company of Facebook and Instagram, announced its first news deal with Reuters. The partnership will allow the company's AI chatbot to also link directly to Reuters news articles when responding to related queries.
[3]
Judge tosses publishers' copyright suit against OpenAI
A US judge has thrown out a case against ChatGPT developer OpenAI which alleged it unlawfully removed copyright management information (CMI) when building training sets for its chatbots. Publishers Raw Story and AlterNet allege that when OpenAI removed the description of the copyright status, it resulted in a "concrete injury." The plaintiffs also argued there was a substantial risk that OpenAI's systems could "provide responses to users that incorporate ... material from Plaintiffs' copyright-protected work or regurgitate copyright-protected works verbatim or nearly verbatim." In a statement to Reuters, an OpenAI spokesperson said: "We build our AI models using publicly available data, in a manner protected by fair use and related principles, and supported by longstanding and widely accepted legal precedents." In February, Raw Story and AlterNet alleged OpenAI populated their training sets with works of journalism, choosing to strip away CMI protected by the Digital Millennium Copyright Act. However, US District Judge Colleen McMahon granted OpenAI's motion to dismiss the case. In her ruling, she said Raw Story and AlterNet had not alleged that the information in their articles was copyrighted, nor could they do so. "When a user inputs a question into ChatGPT, ChatGPT synthesizes the relevant information in its repository into an answer. Given the quantity of information contained in the repository, the likelihood that ChatGPT would output plagiarized content from one of Plaintiffs' articles seems remote," she said. However, the legal ruling has a bearing on whether OpenAI was allowed to develop its products using journalists' articles. "Let us be clear about what is really at stake here. The alleged injury for which plaintiffs truly seek redress is not the exclusion of CMI from [OpenAI's] training sets, but rather [the] use of plaintiffs' articles to develop ChatGPT without compensation to plaintiffs," she said. 
McMahon said that questions about these kinds of harms had not been put before the court. The judge said she would allow an amended complaint from the publishers. The Raw Story and AlterNet case against OpenAI is one among many challenging AI developers' use of copyrighted material in training sets. OpenAI also faces a suit from authors Paul Tremblay, Sarah Silverman, Michael Chabon, David Henry Hwang, and Ta-Nehisi Coates. Another group of authors is suing Anthropic, alleging it unlawfully used their copyrighted work to train its Claude AI model. Last year, Dan Conway, CEO of the UK's Publishers Association, told the House of Lords Communications and Digital Committee that large language models were infringing copyrighted content on an "absolutely massive scale," arguing that the Books3 database - which lists 120,000 pirated book titles - had been ingested by large language models. However, AI developers have argued that maintaining broad access to information on the internet is important for innovation.
[4]
Federal court sides with OpenAI in AI copyright lawsuit - SiliconANGLE
A federal court has dismissed a copyright lawsuit that two news outlets brought against OpenAI earlier this year. U.S. District Judge Colleen McMahon issued the ruling on Thursday. The plaintiffs, Raw Story Media Inc. and AlterNet Media Inc., have the option to relaunch the lawsuit in the future with certain revisions. The two publications filed their complaint with the U.S. District Court for the Southern District of New York in February. They charged that OpenAI had incorporated thousands of their articles into three artificial intelligence datasets. According to the lawsuit, the company used those datasets to train some of the large language models that underpin ChatGPT. The complaint didn't accuse OpenAI of copyright infringement. Rather, it took issue with the fact that the AI provider removed so-called copyright management information, or CMI, from the publications' articles before incorporating them into its training datasets. CMI is a legal term that covers details such as an article's author and title. In their lawsuit, the plaintiffs charged that OpenAI's removal of CMI breached a 1998 piece of legislation called the Digital Millennium Copyright Act. Raw Story and AlterNet argued that this practice made OpenAI liable to pay damages. The second part of the lawsuit requested injunctive relief. That's a court order designed to stop a certain practice, in this case the use of articles for AI training without the associated CMI. The lawsuit stated that an injunction is needed because there's a risk ChatGPT will display copyrighted content from Raw Story and AlterNet without CMI details such as article titles. In its legal response, OpenAI asked the court to dismiss the case. The ChatGPT developer's argument centered on the fact that lawsuits of this type can only proceed if the plaintiffs demonstrate they suffered "concrete harm." 
Raw Story and AlterNet made the case that OpenAI's "unlawful removal of CMI from a copyrighted work" qualifies as such a harm. The court sided with OpenAI. "Let us be clear about what is really at stake here," McMahon wrote in the ruling. "The alleged injury for which Plaintiffs truly seek redress is not the exclusion of CMI from Defendants' training sets, but rather Defendants' use of Plaintiffs' articles to develop ChatGPT without compensation to Plaintiffs." The ruling allows Raw Story and AlterNet to refile the lawsuit under certain conditions. In particular, the publications must provide "an explanation of why the proposed amendment would not be futile." OpenAI said in a statement that "we build our AI models using publicly available data, in a manner protected by fair use and related principles, and supported by long-standing and widely accepted legal precedents."
[5]
OpenAI Gets a Win as Court Says No Harm Was Demonstrated in Copyright Case
A judge found that two media outlets alleging copyright violations hadn't demonstrated that they'd been harmed by OpenAI removing "copyright management information" from its training data. OpenAI won an initial victory on Thursday in one of the many lawsuits the company is facing for its unlicensed use of copyrighted material to train generative AI products like ChatGPT. A federal judge in the southern district of New York dismissed a complaint brought by the media outlets Raw Story and AlterNet, which claimed that OpenAI violated copyright law by purposefully removing what is known as copyright management information, such as article titles and author names, from material that it incorporated into its training datasets. OpenAI had filed a motion to dismiss the case, arguing that the plaintiffs did not have standing to sue because they had not demonstrated a concrete harm to their businesses caused by the removal of the copyright management information. Judge Colleen McMahon agreed, dismissing the lawsuit but leaving the door open for the plaintiffs to file an amended complaint. OpenAI and other generative AI companies are fighting dozens of copyright lawsuits brought by news outlets, book publishers, artists, and record companies. The Raw Story and AlterNet case differed from many of the other lawsuits because it centered on a narrow provision in the Digital Millennium Copyright Act (DMCA) that prohibits the removal of copyright management information from a work in order to enable or conceal copyright infringement. The outlets argued that removal of the information by itself constituted a concrete injury and created a substantial risk that OpenAI's large language models would regurgitate their copyrighted works verbatim. McMahon didn't find that argument convincing, writing that the plaintiffs hadn't "alleged any actual adverse effects stemming from this alleged DMCA violation." 
"When a user inputs a question into ChatGPT, ChatGPT synthesizes the relevant information in its repository into an answer," she wrote. "Given the quantity of information contained in the repository, the likelihood that ChatGPT would output plagiarized content from one of Plaintiffs' articles seems remote." In other cases, particularly a lawsuit filed by the New York Times against OpenAI and Microsoft, the plaintiffs have alleged that the companies' products did in fact reproduce large sections of copyrighted work. The Times' complaint includes multiple examples of ChatGPT and Microsoft's Bing Chat responding to user prompts with multiple paragraphs of content copied verbatim from the newspaper's articles. McMahon's decision doesn't directly address the Times' allegations, but her ruling suggests that plaintiffs hoping to win an AI copyright case in her courtroom will have to demonstrate not only that a generative model has reproduced some work in the past or may do so in the future, but that its current version is actively reproducing the work. "While Plaintiffs provide third-party statistics indicating that an earlier version of ChatGPT generated responses containing significant amounts of plagiarized content, Plaintiffs have not plausibly alleged that there is a 'substantial risk' that the current version of ChatGPT will generate a response plagiarizing one of Plaintiffs' articles," she wrote.
[6]
OpenAI wins first round against Raw Story and AlterNet copyright case
The publications sued OpenAI for using their articles to train its AI technology. OpenAI is facing multiple lawsuits over its use of several publications' and books' content to train its large language models without explicit permission or proper compensation. A judge has just dismissed one of them. New York federal judge Colleen McMahon has dismissed the lawsuit filed by Raw Story and AlterNet, which accused the company of using their materials for AI training without consent. As VentureBeat notes, though, their complaint didn't argue that OpenAI infringed on their copyright like other publications' lawsuits do. Instead, it focused on the DMCA provision that protects "copyright management information." The publications argued that OpenAI removed the author names, titles and other metadata identifying their copyright from the articles it used to train its LLMs. McMahon explained that the plaintiffs failed to show that they suffered "a cognizable injury" from those actions and that the harm they had cited was "not the type of harm that has been elevated" to warrant a lawsuit. The judge also said that "the likelihood that ChatGPT would output plagiarized content from one of [their] articles seems remote." She added that the plaintiffs are truly seeking redress for the use of their articles "to develop ChatGPT without compensation" and not for the removal of their copyright management information. Raw Story and AlterNet don't intend to back down, based on what their lawyer told Reuters. Matt Topic, their attorney, said they're "certain [they] can address the concerns the court identified through an amended complaint."
[7]
OpenAI defeats news outlets' copyright lawsuit over AI training, for now
Nov 7 - A New York federal judge on Thursday dismissed a lawsuit against artificial intelligence giant OpenAI that claimed it misused articles from news outlets Raw Story and AlterNet to train its large language models. U.S. District Judge Colleen McMahon said that the outlets could not show enough harm to support the lawsuit but allowed them to file a new complaint, even though she said she was "skeptical" that they could "allege a cognizable injury." Raw Story's owners acquired AlterNet in 2018. Raw Story attorney Matt Topic of Loevy + Loevy said the outlets were "certain we can address the concerns the court identified through an amended complaint." Spokespeople and attorneys for OpenAI did not immediately respond to a request for comment on the decision. Raw Story and AlterNet filed the lawsuit in February. They said that thousands of their articles were used without permission to train OpenAI's popular chatbot ChatGPT and that it reproduces their copyrighted material when prompted. The case is part of a wave of lawsuits against OpenAI and other tech companies by authors, visual artists, music publishers and other copyright owners over the data used to train their generative AI systems. A lawsuit filed by The New York Times against OpenAI in December was the first from a media outlet. Unlike similar lawsuits, Raw Story and AlterNet's complaint accused OpenAI of unlawfully removing their articles' copyright management information (CMI) without arguing that it infringed their copyrights. McMahon agreed with OpenAI that the claims should be dismissed. "Let us be clear about what is really at stake here," McMahon said. "The alleged injury for which Plaintiffs truly seek redress is not the exclusion of CMI" but the "use of Plaintiffs' articles to develop ChatGPT without compensation." McMahon said that the harm cited by the outlets is "not the type of harm that has been elevated" to a level that would justify the lawsuit. 
"Whether there is another statute or legal theory that does elevate this type of harm remains to be seen," McMahon said. "But that question is not before the Court today." The case is Raw Story Media v. OpenAI Inc, U.S. District Court for the Southern District of New York, No. 1:24-cv-01514. For the news outlets: Matt Topic, Jon Loevy and Michael Kanovitz of Loevy + Loevy. For OpenAI: Joe Gratz, Vera Ranieri and Rose Lee of Morrison & Foerster; Joseph Wetzel, Andy Gass, Sy Damle and Alli Stillman of Latham & Watkins; Bob Van Nest, Jamie Slaughter, Paven Malhotra, Michelle Ybarra, Nick Goldberg, Tom Gorman and Katie Lynn Joyce of Keker Van Nest & Peters. Reporting by Blake Brittain in Washington.
[8]
News outlets lose copyright lawsuit against OpenAI
The judge ruled in favour of OpenAI stating that the outlets were unable to prove 'concrete injury'. In a win for the ChatGPT maker, the Southern District of New York has dismissed a lawsuit taken by two news media outlets against OpenAI for allegedly violating copyright law by scraping news articles to train its AI models. News outlets Raw Story Media and AlterNet Media filed the complaint earlier this year, accusing OpenAI of violating the Digital Millennium Copyright Act (DMCA) by scraping copyrighted journalistic work to train its AI models. The plaintiffs said that OpenAI "intentionally" removed copyright management information - data that includes the work's title, the author name and terms and conditions of use of the copyrighted work. Under DMCA provisions, it is prohibited to remove or alter copyright management information with knowledge that it would facilitate copyright infringement. OpenAI does not reveal the exact data it uses to train its models. However, the plaintiffs alleged that in an "extensive review" of publicly available information, they found "thousands" of their copyrighted works were included in OpenAI's data sets without details on the author, title and copyright information - details that the plaintiffs made available with their work. The two plaintiffs, in their lawsuit, sought $2,500 in damages per violation. However, yesterday (7 November) Judge Colleen McMahon dismissed the lawsuit and stated that the plaintiffs were unable to prove any "concrete injury". "When a user inputs a question into ChatGPT, ChatGPT synthesises the relevant information in its repository into an answer," the judge wrote. She said that the likelihood that ChatGPT, an AI model trained on large swaths of data, would output plagiarised content from one of the plaintiffs' articles is "remote". 
In a similar lawsuit filed last year, The New York Times launched a legal battle against OpenAI for alleged copyright infringement, claiming that ChatGPT is trained on millions of articles published by the outlet. Business Insider reported last month that the Times' lawyers were poring over ChatGPT's source code in a secure room to try to figure out how AI trains on creative work.
[9]
OpenAI Scored a Legal Win Over Progressive Publishers -- but the Fight's Not Finished
A judge tossed out a case against OpenAI brought by Alternet and Raw Story, in what could be a significant ruling in the larger battle between AI companies and publishers. OpenAI has notched a victory in its ongoing legal fight against publishers over how its AI tools use creative work. On November 7, a judge dismissed a copyright case against the startup brought by independent publishers Alternet and Raw Story. While several publishers have lined up content deals with OpenAI -- including WIRED parent company Condé Nast -- dozens of copyright lawsuits against AI startups are winding their way through the US court system. Many of these complaints allege direct copyright infringement, arguing that it's illegal for AI companies to train their tools on news articles, books, paintings, and other copyrighted materials without permission. Some also include other claims ranging from trademark law violation to violations of the Digital Millennium Copyright Act, a copyright law intended as an anti-piracy statute that is now broadly deployed by intellectual property rights holders. The complaint brought by Alternet and Raw Story focused on the DMCA, arguing that OpenAI broke the law by scraping thousands of news articles and stripping them of "copyright management information" (CMI) like the author's name, the terms and conditions for use of the work, and the title of the work. The outlets asked for statutory damages of no less than $2,500 per violation, arguing OpenAI knew that stripping training data of CMI would result in copyright infringement from ChatGPT when it summarized or "regurgitated" articles without proper attribution. OpenAI argued that the publishers had no legal standing to bring this claim, stating they failed to offer proof that ChatGPT was trained on their material, let alone that the training was harmful. Judge Colleen McMahon of the US Southern District of New York agreed with OpenAI's argument, dismissing the case for lack of standing. 
"We build our AI models using publicly available data, in a manner protected by fair use and related principles, and supported by long-standing and widely accepted legal precedents," says OpenAI spokesperson Jason Deutrom. Although this is a major setback for Alternet and Raw Story, it's not necessarily the end. "We do intend to continue the case," says Raw Story founder and CEO John Byrne. The next step is requesting permission from the judge to file an amended complaint. "We're confident that we can address the court's concerns in an amended complaint," says Matt Topic, a partner at Loevy & Loevy, the firm representing Raw Story Media. While Judge McMahon describes herself as "skeptical" that the outlets could "allege a cognizable injury" in the dismissal, her ruling does indicate that she's open to considering a new filing.
[10]
OpenAI's data scraping wins big as Raw Story's copyright lawsuit dismissed by NY court
The Southern District of New York has dismissed a copyright violation lawsuit brought by Raw Story Media, Inc. and AlterNet Media, Inc., alternative left-leaning online news outlets, against OpenAI, effectively shutting down claims that the generative AI firm violated copyrights by using scraped news content in its training data. This dismissal could be seen as an important moment in the ongoing battle over copyright and AI tools -- particularly under Section 1202(b) of the Digital Millennium Copyright Act (DMCA) -- but it is worth noting that other cases have also failed to establish successful claims under this provision. Let's dive into what happened, why the judge dismissed the case, and what this means for the future of AI, copyright and the legality of tech companies to scrape content off the web without the creators' express permission or compensation. Understanding the DMCA's Section 1202(b) The lawsuit revolved around Section 1202(b) of the DMCA, a provision that aims to protect "copyright management information" (CMI). This includes any author names, titles, and other metadata that identify copyrighted works. Section 1202(b) prohibits the removal or alteration of such information without authorization, especially if doing so facilitates copyright infringement. In this case, Raw Story and AlterNet alleged that OpenAI used articles from their websites for training ChatGPT and other models without preserving CMI, violating Section 1202(b). OpenAI is not the only AI company likely to have scraped such material from the web -- while AI model providers tend to closely guard their training datasets, the industry at large has undoubtedly scraped large swaths of the web to train its various models (a practice similar to what Google did to crawl and index search results in its main search engine product). 
In this way, some creators view data scraping akin to AI's "original sin." In this case, the plaintiffs Raw Story and Alternet claimed that OpenAI's AI outputs -- responses generated by the models -- were sometimes based on their articles and the company knowingly violated copyright after the CMI was removed. Why the court dismissed Raw Story's claims Judge Colleen McMahon granted OpenAI's motion to dismiss the case on grounds of lack of standing. Specifically, the judge found that the plaintiffs couldn't demonstrate that they suffered a concrete, actual injury from OpenAI's actions -- an essential requirement under Article III of the U.S. Constitution for any lawsuit to proceed. Judge McMahon also considered the evolving landscape of large language model (LLM) interfaces, noting that updates to these systems further complicate attribution and traceability. She emphasized that generative AI's iterative improvements make it less likely that content will be reproduced verbatim, making the plaintiffs' claims even more speculative. The judge noted that "the likelihood that ChatGPT would output plagiarized content from one of Plaintiffs' articles seems remote." This reflects a key difficulty in these types of cases: generative AI is designed to synthesize information rather than replicate it verbatim. The plaintiffs failed to present convincing evidence that their specific works were directly infringed in a way that led to identifiable harm. The ruling aligns with similar cases where courts have struggled to apply traditional copyright law to generative AI. For example, the Doe 1 v. GitHub case involving Microsoft's Copilot also dealt with claims under Section 1202(b). There, the court found that the code generated by Copilot wasn't an "identical copy" of the original, but rather snippets that were reconfigured, making it difficult to prove the violation of CMI requirements. 
A growing divide on Section 1202(b) The Raw Story decision highlights the broader uncertainties courts are facing regarding Section 1202(b), especially with generative AI. There is currently no firm consensus on how Section 1202(b) applies to a wide swath of online content. In one corner, some courts have imposed what's called an "identicality" requirement -- meaning plaintiffs must prove that the infringing works are an exact copy of the original content, minus CMI. Others, however, have allowed for more flexible interpretations. For instance, the court in the Southern District of Texas recently rejected the identicality requirement, stating that even partial reproductions could qualify as violations if CMI is deliberately removed. Meanwhile, in the lawsuit brought by Sarah Silverman and a collection of authors, the court held that the plaintiff failed to show sufficient evidence that OpenAI had actively removed CMI from her content. That ruling, much like Raw Story's, underscores the evidentiary burden plaintiffs face. As explained by Maria Crusey in a piece for the Authors Alliance, "The uptick in §1202(b) claims raises challenging questions, namely: How does §1202(b) apply to the use of a copyrighted work as part of a dataset that must be cleaned, restructured, and processed in ways that separate copyright management information from the content itself?" Why this ruling matters for AI and content creators The dismissal of Raw Story's lawsuit is more than a win for OpenAI -- it's an indicator of how courts may handle similar copyright claims in the rapidly evolving landscape of generative AI. With OpenAI and its investor Microsoft currently defending against a similar lawsuit filed by The New York Times, the ruling can only help establish some precedent to dismiss this and future claims. Indeed, the ruling suggests that without clear, demonstrable harm or exact reproduction, plaintiffs may be challenged to get their day in court. 
Judge McMahon's ruling also touches on a broader point about how AI synthesizes data versus directly replicating it. OpenAI's ChatGPT doesn't directly recall articles from Raw Story -- it instead uses training data to produce novel outputs that resemble human writing. This makes proving violations under current copyright laws inherently difficult. For content creators, this raises a significant challenge: how to ensure proper credit and prevent unauthorized use of their work in training datasets. Licensing agreements like the ones OpenAI has struck with large news publishers such as Vogue and Wired owner Condé Nast could become a new standard, giving companies a way to legally use copyrighted content while compensating its creators. Between a bot and a hard place Courts are still figuring out how to handle generative AI, and recent rulings suggest they're reluctant to extend Section 1202(b) protections unless plaintiffs show real, specific harm. AI-generated content synthesizes rather than replicates, making it tough to prove copyright violations. For plaintiffs, this means proving harm is an uphill battle. Courts are signaling that vague claims aren't enough -- plaintiffs need hard evidence of damage. For developers and tech companies, even if the odds seem favorable, no one wants a lawsuit. Transparency, data records, and compliance are essential to avoid legal trouble. Judge McMahon noted the case could be refiled ("together with an explanation of why the proposed amendment would not be futile," she wrote), but significant obstacles remain.
A federal judge has dismissed a copyright lawsuit against OpenAI, filed by news outlets Raw Story and AlterNet, citing lack of evidence of harm. The case centered on OpenAI's use of news articles for AI training without consent.
OpenAI, the company behind ChatGPT, has successfully avoided a copyright lawsuit filed against it by news outlets Raw Story and AlterNet. U.S. District Judge Colleen McMahon dismissed the case on Thursday, citing a lack of evidence to support the claim of harm caused by OpenAI's use of news articles to train its large language models [1][2].
The lawsuit, filed in February, alleged that OpenAI had used thousands of articles from Raw Story and AlterNet without consent to train ChatGPT, violating copyright laws [1]. The plaintiffs claimed that OpenAI had removed copyright management information (CMI) from their articles, which they argued constituted a "concrete injury" [3].
However, Judge McMahon ruled that the outlets did not present enough evidence of harm to support their claims. She noted that the real issue at stake was not the exclusion of CMI from OpenAI's training sets but rather "the use of Plaintiffs' articles to develop ChatGPT without compensation" [2].
An OpenAI spokesperson defended the company's practices, stating, "We build our AI models using publicly available data, in a manner protected by fair use and related principles, and supported by longstanding and widely accepted legal precedents" [2][4]. This argument aligns with the broader debate in the AI industry about the use of publicly available information for training AI models.
While the judge dismissed the current lawsuit, she left the door open for the plaintiffs to file an amended complaint addressing the court's concerns [4]. This case highlights the ongoing legal challenges faced by AI companies regarding the use of copyrighted material in training datasets.
This lawsuit is part of a larger trend of legal actions against AI companies. The New York Times filed a similar lawsuit against OpenAI in December 2023, alleging unauthorized use of "millions" of NYT articles for chatbot training [2]. Other entities, including authors and publishers, have also challenged AI developers' use of copyrighted content [3][5].
In response to these legal challenges, some AI companies have begun forming partnerships with news organizations to legally acquire rights to content. OpenAI, for instance, has partnered with major news providers globally, including the Financial Times and Le Monde, to provide accurate, real-time news information in its AI responses [2].
As AI technology continues to advance, the legal landscape surrounding copyright and AI training data remains complex and evolving. This case demonstrates the challenges in balancing innovation in AI with the protection of intellectual property rights. It also highlights the need for clearer legal frameworks to address the unique issues posed by AI's use of copyrighted material [5].
Reference
[3]
OpenAI, the company behind ChatGPT, has responded to copyright infringement lawsuits filed by authors, denying allegations and asserting fair use. The case highlights the ongoing debate surrounding AI and intellectual property rights.
3 Sources
OpenAI engineers inadvertently erased crucial data related to an ongoing copyright lawsuit filed by The New York Times and Daily News, potentially impacting the case's progression and raising questions about evidence handling in AI-related legal disputes.
9 Sources
ANI, a major Indian news agency, has filed a lawsuit against OpenAI in the Delhi High Court, accusing the company of using its content without permission to train ChatGPT and attributing fabricated stories to ANI.
13 Sources
A federal judge in San Francisco has ruled that a copyright infringement lawsuit filed by artists against AI companies can move forward. The case challenges the use of copyrighted works to train AI image generators without permission or compensation.
5 Sources
A group of authors has filed a lawsuit against AI company Anthropic, alleging copyright infringement in the training of their AI chatbot Claude. The case highlights growing concerns over AI's use of copyrighted material.
14 Sources
© 2024 TheOutpost.AI All rights reserved