24 Sources
[1]
Reddit Sues Perplexity, Others Over Alleged Data Scraping
Reddit Inc. sued Perplexity AI Inc. and three other companies over alleged data scraping from the discussion site without permission, a sign of the growing demand and value of original data in the burgeoning AI industry. Three data scraping companies -- Oxylabs UAB, AWMProxy, and SerpApi -- have been collecting Reddit data via Google search results for the purpose of reselling it, according to the complaint filed Wednesday in federal court in Manhattan. Perplexity has been buying that data from at least one of the companies, the suit alleges.
[2]
Reddit to Perplexity: Get your filthy hands off our forums
Social media site continues legal campaign against those who take its content without a license. Reddit on Wednesday filed a lawsuit against Perplexity AI and three of its alleged data dealers for trafficking in unlawfully scraped information. The complaint, filed in the Southern District of New York, claims that Oxylabs UAB, AWM Proxy, and SerpApi unlawfully bypassed Reddit's and Google's defenses to harvest Reddit content and related search results. It also says that Perplexity chose to purchase the purloined data rather than license it from Reddit. Ben Lee, chief legal officer at Reddit, told The Register in an emailed statement that AI companies are desperate for quality content generated by real people and that need is fueling an industrial scale data laundering economy. "Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material," said Lee. "Reddit is a prime target because it's one of the largest and most dynamic collections of human conversation ever created." Lee claimed that Oxylabs UAB, a data scraping business based in Lithuania, AWM Proxy, a former Russian botnet, and SerpApi, which advertises real-time access to scraped Google search results, represent textbook examples of this sort of illegal behavior. "Unable to scrape Reddit directly, they mask their identities, hide their locations, and disguise their web scrapers to steal Reddit content from Google Search," said Lee. "Perplexity is a willing customer of at least one of these scrapers, choosing to buy stolen data rather than enter into a lawful agreement with Reddit itself." Reddit's complaint likens these three providers to "would-be bank robbers, who, knowing they cannot get into the bank vault, break into the armored truck carrying the cash instead."
Echoing Cloudflare CEO Matthew Prince's characterization of Perplexity, the Reddit legal filing describes Perplexity as "more akin to a 'North Korean hacker'" who will do whatever is necessary to obtain the data to fuel its AI answer engine, other than pay for a license. Google is not participating in the lawsuit but has tried to prevent automated scraping of its search results. The social media company contends that the defendants have violated the US Digital Millennium Copyright Act by bypassing its technological defenses against automated access to its servers. And it accuses SerpApi and Oxylabs specifically of violating the DMCA's prohibition on trafficking in technology circumvention products or services. Other claims include unfair competition, unjust enrichment, and civil conspiracy. Reddit is seeking an injunction to halt the unwanted scraping of its content and damages. In June, Reddit filed a similar complaint against Anthropic after it failed to convince the AI business to enter into a content licensing deal as OpenAI has done. Oxylabs, which advertises itself as "the largest ethical proxy network and advanced scraping solutions empowering the AI industry and beyond," did not immediately respond to a request for comment. SerpApi also did not respond to requests for comment. A spokesperson for Perplexity told The Register, "Perplexity has not yet received the lawsuit, but we will always fight vigorously for users' rights to freely and fairly access public knowledge. Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest." Reddit is not alone in its attempts to defend against its content being scraped and used to train AI models without consent. A lawsuit filed last month on behalf of two authors accuses Apple of "using Books3, a dataset of pirated copyrighted books" to train its OpenELM language models.
The complaint against Apple says that the company's AppleBot has been scraping web data for nine years and that data is now being used to improve Apple Intelligence models. Another case, Millette v. OpenAI (2024), contends that OpenAI scraped YouTube videos unlawfully to improve its models. The New York Times Co. v. Microsoft Corp., OpenAI (2023) makes similar allegations with regard to Microsoft's and OpenAI's alleged use of its news content. In August, content delivery network Cloudflare called out Perplexity for running web scraping bots that ignore websites' no-scraping directives.
[3]
Reddit launches copyright suit against AI search engine Perplexity
Social media platform Reddit has filed a copyright lawsuit against Perplexity, accusing the AI company of illegally scraping its data in order to train the model powering its search engine. The complaint filed in New York federal court on Wednesday marks the latest legal tussle between AI groups over alleged copyrighted material. Reddit also sued three smaller groups: Lithuanian data scraper Oxylabs UAB, former Russian botnet AWMProxy, and Texas start-up SerpApi. Reddit claims the three groups provided data-scraping services for hoovering up copyrighted Reddit content "by masking their identities, hiding their locations, and disguising their web scrapers as regular people". "AI companies are locked in an arms race for quality human content -- and that pressure has fuelled an industrial-scale 'data laundering' economy", Ben Lee, chief legal officer at Reddit, said in a statement. Perplexity was "a willing customer of at least one of its co-defendants", the social media company wrote in the filing, alleging that the San Francisco-based AI group "desperately" needed "to fuel its 'answer engine'" by scraping data through Google search results. "We strongly disagree with Reddit's allegations and intend to vigorously defend ourselves in court," SerpApi said. Two people familiar with the matter told the Financial Times that Reddit had confronted Perplexity about its alleged theft and suggested they enter discussions about a paid partnership, but that its founder Aravind Srinivas was not interested. Reddit had also contacted Google with its concerns, asking the tech giant to investigate if Perplexity was scraping Reddit's proprietary data through its search engine and if so, to work out how to prevent this, the people added. A spokesman for Google declined to comment.
The suit adds to dozens of copyright lawsuits that have been filed against AI companies since the advent of generative AI systems, which are trained using vast amounts of text data, including content from the internet. Copyright holders have claimed their content has been used without consent or fair compensation. Reddit, which went public in March 2024 and is known for hosting devoted online communities, has struck multimillion-dollar partnerships with Google and OpenAI allowing them to train their large language models on its content. By contrast, Reddit alleged in the complaint that the defendants had circumvented their data protection measures to obtain its copyrighted material without permission. Lee said Reddit was "a prime target because it's one of the largest and most dynamic collections of human conversation ever created". In June, Reddit filed a similar lawsuit against Anthropic, alleging the AI start-up had scraped its platform more than 100,000 times since July 2024. Anthropic responded at the time that it "disagreed" with Reddit's claims and would "defend ourselves vigorously". Perplexity and Oxylabs did not immediately respond to a request for comment. AWMProxy could not be reached for comment.
[4]
Reddit sues Perplexity for scraping data to train AI system
Oct 22 (Reuters) - Social media platform Reddit sued artificial intelligence startup Perplexity in New York federal court on Wednesday, accusing it and three other companies of unlawfully scraping its data to train Perplexity's AI-based search engine. Reddit said in the complaint that the data-scraping companies circumvented its data protection measures in order to steal data that Perplexity "desperately needs" to power its "answer engine" system. The case is one of many filed by content owners against tech companies over the alleged misuse of their copyrighted material to train AI systems. Reddit filed a similar lawsuit against AI startup Anthropic in June that is still ongoing. "Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest," Perplexity said in a statement. "AI companies are locked in an arms race for quality human content -- and that pressure has fueled an industrial-scale 'data laundering' economy," Reddit chief legal officer Ben Lee said in a statement. Reddit, which features thousands of interest-based "subreddit" web communities, said in the lawsuit that it is the most commonly cited source for AI-generated answers to user questions. It has licensed its content to Google, OpenAI and others for their AI training. Reddit said that Lithuania-based Oxylabs, Russia-based AWMProxy and Texas-based SerpApi scraped Reddit data from billions of search results without permission and that Perplexity, which does not have a license to use Reddit content, worked with at least one of the data-scraping companies to obtain Reddit material. "We strongly disagree with Reddit's allegations and intend to vigorously defend ourselves in court," a SerpApi spokesperson said. Spokespeople for Oxylabs did not immediately respond to a request for comment on the case, and AWMProxy could not be reached for comment.
Reddit said it sent Perplexity a cease-and-desist letter last year, after which Perplexity "increased the volume of citations to Reddit forty-fold." Reddit asked the court for unspecified monetary damages and an order blocking Perplexity from using its data. Reporting by Blake Brittain in Washington; editing by Nick Zieminski.
[5]
Reddit sues Perplexity for scraping of posts, expanding user data battle with AI industry
It comes amid a similar lawsuit from Reddit against AI firm Anthropic, as the social media platform attempts to assert ownership over its user data through licensing agreements. Social media giant Reddit has launched a lawsuit against artificial intelligence company Perplexity, alleging that it illegally scraped user posts to train its AI model, marking the latest data-rights clash between content owners and the AI industry. The complaint filed in New York federal court on Wednesday also named three defendants, which Reddit says helped Perplexity collect its data: Lithuanian data scraper Oxylabs, "former Russian botnet" AWMProxy, and Texas startup SerpApi. Reddit alleged that the three smaller entities were able to extract its copyrighted content "by masking their identities, hiding their locations and disguising their web scrapers as regular people." Perplexity, which runs an AI-powered search engine, denied the allegations and accused Reddit of "extortion" and opposition to an open internet, while SerpApi told CNBC it "strongly disagrees" with Reddit's claims and intends to defend itself in court. The case represents one of many filed by content owners accusing AI firms of using copyrighted material without permission to train their large language models. Reddit, in particular, has been on the front lines of that battle, having launched a similar ongoing lawsuit against AI startup Anthropic in June. CNBC was unable to reach Oxylabs and AWMProxy. In a statement shared with CNBC, Ben Lee, Chief Legal Officer at Reddit, said that AI companies are "locked in an arms race for quality human content" and that pressure has fueled an "industrial-scale 'data laundering' economy." "Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material. Reddit is a prime target because it's one of the largest and most dynamic collections of human conversation ever created," Lee added.
Reddit -- which hosts over 100,000 interest-based "subreddit" communities -- said in its lawsuit that its user posts had become the most commonly cited source for AI-generated answers on Perplexity. It added that it sent Perplexity a cease-and-desist letter, after which it "increased the volume of citations to Reddit forty-fold." AI researchers have previously noted that Reddit's large volume of moderated conversations can help make AI chatbots produce more natural-sounding responses. In the age of artificial intelligence, Reddit has worked to leverage its massive data pool, permitting access to it only through AI-related licensing agreements. The social media company has signed such agreements with OpenAI and Alphabet's Google. In a response to the lawsuit, Perplexity, in a post on the Reddit platform, argued that it does not train AI models on content but merely summarizes and cites public Reddit discussions. Therefore, it said it is "impossible" to sign a license agreement. "A year ago, after explaining this, Reddit insisted we pay anyway, despite lawfully accessing Reddit data. Bowing to strong arm tactics just isn't how we do business," the statement read, going on to describe the suit as a "show of force in Reddit's training data negotiations with Google and OpenAI." "Perplexity believes this is a sad example of what happens when public data becomes a big part of a public company's business model," Perplexity added, noting that data licensing has become an increasingly important source of revenue for Reddit. In February, Reddit's COO Jen Wong told the trade publication Adweek that AI licensing deals with Google and OpenAI made up nearly 10% of Reddit's revenue.
[6]
Reddit sues AI company Perplexity and others for 'industrial-scale' scraping of user comments
Social media platform Reddit sued the artificial intelligence company Perplexity AI and three other entities on Wednesday, alleging their involvement in an "industrial-scale, unlawful" economy to "scrape" the comments of millions of Reddit users for commercial gain. Reddit's lawsuit in a New York federal court takes aim at San Francisco-based Perplexity, maker of an AI chatbot and "answer engine" that competes with Google, ChatGPT and others in online search. Also named in the lawsuit are Lithuanian data-scraping company Oxylabs UAB, a web domain called AWMProxy that Reddit describes as a "former Russian botnet," and Texas-based startup SerpApi. It's the second such lawsuit from Reddit since it sued another major AI company, Anthropic, in June. But the lawsuit filed Wednesday is different in the way that it confronts not just an AI company but the lesser-known services the AI industry relies on to acquire online writings needed to train AI chatbots. "Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material. Reddit is a prime target because it's one of the largest and most dynamic collections of human conversation ever created," said Ben Lee, Reddit's chief legal officer, in a statement Wednesday. Perplexity said it has not yet received the lawsuit but "will always fight vigorously for users' rights to freely and fairly access public knowledge. Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest." Oxylabs and SerpAPI didn't immediately respond to requests for comment Wednesday. AWMProxy could not immediately be reached for comment. Reddit compares the companies it is suing to "would-be bank robbers" who can't get into the bank vault, so they break into the armored truck instead. 
The lawsuit alleges they are evading Reddit's own anti-scraping measures while also "circumventing Google's controls and scraping Reddit content directly from Google's search engine results." Lee said that because they're unable to scrape Reddit directly, "they mask their identities, hide their locations, and disguise their web scrapers to steal Reddit content from Google Search. Perplexity is a willing customer of at least one of these scrapers, choosing to buy stolen data rather than enter into a lawful agreement with Reddit itself." Much like its lawsuit against Anthropic, maker of the chatbot Claude, Reddit claims that Perplexity has accessed Reddit's content despite being asked not to do so. Reddit made a similar argument in its lawsuit against Anthropic. That case was initially filed in California Superior Court but was later moved to federal court and has a hearing scheduled for January. Along with digitized books and news articles, websites such as Wikipedia and Reddit are deep troves of written materials that can help teach an AI assistant the patterns of human language. Reddit has previously entered licensing agreements with Google, OpenAI and other companies that are paying to be able to train their AI systems on the public commentary of Reddit's more than 100 million daily users. The licensing deals helped the 20-year-old online platform raise money ahead of its Wall Street debut as a publicly traded company last year.
[7]
Reddit sues Perplexity and three other companies for allegedly using its content without paying
Reddit is suing companies SerpApi, Oxylabs, AWMProxy and Perplexity for allegedly scraping its data from search results and using it without a license, The New York Times reports. The new lawsuit follows legal action against AI startup Anthropic, which allegedly used Reddit content to train its Claude chatbot. As of 2023, Reddit charges companies looking to access posts and other content in the hopes of making money on data that could be used for AI training. The company has also signed licensing deals with companies like Google and OpenAI, and even built an AI answer machine of its own to leverage the knowledge in users' posts. Scraping search results for Reddit content avoids those payments, which is why the company is seeking financial damages and a permanent injunction that prevents companies from selling previously scraped Reddit material. Some of the companies Reddit is focused on, like SerpApi, Oxylabs and AWMProxy, are not exactly household names, but they've all made collecting data from search results and selling it a key part of their business. Perplexity's inclusion in the lawsuit might be more obvious. The AI company needs data to train its models, and has already been caught seemingly copying and regurgitating material it hasn't paid to license. That also includes reportedly ignoring the robots.txt protocol, a way for websites to communicate that they don't want their material scraped. Per a copy of the lawsuit provided to Engadget, Reddit had already sent a cease-and-desist to Perplexity asking it to stop scraping posts without a license. The company claimed it didn't use Reddit data, but it also continued to cite the platform in answers from its chatbot. Reddit says it was able to prove Perplexity was using scraped Reddit content by creating a "test post" that "could only be crawled by Google's search engine and was not otherwise accessible anywhere on the internet."
Within a few hours, queries made to Perplexity's answer engine were able to reproduce the content of the post. "The only way that Perplexity could have obtained that Reddit content and then used it in its 'answer engine' is if it and/or its co-defendants scraped Google [search results] for that Reddit content and Perplexity then quickly incorporated that data into its answer engine," the lawsuit claims. When asked to comment, Perplexity provided the following statement: This new lawsuit fits with the aggressive stance Reddit has taken towards protecting its data, including rate-limiting unknown bots and web crawlers in 2024, and even limiting what access the Internet Archive's Wayback Machine has to its site in August 2025. The company has also sought to define new terms around how websites are crawled by adopting the Really Simple Licensing standard, which adds licensing terms to robots.txt.
[8]
Reddit sues Perplexity over data scraping
Driving the news: Reddit is accusing Perplexity and the other firms of what it dubs "data laundering," whereby the data firms scrape loads of data and then sell it to AI firms, in this case Perplexity.
* The lawsuit alleges that the defendants evade Reddit anti-scraping measures and circumvent Google's controls by scraping Reddit results from Google search.
* Perplexity was not immediately available for comment.
What they're saying: "AI companies are locked in an arms race for quality human content -- and that pressure has fueled an industrial-scale 'data laundering' economy," Reddit Chief Legal Officer Ben Lee said in a statement.
* "Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material."
* "Defendants are similar to would-be bank robbers, who, knowing they cannot get into the bank vault, break into the armored truck carrying the cash instead," the suit says.
The big picture: Perplexity is also facing suits from other publishers, including Encyclopedia Britannica, The New York Times and other newspapers, while similar lawsuits have been brought against OpenAI and others.
[9]
Reddit Sues Perplexity AI, Alleging 'Industrial-Scale' Data Theft - Decrypt
The lawsuit names Perplexity, SerpApi, Oxylabs, and AWM Proxy as defendants. Social media platform Reddit sued Perplexity AI in federal court on Wednesday, alleging that the artificial intelligence company and its data partners orchestrated an "industrial-scale" scheme to scrape the platform's user-generated content. Reddit alleges that the other defendants, SerpApi, Oxylabs, and AWM Proxy, developed and sold tools specifically designed to break security measures protecting its content, enabling the large-scale scraping of Reddit data from search results. The tools were allegedly built with the intention of bypassing two layers of protection: first, by evading Reddit's own anti-scraping systems, and second, by circumventing Google's controls to extract Reddit content directly from its search engine results. The data companies operated as "data-scraping service providers" and "circumvented Google's technological control measures and automatedly accessed, without authorization, almost three billion search engine results pages," a copy of the lawsuit reads. Reddit claims Perplexity used data from the three firms for its answer engine even after receiving a cease-and-desist letter in May 2024. A representative from Perplexity responded and shared a full response, posted on Reddit. Perplexity intentionally posted its response on Reddit "to illustrate a simple point: it's a public Reddit link accessible to anyone, yet by the logic of Reddit's lawsuit, if you refer to it in any way, they just might sue you too," the representative told Decrypt. Perplexity described the lawsuit as "a sad example of what happens when public data becomes a big part of a public company's business model." "Reddit thinks that's their right. But it is the opposite of an open internet," Perplexity stated.
A representative from SerpApi told Decrypt they did not receive "any communication or service from Reddit" on the matter, adding that they "strongly disagree with Reddit's allegations" and intend to seek legal recourse. "No company should claim ownership of public data that does not belong to them. It is possible that it is just an attempt to sell the same public data at an inflated price," Denas Grybauskas, chief governance and strategy officer at Oxylabs, told Decrypt in an emailed statement. Reddit similarly "made no attempt to speak" with Oxylabs, Grybauskas said. Decrypt has reached out to Reddit, Google, and AWM Proxy for comment and will update this article should they respond. In cases like this, courts would need to look first at whether the terms of service from platforms like Reddit "explicitly addresses AI training, data scraping, and commercial use," Andrew Rossow, public affairs attorney and director of strategic partnerships at video search and content intelligence platform Oriane, told Decrypt. If a user agreed to terms that "grant the platform a broad, perpetual, royalty-free license to their content," that license "generally governs the relationship between the user and the platform," Rossow explained. But it doesn't "automatically grant the AI company a license" to do the same, unless the terms permitted the platform "to sublicense or sell the data for that purpose," he added. Courts would then have to "distinguish between the user's copyright in their expression (the text of the post) and the use of the content for data mining (extracting patterns, facts, and language models)," he explained. Still, the supposed "knowledge" behind an LLM (large-language model) "is the product of millions of users' time, effort, and creative expression," Rossow argued. 
"Treating this human-generated content as a free, raw, undifferentiated resource is a form of labor exploitation that devalues online contributions," Rossow opined, adding that AI companies need to "respect digital citizenship and community norms," given how these are "the implicit and explicit rules of the digital public spaces they ingest."
[10]
Reddit is suing Perplexity and AI data scraping firms for using its data without permission - SiliconANGLE
Reddit Inc. has filed a lawsuit against startup Perplexity AI Inc. and three data-scraping service providers for trawling the company's copyrighted content to be used to train AI models. Reddit compared the data scraping companies -- SerpApi, Oxylabs, and AWMProxy -- to "bank robbers," adding that one of the firms "will apparently do anything to get the Reddit data it desperately needs to fuel its 'answer engine' -- that is, anything other than enter into an agreement with Reddit directly, as some of its competitors have done." A number of AI companies have already made deals with Reddit, including OpenAI, which signed on the dotted line last year to use Reddit's trove of data to train its large language models. While no number was given, it was rumored that the deal was worth $60 million. At the time, Reddit said it hoped to bring in around $200 million from licensing agreements over the next three years, with Google LLC also signing on. The company later launched a lawsuit against Anthropic PBC, claiming it was scraping content on Reddit to train its Claude family of AI models, making this latest lawsuit, filed in the U.S. District Court for the Southern District of New York today, one of a handful currently ongoing. Data scraping firms are a fairly new phenomenon that appeared shortly after the generative AI explosion. According to the New York Times, SerpApi is based in Texas and serves a number of companies. Oxylabs is run out of Lithuania, and AWMProxy is Russian. "AI companies are locked in an arms race for quality human content -- and that pressure has fueled an industrial-scale 'data laundering' economy," Ben Lee, the chief legal officer at Reddit, told The Times. "Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material."
According to the lawsuit, Reddit claimed it set a trap for Perplexity by publishing a "test post" on its platform that was visible only to Google's search engine and inaccessible anywhere else on the internet. Within hours, the content of that hidden post appeared in Perplexity's search results, Reddit said. Perplexity has said it hasn't yet received the lawsuit, but told media it will "fight vigorously for users' rights to freely and fairly access public knowledge." It added, "Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest."
[11]
Reddit sues Perplexity and others for allegedly scraping millions of user comments
Reddit's lawsuit in a New York federal court takes aim at San Francisco-based Perplexity, maker of an AI chatbot and "answer engine" that competes with Google, ChatGPT, and others in online search. Also named in the lawsuit are Lithuanian data-scraping company Oxylabs UAB, a web domain called AWMProxy that Reddit describes as a "former Russian botnet," and Texas-based startup SerpApi, which lists Perplexity as a customer on its website. It's the second such lawsuit from Reddit since it sued another major AI company, Anthropic, in June.
[12]
Reddit sues over 'industrial-scale' scraping of user comments
Reddit has sued Perplexity AI and three other entities for allegedly scraping user comments for commercial gain. Social media platform Reddit sued the artificial intelligence company Perplexity AI and three other entities on Wednesday, alleging their involvement in an "industrial-scale, unlawful" economy to "scrape" the comments of millions of Reddit users for commercial gain. Reddit's lawsuit in a New York federal court takes aim at San Francisco-based Perplexity, maker of an AI chatbot and "answer engine" that competes with Google, ChatGPT and others in online search. Also named in the lawsuit are Lithuanian data-scraping company Oxylabs UAB, a web domain called AWMProxy that Reddit describes as a "former Russian botnet," and Texas-based startup SerpApi. It's the second such lawsuit from Reddit since it sued another major AI company, Anthropic, in June. But the lawsuit filed Wednesday is different in the way that it confronts not just an AI company but the lesser-known services the AI industry relies on to acquire online writings needed to train AI chatbots. "Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material. Reddit is a prime target because it's one of the largest and most dynamic collections of human conversation ever created," said Ben Lee, Reddit's chief legal officer, in a statement Wednesday. Perplexity said it has not yet received the lawsuit but "will always fight vigorously for users' rights to freely and fairly access public knowledge. Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest." Oxylabs and SerpAPI didn't immediately respond to requests for comment Wednesday. AWMProxy could not immediately be reached for comment. Reddit compares the companies it is suing to "would-be bank robbers" who can't get into the bank vault, so they break into the armored truck instead.
The lawsuit alleges they are evading Reddit's own anti-scraping measures while also "circumventing Google's controls and scraping Reddit content directly from Google's search engine results." Lee said that because they're unable to scrape Reddit directly, "they mask their identities, hide their locations, and disguise their web scrapers to steal Reddit content from Google Search. Perplexity is a willing customer of at least one of these scrapers, choosing to buy stolen data rather than enter into a lawful agreement with Reddit itself." Much like its lawsuit against Anthropic, maker of the chatbot Claude, Reddit claims that Perplexity has accessed Reddit's content despite being asked not to do so. Reddit made a similar argument in its lawsuit against Anthropic. That case was initially filed in California Superior Court but was later moved to federal court and has a hearing scheduled for January. Along with digitized books and news articles, websites such as Wikipedia and Reddit are deep troves of written materials that can help teach an AI assistant the patterns of human language. Reddit has previously entered licensing agreements with Google, OpenAI and other companies that are paying to be able to train their AI systems on the public commentary of Reddit's more than 100 million daily users. The licensing deals helped the 20-year-old online platform raise money ahead of its Wall Street debut as a publicly traded company last year.
[13]
Reddit sues AI company Perplexity and others, claiming 'industrial-scale' scraping of user comments
Social media platform Reddit sued the artificial intelligence company Perplexity AI and three other entities on Wednesday, alleging their involvement in an "industrial-scale, unlawful" economy to "scrape" the comments of millions of Reddit users for commercial gain. Reddit's lawsuit in a New York federal court takes aim at San Francisco-based Perplexity, maker of an AI chatbot and "answer engine" that competes with Google, ChatGPT and others in online search. Also named in the lawsuit are Lithuanian data-scraping company Oxylabs UAB, a web domain called AWMProxy that Reddit describes as a "former Russian botnet," and Texas-based startup SerpApi. It's the second such lawsuit from Reddit since it sued another major AI company, Anthropic, in June. But the lawsuit filed Wednesday is different in the way that it confronts not just an AI company but the lesser-known services the AI industry relies on to acquire online writings needed to train AI chatbots. "Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material. Reddit is a prime target because it's one of the largest and most dynamic collections of human conversation ever created," said Ben Lee, Reddit's chief legal officer, in a statement Wednesday. Perplexity said it has not yet received the lawsuit but "will always fight vigorously for users' rights to freely and fairly access public knowledge. Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest." Oxylabs and SerpAPI didn't immediately respond to requests for comment Wednesday. AWMProxy could not immediately be reached for comment. Reddit compares the companies it is suing to "would-be bank robbers" who can't get into the bank vault, so they break into the armored truck instead. 
The lawsuit alleges they are evading Reddit's own anti-scraping measures while also "circumventing Google's controls and scraping Reddit content directly from Google's search engine results." Lee said that because they're unable to scrape Reddit directly, "they mask their identities, hide their locations, and disguise their web scrapers to steal Reddit content from Google Search. Perplexity is a willing customer of at least one of these scrapers, choosing to buy stolen data rather than enter into a lawful agreement with Reddit itself." Much like its lawsuit against Anthropic, maker of the chatbot Claude, Reddit claims that Perplexity has accessed Reddit's content despite being asked not to do so. Reddit made a similar argument in its lawsuit against Anthropic. That case was initially filed in California Superior Court but was later moved to federal court and has a hearing scheduled for January. Along with digitized books and news articles, websites such as Wikipedia and Reddit are deep troves of written materials that can help teach an AI assistant the patterns of human language. Reddit has previously entered licensing agreements with Google, OpenAI and other companies that are paying to be able to train their AI systems on the public commentary of Reddit's more than 100 million daily users. The licensing deals helped the 20-year-old online platform raise money ahead of its Wall Street debut as a publicly traded company last year.
[14]
Reddit Sues AI Company Perplexity and Others for 'Industrial-Scale' Scraping of User Comments
Social media platform Reddit sued the artificial intelligence company Perplexity AI and three other entities on Wednesday, alleging their involvement in an "industrial-scale, unlawful" economy to "scrape" the comments of millions of Reddit users for commercial gain. Reddit's lawsuit in a New York federal court takes aim at San Francisco-based Perplexity, maker of an AI chatbot and "answer engine" that competes with Google, ChatGPT and others in online search. Also named in the lawsuit are Lithuanian data-scraping company Oxylabs UAB, a web domain called AWMProxy that Reddit describes as a "former Russian botnet," and Texas-based startup SerpApi. It's the second such lawsuit from Reddit since it sued another major AI company, Anthropic, in June. But the lawsuit filed Wednesday is different in the way that it confronts not just an AI company but the lesser-known services the AI industry relies on to acquire online writings needed to train AI chatbots. "Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material. Reddit is a prime target because it's one of the largest and most dynamic collections of human conversation ever created," said Ben Lee, Reddit's chief legal officer, in a statement Wednesday. Perplexity said it has not yet received the lawsuit but "will always fight vigorously for users' rights to freely and fairly access public knowledge. Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest." Oxylabs and SerpAPI didn't immediately respond to requests for comment Wednesday. AWMProxy could not immediately be reached for comment. Reddit compares the companies it is suing to "would-be bank robbers" who can't get into the bank vault, so they break into the armored truck instead. 
The lawsuit alleges they are evading Reddit's own anti-scraping measures while also "circumventing Google's controls and scraping Reddit content directly from Google's search engine results." Lee said that because they're unable to scrape Reddit directly, "they mask their identities, hide their locations, and disguise their web scrapers to steal Reddit content from Google Search. Perplexity is a willing customer of at least one of these scrapers, choosing to buy stolen data rather than enter into a lawful agreement with Reddit itself." Much like its lawsuit against Anthropic, maker of the chatbot Claude, Reddit claims that Perplexity has accessed Reddit's content despite being asked not to do so. Reddit made a similar argument in its lawsuit against Anthropic. That case was initially filed in California Superior Court but was later moved to federal court and has a hearing scheduled for January. Along with digitized books and news articles, websites such as Wikipedia and Reddit are deep troves of written materials that can help teach an AI assistant the patterns of human language. Reddit has previously entered licensing agreements with Google, OpenAI and other companies that are paying to be able to train their AI systems on the public commentary of Reddit's more than 100 million daily users. The licensing deals helped the 20-year-old online platform raise money ahead of its Wall Street debut as a publicly traded company last year.
[15]
Reddit Sues Perplexity for Scraping Data to Train AI System
The firms have been accused of unlawfully scraping Reddit's data. Social media platform Reddit sued artificial intelligence startup Perplexity in New York federal court on Wednesday, accusing it and three other companies of unlawfully scraping its data to train Perplexity's AI-based search engine. Reddit said in the complaint that the data-scraping companies circumvented its data protection measures in order to steal data that Perplexity "desperately needs" to power its "answer engine" system. The case is one of many filed by content owners against tech companies over the alleged misuse of their copyrighted material to train AI systems. Reddit filed a similar lawsuit against AI startup Anthropic in June that is still ongoing. "Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest," Perplexity said in a statement. "AI companies are locked in an arms race for quality human content - and that pressure has fueled an industrial-scale 'data laundering' economy," Reddit chief legal officer Ben Lee said in a statement. Reddit, which features thousands of interest-based "subreddit" web communities, said in the lawsuit that it is the most commonly cited source for AI-generated answers to user questions. It has licensed its content to Google, OpenAI and others for their AI training. Reddit said that Lithuania-based Oxylabs, Russia-based AWMProxy and Texas-based SerpApi scraped Reddit data from billions of search results without permission and that Perplexity, which does not have a license to use Reddit content, worked with at least one of the data-scraping companies to obtain Reddit material. "We strongly disagree with Reddit's allegations and intend to vigorously defend ourselves in court," a SerpApi spokesperson said.
Oxylabs said in a statement that it was "shocked and disappointed by this news, as Reddit has made no attempt to speak with us directly," and that it would also defend itself against the allegations. AWMProxy could not be reached for comment. Reddit said it sent Perplexity a cease-and-desist letter last year, after which it "increased the volume of citations to Reddit forty-fold." Reddit asked the court for unspecified monetary damages and an order blocking Perplexity from using its data.
[16]
Reddit sues Perplexity for scraping data to train AI system
Social media platform Reddit sued artificial intelligence startup Perplexity in New York federal court on Wednesday, accusing it and three other companies of unlawfully scraping its data to train Perplexity's AI-based search engine. Reddit said in the complaint that Perplexity circumvented its data-protection measures in order to steal the data that it "desperately needs" to power its "answer engine" system.
[17]
Reddit Accuses Perplexity, Other 'Data Scrapers' Of Stealing - Reddit (NYSE:RDDT)
Reddit Inc (NYSE:RDDT) filed a lawsuit Wednesday accusing four companies of illegally stealing its data by scraping Google search results containing Reddit content, according to The New York Times. The suit, filed in U.S. District Court for the Southern District of New York, targets SerpApi, Lithuanian start-up Oxylabs, Russian company AWMProxy and San Francisco-based Perplexity. Three of the companies allegedly sold scraped data to AI companies including OpenAI and Meta, while Perplexity operates its own AI search engine. "Recognizing they lack permission to access the data directly from Reddit, defendants have devised a scheme to scrape the data from Google's search results," the lawsuit states. The companies allegedly mask their identities and disguise web scrapers as regular users to bypass technical restrictions at industrial scale. Reddit, used by over 416 million people weekly, hosts discussions spanning diverse topics, making it particularly valuable for improving AI chatbot natural language capabilities. In 2023, Reddit began charging for data access and established licensing agreements with Google for its Gemini chatbot and OpenAI for ChatGPT. The report indicates that some companies allegedly evaded these deals through scrapers. Perplexity previously scraped Reddit without paying for it, but agreed to stop after receiving a cease-and-desist from the company. However, citations to Reddit data in Perplexity results jumped "fortyfold," according to the lawsuit. The New York Times reported that Reddit created a "trap" for Perplexity by posting test content that was accessible only to Google's search engine and was otherwise not obtainable. Within hours, that content surfaced in Perplexity search results, the lawsuit states. Reddit said it has invested "tens of millions of dollars" in anti-scraping systems.
The company is seeking a permanent injunction, financial damages and prohibition of any use or sale of previously scraped Reddit data. RDDT Price Action: Reddit shares were down 5.98% at $193.41 at the time of publication on Wednesday, according to Benzinga Pro.
[18]
Reddit Sues Perplexity Over Alleged Data Scraping | PYMNTS.com
The suit names Perplexity AI, Oxylabs UAB, AWMProxy and SerpApi as defendants. Reddit said the firms allegedly obtained its data through Google search results. They then resold it to AI companies without consent or compensation. According to the filing, Perplexity purchased Reddit data from at least one of the scraping firms. Reddit Chief Legal Officer Ben Lee said the lawsuit represents a wider challenge for the industry. AI models depend increasingly on high-quality, human-generated text. "AI companies are locked in an arms race for quality human content, and that pressure has fueled an industrial-scale data laundering economy," Lee said in a statement quoted by Bloomberg. Reddit's repository of public conversations has become a critical resource for training generative AI models. The company has already signed paid data-licensing deals with OpenAI and Google. These grants offer structured access to its posts and comment threads. But Reddit claims other firms are exploiting its data without authorization. The company says this practice undermines fair competition and creator rights. Earlier this year, Reddit filed a similar case against Anthropic, alleging that the AI startup unlawfully used Reddit data to train its large language models. As PYMNTS reported, that lawsuit signaled Reddit's effort to assert ownership over its collection of human conversation as the AI industry races to secure training data. The case, Reddit Inc. v. SerpApi LLC, 25-cv-08736, could help define how U.S. courts interpret the legality of web-scraped content used in AI model training.
Spokespeople for Perplexity, SerpApi and Oxylabs did not respond to requests for comment. Legal experts say Reddit's lawsuit is part of a growing wave of disputes shaping data governance and compliance. As law firm Nelson Mullins noted, cases such as The New York Times v. OpenAI are forcing companies to reassess how they manage content ownership, consent and data provenance.
[19]
Reddit Sues Perplexity AI and Data Firms Over Alleged Unauthorized Scraping | PYMNTS.com
The complaint, submitted Wednesday in federal court in Manhattan, names data-scraping firms Oxylabs UAB, AWMProxy, and SerpApi. The suit alleges these companies have been extracting Reddit content through Google search results and reselling it to third parties. Perplexity AI is accused of purchasing that data from at least one of those entities, per Bloomberg. Reddit is seeking financial compensation as well as an injunction to halt what it claims is unauthorized data collection and use in violation of U.S. copyright law. Shares of Reddit, based in San Francisco, reportedly dropped 6.5% in afternoon trading in New York following news of the suit, according to Bloomberg. The company's extensive repository of user-generated discussions has become a highly sought-after resource as AI developers race to train systems on real human interactions and opinions. Reddit has already secured licensing deals with OpenAI and Google, allowing them to use its data for AI training. However, it continues to pursue legal action against others it believes are exploiting its data without proper authorization. Earlier this year, Reddit also filed a lawsuit against AI startup Anthropic, alleging similar scraping practices. Ben Lee, Reddit's chief legal officer, told Bloomberg that "AI companies are locked in an arms race for quality human content -- and that pressure has fueled an industrial-scale 'data laundering' economy." He emphasized that Reddit's vast archive of conversations makes it an attractive target for companies seeking to train AI systems on human-generated material.
Perplexity's spokesperson Beejoli Shah said the company had not yet received the lawsuit but asserted that it would "fight vigorously for users' rights to freely and fairly access public knowledge." Shah added that Perplexity's approach is "principled and responsible" as it aims to provide accurate AI-generated answers. Representatives for SerpApi and Oxylabs declined to comment, while AWMProxy, identified in the filing as a Russian company, could not be reached, according to Bloomberg. The case, filed under Reddit Inc. v. SerpApi LLC, 25-cv-08736, is being heard in the U.S. District Court for the Southern District of New York.
[20]
AI Ethics Clash: Reddit Slaps Perplexity & Three Others with Lawsuit Over AI Data Theft Claims
The lawsuit also names three other firms: SerpApi, Oxylabs, and AWMProxy. Reddit previously sued another major AI company, Anthropic, in June for web scraping and data protection violations. The lawsuit, filed in the US District Court for the Southern District of New York, accuses these companies of unfair competition and unjust enrichment while also alleging that some violated US copyright laws. Reddit said in the complaint that the "data-scraping companies circumvented its data protection measures to steal data that Perplexity desperately needs to power its answer engine system." "AI companies are locked in an arms race for quality human content, and that pressure has fueled an industrial-scale 'data laundering' economy," Reddit chief legal officer Ben Lee said in a statement. Lee added, "scrapers bypass technological protections to steal data and sell it to clients looking for training material." He added that "Reddit is a prime target because it's one of the largest and most dynamic collections of human conversation ever created." Reddit, which hosts over 100,000 interest-based subreddit communities, said in its lawsuit that its user posts had become the most commonly cited source for AI-generated answers on Perplexity. It added that it sent Perplexity a cease-and-desist letter, after which Perplexity increased the volume of citations to Reddit "forty-fold."
[21]
Reddit sues AI company Perplexity and others for 'industrial-scale' scraping of user comments
Social media platform Reddit sued the artificial intelligence company Perplexity AI and three other entities on Wednesday, alleging their involvement in an "industrial-scale, unlawful" economy to "scrape" the comments of millions of Reddit users for commercial gain. Reddit's lawsuit in a New York federal court takes aim at San Francisco-based Perplexity, maker of an AI chatbot and "answer engine" that competes with Google, ChatGPT and others in online search. Also named in the lawsuit are Lithuanian data-scraping company Oxylabs UAB, a web domain called AWMProxy that Reddit describes as a "former Russian botnet," and Texas-based startup SerpApi. It's the second such lawsuit from Reddit since it sued another major AI company, Anthropic, in June. But the lawsuit filed Wednesday is different in the way that it confronts not just an AI company but the lesser-known services the AI industry relies on to acquire online writings needed to train AI chatbots. "Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material. Reddit is a prime target because it's one of the largest and most dynamic collections of human conversation ever created," said Ben Lee, Reddit's chief legal officer, in a statement Wednesday. Perplexity said it has not yet received the lawsuit but "will always fight vigorously for users' rights to freely and fairly access public knowledge. Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest." Oxylabs and SerpAPI didn't immediately respond to requests for comment Wednesday. AWMProxy could not immediately be reached for comment. Reddit compares the companies it is suing to "would-be bank robbers" who can't get into the bank vault, so they break into the armored truck instead. 
The lawsuit alleges they are evading Reddit's own anti-scraping measures while also "circumventing Google's controls and scraping Reddit content directly from Google's search engine results." Lee said that because they're unable to scrape Reddit directly, "they mask their identities, hide their locations, and disguise their web scrapers to steal Reddit content from Google Search. Perplexity is a willing customer of at least one of these scrapers, choosing to buy stolen data rather than enter into a lawful agreement with Reddit itself." Much like its lawsuit against Anthropic, maker of the chatbot Claude, Reddit claims that Perplexity has accessed Reddit's content despite being asked not to do so. Reddit made a similar argument in its lawsuit against Anthropic. That case was initially filed in California Superior Court but was later moved to federal court and has a hearing scheduled for January. Along with digitized books and news articles, websites such as Wikipedia and Reddit are deep troves of written materials that can help teach an AI assistant the patterns of human language. Reddit has previously entered licensing agreements with Google, OpenAI and other companies that are paying to be able to train their AI systems on the public commentary of Reddit's more than 100 million daily users. The licensing deals helped the 20-year-old online platform raise money ahead of its Wall Street debut as a publicly traded company last year.
[22]
Reddit sues Perplexity AI and others over alleged data scraping By Investing.com
Investing.com -- Reddit Inc (NYSE:RDDT) has filed a lawsuit against Perplexity AI and three data scraping companies for allegedly collecting and using Reddit data without permission. The lawsuit, filed Wednesday in federal court in Manhattan, claims that Oxylabs UAB, AWMProxy, and SerpApi have been collecting Reddit data through Google search results to resell it. According to the complaint, Perplexity AI has been purchasing this data from at least one of these companies. Reddit alleges these companies are circumventing its technological barriers by scraping data from Google's search engine results instead of directly from Reddit. The complaint states that during a two-week period in July 2025, the three data scraping defendants accessed almost three billion search engine results pages containing Reddit content. The social media platform claims it caught Perplexity "red-handed" using Reddit data acquired through scraping Google search results, despite sending the company a cease-and-desist letter. According to the lawsuit, Perplexity's citations to Reddit increased forty-fold after being told to stop. Reddit argues that some major AI companies like OpenAI and Google have properly entered into agreements to access Reddit data while protecting user rights, but the defendants chose not to follow this path. The lawsuit cites violations of the Digital Millennium Copyright Act, which prohibits circumventing technological measures that control access to copyrighted works. Reddit is seeking to end what it describes as the defendants' "circumvention of security measures" and "blatant misuse of Reddit content." Reddit stock is down 6.6% today amid AI bubble concerns. Earlier in the year, Reddit sued another AI startup leader, Anthropic, in California court for similar data scraping violations.
[23]
Reddit sues Perplexity for illegal data extraction
On Wednesday, Reddit filed a lawsuit against Perplexity and three other companies in federal court in New York, accusing them of illegally circumventing its protective measures to extract massive amounts of data from its platform. According to the social network, this content was used to train the startup's AI-powered search engine without authorization or compensation. In its complaint, Reddit denounces "large-scale data theft," citing a deliberate strategy to access restricted information. The company believes that Perplexity had a critical need for this content to improve the relevance of its automated response system, which is based on the analysis of human discussions and exchanges. Reddit's general counsel, Ben Lee, refers to an "industrial-scale data laundering economy" fueled by competition among AI players for access to quality content. This legal action is part of a broader context of tensions between online content owners and artificial intelligence companies. Reddit had already initiated similar proceedings in June against the start-up Anthropic. Several media outlets and publishers also accuse AI models of having been trained on their copyrighted content without compensation. Perplexity responded by stating that its methods are based on principles of responsibility and accuracy, and that it will not "tolerate threats to openness and the public interest." The rapidly expanding startup is seen as one of Alphabet's most serious competitors in the field of AI-augmented search engines.
[24]
Reddit sues Perplexity over alleged illegal data scraping to train its AI engine
Reddit has filed a lawsuit against AI startup Perplexity in a New York federal court, accusing the company of illegally scraping its data to train Perplexity's AI-based search engine. The complaint also names three other companies allegedly involved in the data scraping. According to Reddit, the data-scraping companies bypassed its data protection measures to steal content that Perplexity "desperately needs" to power its "answer engine" system, reports Reuters. This case is part of a growing wave of lawsuits where content owners are taking legal action against tech companies for using copyrighted material without permission to train artificial intelligence systems. In June, Reddit filed a similar lawsuit against AI startup Anthropic, which is still ongoing. "Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest," Perplexity was quoted as saying in the report. Reddit's chief legal officer, Ben Lee, said, "AI companies are locked in an arms race for quality human content - and that pressure has fueled an industrial-scale 'data laundering' economy." The social media platform, known for its thousands of topic-focused subreddit communities, pointed out in the lawsuit that it is one of the most frequently cited sources for AI-generated answers to user questions. Reddit has already licensed its content to major companies like Google and OpenAI for AI training. The lawsuit claims that Lithuania-based Oxylabs, Russia-based AWMProxy, and Texas-based SerpApi scraped data from billions of Reddit search results without permission.
Reddit alleges that Perplexity, which does not have a license to use Reddit's content, collaborated with at least one of these scraping companies to obtain Reddit material. "We strongly disagree with Reddit's allegations and intend to vigorously defend ourselves in court," a SerpApi spokesperson said. Meanwhile, Oxylabs said that it was "shocked and disappointed by this news, as Reddit has made no attempt to speak with us directly," and that it would defend itself against the accusations. Reddit stated that it had sent Perplexity a cease-and-desist letter last year. Following that, Reddit says Perplexity "increased the volume of citations to Reddit forty-fold." The company is seeking monetary damages and a court order to prevent Perplexity from using its content.
Reddit files a lawsuit against AI search engine Perplexity and three data scraping companies for allegedly harvesting its user-generated content without permission. The case highlights the growing value of quality data in AI training and the legal challenges in the industry.
Reddit, the prominent social media platform, has filed a lawsuit against AI search engine Perplexity and three data scraping companies for allegedly illicitly harvesting user-generated content for AI model training [1][2]. This federal court action in Manhattan marks a significant escalation in the ongoing conflict over data rights in the AI industry. The lawsuit specifically names Perplexity AI, alongside data scraping firms Oxylabs UAB (Lithuania), AWMProxy (described by Reddit as a former Russian botnet), and SerpApi (Texas) [3]. Reddit claims these entities circumvented its defenses to collect data from its platform and Google search results, subsequently using or selling it for AI development [2].
Ben Lee, Reddit's chief legal officer, highlighted the surging demand for quality human-generated content in AI training, describing it as an "arms race" fueling an "industrial-scale data laundering economy" [4]. Reddit's extensive user discussions make it a prime target for scrapers aiming to improve AI models [5]. The lawsuit alleges violations of the US Digital Millennium Copyright Act (DMCA), unfair competition, unjust enrichment, and civil conspiracy, seeking an injunction and unspecified damages [2][4].
Perplexity has denied Reddit's allegations, stating that it only summarizes and cites public Reddit discussions and does not train AI models on the content [5]. The company accused Reddit of "extortion" and opposition to an open internet. This lawsuit is part of a wider trend of copyright disputes where content owners are challenging AI firms over unauthorized use of material for large language model training [2][3]. Reddit has proactively monetized its content through licensing deals with major AI companies like Google and OpenAI, providing controlled access for training and generating revenue [5]. The outcome of this case could establish crucial precedents for user-generated content in AI development, raising important questions about the balance between open access and content rights in the digital age.
Summarized by
Navi