Curated by THEOUTPOST
On Thu, 25 Jul, 12:03 AM UTC
6 Sources
[1]
Reddit Is Giving Google a Search Monopoly
If you want to search something on Reddit, you're going to have to use Google. "Reddit" has quickly become one of my favorite search engine keywords. When I want to know what actual people have to say about restaurants near me or a hotel I might be looking at (as opposed to a bunch of ads or questionable AI junk), adding "Reddit" to the end of my search is a good way to get it. But going forward, you'll only be able to use this trick with Google: Microsoft today confirmed that Reddit is blocking Bing from accessing it, and a 404 Media investigation further showed that most non-Google search engines don't work with Reddit anymore. The news comes after an update to Reddit's robots.txt file, which controls which search engines are allowed to crawl through the social media site for information. Search experts later took to X (formerly Twitter) to report they had noticed Reddit seemed to be blocking Bing and other search engines, after which Microsoft confirmed to Search Engine Land that "Bing stopped crawling Reddit after they implemented their updated robots.txt file on July 1, which prohibits all crawling of their site." If it seems like it took a while for the news to circulate, that's because it can take time to catch these kinds of behind-the-scenes changes. Bing and other affected search engines can still show old links their robots had already crawled, but going forward, they won't be able to gather new ones. When reporters noticed something fishy was afoot, they reached out for details, although Microsoft is the only company to provide official confirmation as of yet. Still, according to 404 Media, Reddit isn't populating new links on other popular Google alternatives like DuckDuckGo, either. The outlet did say links are still updating on independent paid search engine Kagi, but it's worth noting that Kagi buys some of its search index from Google.
Currently, there's no official reason for the lockout, but the news coincidentally follows a $60 million-a-year deal that allows Google to train its AI on posts from Reddit users. In a statement to The Verge, Reddit spokesperson Tim Rathschmidt said, "This is not at all related to our recent partnership with Google," before clarifying that "We have been in discussions with multiple search engines. We have been unable to reach agreements with all of them, since some are unable or unwilling to make enforceable promises regarding their use of Reddit content, including their use for AI." It's true that more search engines are getting in on the AI game, but reading between the lines, it seems like Reddit's concerns for its users' content could actually be a hint that companies now need to pay to play with the social media giant, and that so far, Google is the only one that's been willing to pony up.
[2]
Reddit now blocks all search engines other than Google following 'misuse'
Reddit can be a valuable source for information from real people, which is why Google is spending millions on a deal with the platform. But now, Reddit has started to block many of its results from showing in other search engines properly. In February of this year, Google announced a new deal with Reddit which would see Reddit data used for training Google's AI models as well as more prominently showing Reddit results within Google Search. Bloomberg reported that the deal was for around $60 million. In the time since, Reddit has also been showing in Google Search drastically more, often outranking websites that Reddit posts are linking to. Now, Reddit results in other search engines are effectively being blocked. This behavior was first reported by 404 Media, which notes that Reddit has updated its robots.txt file to block all bots from scraping any part of the site. In the file, Reddit says: "Reddit believes in an open internet, but not the misuse of public content." In June, just prior to when the file was first updated, Reddit explained that the change was being made following "an uptick in obviously commercial entities who scrape Reddit" and "use Reddit content for any use case they want." While it's not said explicitly, it's fairly clear that AI training is a major focus for this change in policy. But, as a result, search engines other than Google are now not able to show proper results from Reddit as they previously did. 404 Media notes that Bing, DuckDuckGo, Mojeek, and Qwant are all affected, with results either not showing anything recent, or not showing the full site result. Kagi, a paid search engine, is apparently still showing data, but only because it buys some of its search index from Google, which continues to have access to Reddit data through the aforementioned deal.
[3]
Reddit Is Blocking Most Search Engines
Reddit is one of the internet's top places for discussion, and it's an important bastion of internet culture, which is also why it often shows up near the top of search results. Now, though, it looks like Google is the only search engine that can properly pull up Reddit search results. Reddit has seemingly restricted access to its content, effectively making Google the only search engine capable of displaying Reddit results. Alternative search engines like Bing, DuckDuckGo, and others are now unable to properly crawl Reddit and display new content. Some of these search engines might show some nondescript Reddit search results, while others show none at all. While neither Reddit nor Google has officially commented on this matter, this may be related to Reddit's multi-million dollar deal with Google to scrape Reddit data for AI training purposes. The decision has drawn criticism from smaller search engine providers. Colin Hayhurst, the CEO of search engine company Mojeek, told 404 Media, "They're [Reddit] killing everything for search but Google." Microsoft (which owns Bing), DuckDuckGo, and other search engine companies have not chimed in yet. Reddit's actions come amidst a broader trend of websites blocking bots used by AI companies for data scraping. The platform recently updated its robots.txt file, a set of instructions for web crawlers, to strictly prohibit any automated access. The current robots.txt file is set to disallow all web crawling, including regular search engines (an archived version from July 23rd does not). Google, owning the most popular search engine out there, is in a unique position to train its generative AI on its search results and on the other data it discovers. We can't think of a single generative AI developer that has trained its models in an "ethical" manner, but Google might be willing to go way lower than others. Source: 404 Media
[4]
Search engines that don't pay up can't index Reddit content
Google and Reddit signed a deal earlier this year, reportedly worth $60 million annually. When Reddit said last month that it would block unauthorized data scraping from its site, everyone's (rightful) first reaction was "AI, AI, AI." However, now that the change has taken effect, chatbot makers aren't the only ones being locked out. The widely used forum also appears to be blocking all search engines other than Google, which reportedly inked a deal earlier this year with Reddit worth $60 million annually. 404 Media reported on Wednesday (and Engadget confirmed in our queries) that searching for Reddit results from the past week on rival engine Bing (using "site:reddit.com") returns empty results. The publication reported that DuckDuckGo produced seven links without any descriptions, only providing the note, "We would like to show you a description here but the site won't allow us." The engine now appears to have removed even those, as our test only produced an empty page, reading, "no results found." Reddit said last month that it would update its Robots Exclusion Protocol file (robots.txt) to block automated data scraping; it's now apparent that the change wasn't only meant to thwart AI companies like Perplexity and its controversial "answer engine." Currently, Google appears to be the only search engine allowed to crawl Reddit and produce results from "the front page of the internet." Ironically, part of the forum website's robots.txt file reads, "Reddit believes in an open internet, but not the misuse of public content." The file for Reddit now essentially says, "Do not scrape." Apparently, it now considers search engines that don't buy into exclusive deals to be misusing its content. The ubiquitous robots.txt is the web standard that communicates which parts of a site can be crawled. Although many crawlers are known to ignore its instructions, Google's standard procedure is to respect it.
So, on the technical side, the companies in cahoots on the lucrative deal appear to have deployed some manual override. Of course, the saga is a trickle-down effect of AI chatbots scraping the live web for results. With courts slow to determine how much of the open web is fair use to train chatbots on, companies like Reddit, whose bottom lines now depend on safeguarding their data from those who don't pay, are building walls at the expense of the open web. (Although, given the integral role Microsoft has played in this AI era, cozying up with OpenAI early on, it seems ironic that Bing finds itself on the losing end of at least one aspect of the fallout.) Colin Hayhurst, CEO of lesser-known "no-tracking" search engine Mojeek, told 404 Media that Reddit is "killing everything for search but Google." In addition, the executive said his attempts to contact Reddit were ignored. "It's never happened to us before," he said. "Because this happens to us, we get blocked, usually because of ignorance or stupidity or whatever, and when we contact the site you certainly can get that resolved, but we've never had no reply from anybody before." Engadget asked Google and Reddit for comment and confirmation, but we hadn't heard back by publication. 404 Media reported running into a similar wall of silence from the companies. Reddit has made no secret of its desire to block AI companies from scraping its treasure trove of data in this burgeoning age of AI. Last year, CEO Steve Huffman risked alienating large portions of its user base by blocking third-party API requests, leading to the demise of beloved apps like Christian Selig's Apollo. Despite widespread protests among moderators and forum-goers, the company only temporarily lost negligible numbers of users. The gamble appeared to pay off, and Reddit recovered. It went public in March.
[5]
Google just gained exclusive access to Reddit | Digital Trends
Reddit has begun blocking all search engines except those that pay to crawl its site -- namely, Google. A report from 404 Media says that search engines like Bing or DuckDuckGo don't show any results from the last week, even when using the "site:reddit.com" search query. Because Google has paid the bill upfront, niche search engines like Kagi that rely on Google still have access to Reddit. In the case of DuckDuckGo, the report claims that Reddit has blocked the search engine from pulling any data, stating, "We would like to show you a description here but the site won't allow us." It's fascinating to see how much sway Reddit now has, even over a company as big as Google. Just last October, Reddit reportedly threatened to block Google too if it didn't agree to its new API pricing, an increase that resulted in the death of some popular third-party apps and a blackout across many prominent subreddits last summer. This culminated with Reddit going public in March 2024. It has since been on a mission to monetize its forums to great effect, primarily through data licensing. Just a couple of months later, Reddit and OpenAI would announce that ChatGPT would be trained on Reddit's data API. We've also seen Google begin to favor Reddit in its search results over the past six months, a change announced in February as facilitating "more content-forward displays of Reddit information." This went on to be integrated into AI Overviews to produce some embarrassing results, including surfacing a joke from Reddit about cooking pizza with glue.
[6]
Reddit is now blocking major search engines and AI bots -- except the ones that pay
It's a bold move for a massive website like Reddit to block some of the most popular search engines, but it's not all that surprising. Over the past year, Reddit has become more protective of its data as it looks to open up another source of revenue and appease new investors. After making its API more expensive for some third-party developers, Reddit reportedly threatened to cut off Google if it didn't stop using the platform's data to train AI for free.
Reddit has implemented a new policy blocking non-Google search engines from indexing its content, raising concerns about internet openness and search engine competition.
Reddit, one of the internet's most popular platforms, has made a significant change to its content accessibility. The social news aggregator has implemented a policy that blocks non-Google search engines from indexing its content 1. This move has sparked a heated debate about internet openness and search engine competition.
At the heart of this controversy is a deal between Reddit and Google, reportedly worth about $60 million a year, that lets Google train its AI models on Reddit posts and show Reddit results more prominently in Google Search 2. Whatever the cause, Google is now the only major search engine still able to index the site: Bing, DuckDuckGo, Mojeek, and Qwant are reported to show no recent Reddit content, or only incomplete results.
Reddit has implemented this change through its robots.txt file, a standard used by websites to tell crawlers which parts of a site they may access. According to the reports, the updated file disallows all automated crawling rather than naming Google as an exception 3. Google, which normally respects robots.txt, nonetheless continues to index Reddit, which suggests that its access rests on an arrangement between the two companies rather than on the file itself.
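How a well-behaved crawler reads such rules can be sketched with Python's standard `urllib.robotparser`. This is a minimal illustration only: the two rule sets below are hypothetical and are not copies of Reddit's actual file, and the helper name `check_agents` and the sample URL are assumptions made for the example. One rule set disallows every crawler; the other shows what a per-crawler exception would look like.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set 1: a blanket rule that disallows every crawler.
BLANKET_DISALLOW = """\
User-agent: *
Disallow: /
"""

# Hypothetical rule set 2: the same blanket rule, plus an explicit
# exception for one named crawler. Shown only for contrast.
SINGLE_EXCEPTION = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def check_agents(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Return, for each user agent, whether the rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


AGENTS = ["Googlebot", "bingbot", "DuckDuckBot"]
SAMPLE_URL = "https://www.reddit.com/r/all/"  # illustrative URL only

blanket = check_agents(BLANKET_DISALLOW, AGENTS, SAMPLE_URL)
exception = check_agents(SINGLE_EXCEPTION, AGENTS, SAMPLE_URL)

print("blanket disallow:", blanket)
print("single exception:", exception)
```

Under the blanket rule no compliant crawler may fetch any page, so a search engine that keeps indexing the site must be relying on something outside robots.txt. The file is also advisory: it only restrains crawlers that choose to honor it.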
The implications of this decision are far-reaching. Users who prefer alternative search engines will find it challenging to discover Reddit content through their preferred platforms. This change could potentially drive more traffic to Google, further cementing its dominance in the search market 4.
Critics argue that this move goes against the principles of an open internet. By limiting access to its content, Reddit is potentially creating a more fragmented online experience. Some worry that this could set a precedent for other large websites to follow suit, leading to a less diverse and less competitive internet landscape 5.
Reddit denies that the block is tied to its Google partnership. A spokesperson told The Verge that the company has been in discussions with multiple search engines but has been unable to reach agreements with all of them, because some are unable or unwilling to make enforceable promises about their use of Reddit content, including for AI. In practice, the policy gives Reddit tighter control over its data, and potentially new revenue from it, at the cost of reduced accessibility through other search platforms.
This development raises questions about the future of web indexing and the power dynamics between major tech companies. As the internet continues to evolve, the balance between business interests and the principles of an open web remains a contentious issue. The long-term effects of Reddit's decision on user behavior, search engine competition, and internet accessibility are yet to be fully understood.
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved