2 Sources
[1]
AI Can Mass-Unmask Pseudonymous Accounts, Research Paper Finds
For about as long as the internet has existed, users have been able to speak their mind freely through pseudonymous accounts that protect them from being doxxed or stalked. But thanks to the advent of sophisticated AI, unmasking pseudonymous users on the internet has become ominously easy.

As detailed in a yet-to-be-peer-reviewed paper, a team of researchers at ETH Zurich and AI company Anthropic found that "large language models can be used to perform at-scale deanonymization." In a series of experiments, the researchers showed that their agent could "re-identify" users on the popular forums Hacker News and Reddit based on their "pseudonymous online profiles and conversations alone," something that would "take hours for a dedicated human investigator" to do. The results were alarming: the AI agent unmasked an astonishing two-thirds of users.

"Our results show that the practical obscurity protecting pseudonymous users online no longer holds and that threat models for online privacy need to be reconsidered," the researchers warned. "Across Hacker News, Reddit, LinkedIn, and anonymized interview transcripts, our method identifies users with high precision -- and scales to tens of thousands of candidates," coauthor and ETH Zurich AI engineer Simon Lermen wrote in a blog post accompanying the paper.

The implications for online privacy could be considerable. "The average online user has long operated under an implicit threat model where they have assumed pseudonymity provides adequate protection because targeted deanonymization would require extensive effort," they wrote. "LLMs invalidate this assumption."

In their experiments, the team collected datasets from public social media sites to test their deanonymization AI. They linked Hacker News posts to LinkedIn profiles by using references in user profiles, then anonymized the datasets by removing any identifying references from the posts. Finally, they gave an LLM the anonymized datasets and asked it to link the posts back to their original authors.

"What we found is that these AI agents can do something that was previously very difficult: starting from free text (like an anonymized interview transcript) they can work their way to the full identity of a person," Lermen told Ars Technica. "This is a pretty new capability; previous approaches on re-identification generally required structured data, and two datasets with a similar schema that could be linked together."

The team had to tread carefully, since "you don't want to actually deanonymize anonymous individuals," as Lermen explained in his post. Instead, the team came up with "two types of deanonymization proxies which allow us to study the effectiveness of LLMs at these tasks."

Even when the data fed to the AI was extremely general, like responses to an Anthropic questionnaire about how people use AI in their daily lives, the LLM could pick up on clues to identify people around seven percent of the time. While that may sound low, Lermen told Ars it's noteworthy "that AI can do this at all." The researchers also found that when fed comments from various movie communities on Reddit, the AI could identify users with an astonishing rate of precision. The more the users discussed movies, the easier it was for the AI to deanonymize them.

However, they also pointed out several limitations. For one, sample sets are "small because they require verified identity links," they wrote.
It's also difficult to distinguish what the LLM gathered from its web search. "The attack relies on opaque web search systems, making it difficult to isolate what the LLM agent contributes versus what the search engine embeddings contribute," the researchers admitted.

Nonetheless, the team warns that their findings paint an alarming picture of the future of online anonymity. "LLMs democratize deanonymization," they concluded, which could potentially allow governments to "link pseudonymous accounts to real identities for surveillance of dissidents, journalists, or activists."

"Corporations could connect seemingly anonymous forum posts to customer profiles for hyper-targeted advertising," they added. "Attackers could build sophisticated profiles of targets at scale to launch highly personalized social engineering scams."

In short, the advent of AI has ushered in a new era that calls for enhanced safety measures, or that could even be the death knell of online pseudonymity. "Users, platforms, and policymakers must recognize that the privacy assumptions underlying much of today's internet no longer hold," the paper reads.
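To make the anonymize-then-re-link evaluation loop described above easier to picture, here is a minimal sketch in Python. Everything in it is illustrative rather than the paper's actual method: the word-overlap scorer is a crude stand-in for the LLM agent, and the posts, profiles, and function names are invented for the example.

```python
import re
from collections import Counter

def scrub_identifiers(text: str, known_identifiers: list[str]) -> str:
    """Anonymization step: strip explicit identifiers (names, usernames)
    and email addresses from a post before the matching test."""
    for ident in known_identifiers:
        text = re.sub(re.escape(ident), "[REDACTED]", text, flags=re.IGNORECASE)
    return re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "[REDACTED]", text)

def tokens(text: str) -> Counter:
    """Lowercase bag-of-words with punctuation stripped."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def overlap_score(post: str, profile: str) -> float:
    """Crude stand-in for the LLM matcher: the fraction of post tokens
    that also appear in the candidate profile."""
    a, b = tokens(post), tokens(profile)
    return sum((a & b).values()) / max(1, sum(a.values()))

def best_match(anon_post: str, profiles: dict[str, str]) -> str:
    """Pick the candidate identity whose profile best fits the post."""
    return max(profiles, key=lambda name: overlap_score(anon_post, profiles[name]))

# Toy ground truth: posts with known authors, plus candidate profiles.
posts = {
    "alice": "I maintain a Rust crate for embedded sensors at Acme.",
    "bob": "Mostly write Go microservices; also into sourdough baking.",
}
profiles = {
    "alice": "Embedded systems engineer at Acme. Rust enthusiast.",
    "bob": "Backend developer: Go, Kubernetes, amateur baker.",
}

# Re-identification rate: how often a scrubbed post re-links to its author.
hits = sum(
    best_match(scrub_identifiers(post, [author]), profiles) == author
    for author, post in posts.items()
)
print(f"re-identified {hits}/{len(posts)} authors")
```

In the researchers' setup, the role of `overlap_score` is played by an LLM agent that reasons over the residual clues and can consult web search, which is what lets the attack scale far beyond what simple lexical matching could achieve.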
[2]
Anthropic research says AI can mass-expose anonymous internet accounts
Researchers warn AI could make internet anonymity harder to maintain. New research involving scientists from Anthropic and ETH Zurich suggests that modern artificial intelligence systems could identify the real-world identities behind supposedly anonymous internet accounts. The study, published as a preprint on arXiv, shows that large language models (LLMs) may be capable of analyzing online activity and linking pseudonymous profiles to real individuals at scale.

The research, titled "Large-scale online deanonymization with LLMs," explores how AI agents can automate the process of deanonymization: the act of connecting anonymous or pseudonymous online accounts to real identities. Traditionally, this process required significant manual investigation by analysts who searched through posts, writing styles, and scattered online clues. However, the researchers demonstrate that modern AI models can perform many of these steps automatically.

In the study, the AI system analyzed public text from online platforms and extracted identity-related signals such as personal interests, demographic clues, writing style, and incidental details revealed in posts. The AI then searched for matching profiles across the web and evaluated whether the clues aligned with known individuals.

To test the method, researchers created several datasets with known ground-truth identities. One experiment attempted to match Hacker News users with their LinkedIn profiles, even after removing obvious identifiers such as names and usernames. Another dataset involved linking pseudonymous Reddit accounts across different communities. A third dataset split a single user's posting history into two separate profiles to see if the AI could identify that they belonged to the same person.

The results showed that LLM-based systems significantly outperformed traditional deanonymization techniques. In some cases, the models achieved up to 68% recall with about 90% precision, meaning the AI correctly identified many accounts while maintaining relatively low error rates. Conventional methods in the same experiments achieved close to zero success.

Researchers say the findings highlight how AI can replicate tasks that once required hours of work by human investigators. An AI system can automatically extract identity-related features from text, search for potential matches among thousands of profiles, and reason about which candidate is most likely correct.

This development is significant because anonymity has long been considered a basic protection for many internet users. Pseudonymous accounts are widely used by journalists, whistleblowers, activists, and ordinary individuals who want to discuss sensitive topics without revealing their real identities. The study suggests that this layer of protection, sometimes called "practical obscurity," may be weakening as AI systems become better at connecting digital clues across platforms.

If automated tools can perform this work quickly and cheaply, the barrier to identifying anonymous users could drop dramatically. Researchers estimate that the cost of identifying an online account using their experimental pipeline could fall between $1 and $4 per profile, meaning large-scale investigations could be conducted relatively cheaply.

However, the authors also note that the research was conducted in controlled environments using public data. The paper has not yet been peer-reviewed, and the researchers intentionally withheld some technical details to reduce the risk of misuse.
Even so, the findings have already sparked debate among privacy experts and technologists. The work suggests that individuals may need to rethink how much personal information they reveal online, even in spaces that appear anonymous.

Looking ahead, researchers say further work is needed to understand both the risks and possible defenses against AI-powered deanonymization. Potential solutions could include improved privacy tools, stronger platform safeguards, or AI systems designed to anonymize sensitive data before it is shared publicly.

As artificial intelligence becomes more capable of analyzing massive volumes of online content, the study highlights a growing challenge: balancing the power of AI-driven discovery with the need to protect personal privacy in the digital age.
Research by Anthropic and ETH Zurich reveals that large language models can identify real-world identities behind pseudonymous accounts with alarming precision. The AI successfully unmasked two-thirds of users on platforms like Hacker News and Reddit by analyzing writing patterns and post content alone—a task that previously required hours of human investigation.
A yet-to-be-peer-reviewed study conducted by researchers at ETH Zurich and AI company Anthropic has uncovered a troubling capability: large language models can perform deanonymization at scale, effectively ending the era of practical obscurity that has long protected pseudonymous accounts online. The research demonstrates that AI agents can "re-identify" users on popular platforms like Hacker News and Reddit based solely on their pseudonymous online profiles and conversations, work that would traditionally take hours for a dedicated human investigator.[1][2]
The results proved alarming: the AI agent successfully unmasked an astonishing two-thirds of users in their experiments. In some cases, the models achieved up to 68% recall with approximately 90% precision, meaning the AI correctly identified many accounts while maintaining relatively low error rates.[2] Conventional deanonymization methods in the same experiments achieved close to zero success, highlighting how dramatically AI has shifted the landscape.

The research by Anthropic and ETH Zurich involved collecting datasets from public social media sites to test their deanonymization capabilities. Researchers linked Hacker News posts to LinkedIn profiles using references in user profiles, then anonymized the datasets by removing identifying information. They trained an LLM to link posts back to original authors by analyzing personal interests, demographic clues, writing style, and incidental details revealed in posts.[1][2]
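As a quick refresher on what those two rates measure, here is a worked illustration in Python. The raw counts below are invented purely to reproduce the reported percentages; the sources give only the rates, not these numbers.

```python
# Hypothetical raw counts chosen to reproduce the reported rates;
# the paper reports percentages, not these figures.
proposed_links = 756      # identity links the agent put forward
correct_links = 680       # proposals that matched the true identity
linkable_users = 1_000    # accounts with a verified ground-truth identity

precision = correct_links / proposed_links  # ~0.90: when the agent names someone, it is usually right
recall = correct_links / linkable_users     # ~0.68: share of all users it manages to unmask
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```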
."What we found is that these AI agents can do something that was previously very difficult: starting from free text (like an anonymized interview transcript) they can work their way to the full identity of a person," coauthor and ETH Zurich AI engineer Simon Lermen told Ars Technica. This represents a significant departure from previous approaches, which generally required structured data and two datasets with similar schemas that could be linked together
1
Even when fed extremely general data, like responses to an Anthropic questionnaire about how people use AI in their daily lives, the LLM could identify people approximately seven percent of the time. When analyzing comments from various movie communities on Reddit, the AI demonstrated an astonishing rate of precision in linking pseudonymous online profiles; the more users discussed movies, the easier it became for the system to deanonymize them.[1]
The implications for online privacy are considerable, as researchers estimate the cost of identifying an online account using their experimental pipeline could fall between $1 and $4 per profile. This means the mass exposure of anonymous internet accounts could be conducted relatively cheaply, dramatically lowering the barrier to large-scale surveillance.[2]
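Taken at face value, that per-profile estimate makes the economics easy to sketch. The candidate-pool size in this snippet is hypothetical; only the $1 to $4 range comes from the research.

```python
# The paper's reported per-profile cost range; the pool size is hypothetical.
cost_low, cost_high = 1, 4   # USD per profile
pool = 10_000                # candidate accounts in an imagined sweep
print(f"${cost_low * pool:,} to ${cost_high * pool:,} to screen {pool:,} accounts")
```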
."The average online user has long operated under an implicit threat model where they have assumed pseudonymity provides adequate protection because targeted deanonymization would require extensive effort," the researchers wrote. "LLMs invalidate this assumption." The study warns that governments could link pseudonymous accounts to real identities for surveillance of dissidents, journalists, or activists. Corporations could connect seemingly anonymous forum posts to customer profiles for hyper-targeted advertising, while attackers could build sophisticated profiles of targets at scale to launch highly personalized social engineering scams
1
.Related Stories
The research highlights how AI democratizes deanonymization capabilities that were once limited to well-resourced investigators. "Across Hacker News, Reddit, LinkedIn, and anonymized interview transcripts, our method identifies users with high precision -- and scales to tens of thousands of candidates," Lermen wrote in a blog post accompanying the paper.[1]
While the authors acknowledge limitations, including small sample sets that require verified identity links and the difficulty of distinguishing what the LLM gathered from web search versus what came from search engine embeddings, they maintain their findings paint an alarming picture. "Users, platforms, and policymakers must recognize that the privacy assumptions underlying much of today's internet no longer hold," the paper states.[1]
Looking ahead, researchers suggest that individuals may need to rethink how much personal information they reveal online, even in spaces that appear anonymous. Potential solutions could include improved privacy tools, stronger platform safeguards, or AI systems designed to anonymize sensitive data before it is shared publicly.[2] The advent of AI has ushered in a new era that calls for enhanced safety measures, or could potentially signal the death knell of online pseudonymity as we know it.