A user’s anonymous post on a tech forum might soon carry unintended consequences. Researchers have shown that large language models (LLMs) can analyze text patterns across platforms to connect seemingly untraceable accounts with real-world identities—often with alarming precision.

The study, led by teams at ETH Zurich and the MATS fellowship, tested whether LLMs could match anonymous Reddit usernames to Netflix accounts based on shared movie preferences. The results were significant: a single movie recommendation increased identification accuracy by 3.1%, while five or more raised it to between 23.2% and 48.1%. In some cases, the matches approached near-certainty.
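The study's actual pipeline is not described here, but the core idea of linking accounts by shared preferences can be sketched with a toy overlap score: treat each account as a set of mentioned movie titles and pick the known profile with the highest Jaccard similarity. All names and data below are hypothetical illustrations, not material from the study.

```python
# Toy sketch (NOT the study's method): match an anonymous account to a
# known profile by the overlap of movie titles each has mentioned.

def match_score(anon_titles: set[str], profile_titles: set[str]) -> float:
    """Jaccard similarity between two sets of mentioned titles."""
    if not anon_titles or not profile_titles:
        return 0.0
    return len(anon_titles & profile_titles) / len(anon_titles | profile_titles)

def best_match(anon_titles: set[str], profiles: dict[str, set[str]]):
    """Return the (name, titles) pair of the highest-scoring profile."""
    return max(profiles.items(), key=lambda kv: match_score(anon_titles, kv[1]))

# Hypothetical data: one anonymous account, two candidate profiles.
anon = {"Heat", "Ronin", "Collateral", "Thief", "Blackhat"}
profiles = {
    "user_a": {"Heat", "Ronin", "Collateral", "Thief"},
    "user_b": {"Frozen", "Up", "Coco"},
}
name, _ = best_match(anon, profiles)
print(name)  # prints "user_a": four of five titles overlap
```

Each additional shared title shrinks the pool of plausible candidates, which is why the reported accuracy jumps so sharply between one recommendation and five.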

In another experiment, the researchers linked Hacker News posts to LinkedIn profiles, extracting details such as age, location, and employment without any explicit personal data. A short anonymous quiz, completed in just ten minutes, supplied enough text to uniquely identify 7% of participants, exposing job roles, education history, and even regional language quirks.

These findings suggest that LLMs are far more effective at deanonymization than previously understood. The risk isn’t limited to malicious actors; state agencies and investigators could also exploit such automation at scale. While traditional doxxing remains possible, the speed and precision of AI-driven methods introduce new vulnerabilities for anonymous communities.

AI’s growing threat to online anonymity
  • Accuracy with single clue: 3.1% identification rate when one movie recommendation was shared.
  • Accuracy with multiple clues: 23.2–48.1% when five to ten recommendations were provided, with some matches reaching near-certainty.
  • Quiz-based deanonymization: 7% of quiz respondents uniquely identified by text patterns alone.
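The models in the study are LLM-based, but the intuition behind identifying a writer from quiz answers alone can be shown with a much simpler stylometric baseline: compare character-trigram frequency profiles of a known and an anonymous writing sample. This is a toy illustration under that assumption, not the study's technique, and all sample text is invented.

```python
# Toy stylometric baseline: cosine similarity of character-trigram counts.
from collections import Counter
from math import sqrt

def trigrams(text: str) -> Counter:
    """Count overlapping character trigrams of a lowercased text."""
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count vectors."""
    dot = sum(a[g] * b[g] for g in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical samples: same author reuses phrasing habits.
known = "I reckon the build pipeline wants refactoring, to be honest."
anon_same = "To be honest, I reckon this parser wants a rewrite."
anon_diff = "Weather patterns over the Atlantic shifted dramatically."

same_score = cosine(trigrams(known), trigrams(anon_same))
diff_score = cosine(trigrams(known), trigrams(anon_diff))
print(same_score > diff_score)  # prints True: habits like "to be honest" leak identity
```

Real LLM-driven attacks pick up far subtler signals than repeated phrases, which is what makes the 7% unique-identification rate from a ten-minute quiz notable.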

The study does not confirm that every anonymous account is vulnerable, but it underscores a growing threat: the more personal details a user shares, even in quizzes or casual posts, the higher the risk. Platforms and AI vendors are urged to restrict data access, yet the most reliable defense remains avoiding exposure of identifying information altogether.

For IT teams managing anonymous systems, this research highlights the need for stricter API controls and monitoring of LLM activity. The era of true anonymity online may be fading faster than expected.