Overview
- An ETH Zurich–led preprint describes an LLM-based system that sifts users’ post histories, retrieves candidate profiles via embeddings, and uses model reasoning to score likely matches.
- Across the evaluated datasets, the approach reportedly reached up to 68% recall at 90% precision: it linked up to 68% of target accounts while 90% of its proposed matches were correct. The tests included linking Hacker News accounts to known LinkedIn profiles.
- The team also re-identified 9 of 125 anonymized respondents (about 7%) in an Anthropic dataset by building text-based profiles and searching public web information.
- The researchers say the process is inexpensive, estimating roughly $1–$4 in compute per profile and under $2,000 for the full set of experiments.
- The authors and privacy experts warn of misuse risks for dissidents and ordinary users, pointing to recent AI-enabled doxxing incidents. The study avoided testing on high-privacy targets and has not yet been peer-reviewed.
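
The embedding-based retrieval step and the "recall at a fixed precision" metric mentioned above can be illustrated with a small sketch. This is not the preprint's implementation. `embed` is a toy bag-of-words stand-in for a real embedding model, the LLM reasoning step is left out and represented only by hypothetical confidence scores, and every function name and sample profile here is invented for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_candidates(query_text, profiles, k=2):
    """Return the ids of the k profiles most similar to the query text."""
    q = embed(query_text)
    scored = [(cosine(q, embed(text)), pid) for pid, text in profiles.items()]
    # Highest similarity first; ties are broken by profile id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [pid for _, pid in scored[:k]]


def recall_at_precision(scored_guesses, total_targets, min_precision=0.9):
    """Best recall achievable while keeping precision >= min_precision.

    scored_guesses is a list of (confidence, is_correct) pairs, one per
    proposed match. Guesses are accepted from the most confident downward,
    and each cutoff is checked against the precision floor. Assumes the
    confidence values are distinct.
    """
    ranked = sorted(scored_guesses, key=lambda g: -g[0])
    best_recall, correct = 0.0, 0
    for accepted, (_, is_correct) in enumerate(ranked, start=1):
        correct += int(is_correct)
        if correct / accepted >= min_precision:
            best_recall = max(best_recall, correct / total_targets)
    return best_recall


# Hypothetical sample data, invented for this sketch.
profiles = {
    "p1": "rust compiler engineer in zurich who likes climbing",
    "p2": "marketing manager in austin who likes barbecue",
    "p3": "python data scientist in berlin who likes chess",
}
posts = "i spent the weekend climbing near zurich after debugging the rust compiler"
candidates = top_k_candidates(posts, profiles, k=2)

# Hypothetical (confidence, is_correct) pairs standing in for scored matches.
guesses = [(0.99, True), (0.95, True), (0.9, True),
           (0.8, False), (0.7, True), (0.6, False)]
recall = recall_at_precision(guesses, total_targets=10, min_precision=0.9)
```

In a real pipeline the shortlist from `top_k_candidates` would be passed to a model that reasons over each candidate and assigns a confidence. The metric then asks how many targets can be matched before the share of wrong guesses exceeds the allowed 10%.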