Particle.news

OpenAI Open-Sources ‘Privacy Filter,’ a Local Model to Redact Personal Data

The compact release lets developers scrub sensitive text on their own machines before sending data to AI services.

Overview

  • OpenAI released the open-weight Privacy Filter on Wednesday under the Apache 2.0 license, with downloads on GitHub and Hugging Face.
  • The tool detects and masks eight types of sensitive information, covering private names, addresses, emails, phone numbers, URLs, dates, account numbers, and secrets like passwords or API keys.
  • By running locally, the model keeps unfiltered text on device so teams can clean emails, logs, or code before any chatbot or cloud service sees it.
  • The 1.5-billion-parameter token-classification model labels text in a single pass, supports up to 128,000 tokens of context, and favors context-aware decisions over simple pattern matching.
  • OpenAI reports 96% F1 on the PII-Masking-300k benchmark, rising to 97.43% after annotation fixes, but it warns that the tool is not a compliance guarantee and calls for human review in high-risk domains.
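
To make the detect-then-mask idea concrete, here is a minimal Python sketch of the masking step only. It does not call Privacy Filter or reproduce its output format: the `Span` type, the label names (`NAME`, `EMAIL`, `PHONE`), and the `mask_spans` helper are all hypothetical, and the spans are written by hand where a classifier would normally supply them.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # hypothetical category name, e.g. "EMAIL"


def mask_spans(text: str, spans: list[Span]) -> str:
    """Replace each labeled span with a [LABEL] placeholder."""
    out = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s.start):
        if span.start < cursor:
            continue  # skip a span that overlaps one already masked
        out.append(text[cursor:span.start])
        out.append(f"[{span.label}]")
        cursor = span.end
    out.append(text[cursor:])  # keep whatever follows the last span
    return "".join(out)


def locate(text: str, needle: str, label: str) -> Span:
    """Hand-built stand-in for a detector: find one literal substring."""
    start = text.index(needle)
    return Span(start, start + len(needle), label)


text = "Contact Ana at ana@example.com or 555-0100."
spans = [
    locate(text, "Ana", "NAME"),
    locate(text, "ana@example.com", "EMAIL"),
    locate(text, "555-0100", "PHONE"),
]
masked = mask_spans(text, spans)
print(masked)  # Contact [NAME] at [EMAIL] or [PHONE].
```

In a real pipeline the spans would come from the locally running model, and only the masked string would be passed on to a chatbot or cloud service.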