
OpenAI Flags Rising Cyber Risk in Next Models, Unveils Guardrails and Aardvark Tool

Steep capability gains prompted tighter access controls to keep advanced tools in defensive hands.

Overview

  • Internal capture-the-flag (CTF) results jumped from 27% on GPT-5 in August 2025 to 76% on GPT-5.1-Codex-Max in November, signaling rapid growth in cybersecurity proficiency.
  • Under its Preparedness Framework, OpenAI says upcoming models may reach a 'high' capability level that could enable the development of working zero-day exploits or assist in stealthy enterprise intrusions.
  • A defense-in-depth stack now includes stricter access and egress controls, hardened infrastructure, comprehensive monitoring, training models to refuse abusive requests, and external red teaming.
  • OpenAI is privately testing Aardvark, an agentic security researcher that scans full codebases, proposes patches, and has already found critical vulnerabilities and novel CVEs, with free access planned for select open-source projects.
  • Governance steps include a Frontier Risk Council of external defenders and a tiered trusted-access program for vetted cyberdefense users, with no timeline given for when a model will be formally rated 'high'.