Particle.news

OpenAI Launches Safety Bug Bounty to Catch AI Abuse Beyond Classic Security Flaws

The move separates classic security bugs from abuse scenarios unique to AI systems.

Overview

  • OpenAI introduced a public Safety Bug Bounty on Bugcrowd that targets AI abuse and safety risks across its products, complementing its existing security bug bounty, which has rewarded 409 reported issues since April 2023.
  • The scope covers agent-led risks such as Model Context Protocol abuse, third-party prompt injection, data exfiltration, and disallowed actions at scale, as well as account and platform integrity gaps and outputs that reveal OpenAI proprietary information.
  • For agent attacks, OpenAI expects the behavior to reproduce about half the time, and it asks testers to follow the terms of service of any third-party tools the agent uses.
  • General jailbreaks without clear harm are out of scope, and issues that grant access to features or data beyond a user's permissions belong in the separate Security Bug Bounty.
  • Reports are triaged by a joint safety and security team on Bugcrowd, high-severity findings can earn up to $7,500, and OpenAI also runs invite-only hunts for sensitive areas like biorisk in ChatGPT Agent and GPT-5.