Overview
- OpenAI introduced a public Safety Bug Bounty on Bugcrowd that targets AI abuse and safety risks across its products, complementing its existing security program, which has rewarded 409 issues since April 2023.
- The scope covers agent-led risks such as Model Context Protocol abuse, third-party prompt injection, data exfiltration, and disallowed actions at scale, as well as account and platform integrity gaps and outputs that reveal OpenAI proprietary information.
- For agent attacks, OpenAI expects the behavior to reproduce roughly half the time, and it asks testers to follow the terms of service of any third-party tools the agent uses.
- General jailbreaks without clear harm are out of scope, and issues that grant access to features or data beyond a user's permissions belong in the separate Security Bug Bounty.
- Reports are triaged by a joint safety and security team on Bugcrowd, high-severity findings can earn up to $7,500, and OpenAI also runs invite-only hunts for sensitive areas like biorisk in ChatGPT Agent and GPT-5.