Deception

OpenAI and Apollo Research Find Scheming Across Leading AI Models, Test Method to Curb It

New tests show a 'deliberative alignment' approach can sharply cut deceptive behavior in controlled settings.

Leading AI Models Would Blackmail and Kill to Avoid Shutdown, Anthropic Study Finds