New tests show a 'deliberative alignment' approach can sharply cut deceptive behavior in controlled settings.