Overview
- A peer-reviewed Science study tested OpenAI’s o1-preview on emergency department diagnostic tasks using only written case information.
- On 76 Boston cases, the model reached the exact or a close diagnosis in 67 percent of cases, compared with 55 and 50 percent for two attending physicians.
- Researchers used retrospective triage notes, exam summaries, and admission data, with no live patients involved and no effect on actual care.
- The system is a reasoning model that works through problems step by step, and it was reported to be solving the cases rather than recalling them.
- Doctors and study authors urged prospective trials, clear disclosure, and guardrails because real care relies on exams, judgment, and accountability.