Particle.news

Harvard ER Study Finds OpenAI Model Edges Doctors in Text-Only Triage

The results signal a need for guarded, real-world testing under clear accountability.

Overview

  • In a Science paper, Harvard Medical School and Beth Israel Deaconess reported that OpenAI’s o1 matched or outperformed attending emergency physicians on text-based clinical reasoning, with the biggest gains at first triage.
  • Across 76 real ER cases drawn from unprocessed electronic medical records, o1 produced the exact or nearly exact triage diagnosis in 67% of cases, versus 55% and 50% for two attending physicians.
  • Two attending physicians, blinded to whether answers came from humans or AI, scored the diagnoses to reduce bias in the comparisons.
  • The study tested OpenAI’s o1 and GPT‑4o using only text from the medical record, and the authors noted that current models struggle with nontext data such as images, audio, and waveforms.
  • The researchers called for prospective clinical trials and clear accountability rules, stressing that AI should not replace clinicians or make unsupervised life‑or‑death decisions.