Particle.news

BMJ Open Audit Finds Consumer AI Chatbots Frequently Give Problematic Health Advice

The results underscore calls for clinician oversight and clearer guidance for patients.

Overview

  • The BMJ Open audit, released Wednesday, found five free chatbots gave problematic health answers about half the time, with nearly one in five answers rated highly problematic and Grok the worst performer.
  • Researchers stress-tested ChatGPT, Gemini, Grok, Meta AI, and DeepSeek with 250 questions across cancer, vaccines, stem cells, nutrition, and athletic performance using open-ended prompts that mirror real user wording.
  • Several cancer responses created “false balance” by naming unproven alternatives alongside chemotherapy, a framing clinicians warned could steer patients away from effective treatment.
  • When asked for sources, the bots returned shaky citation lists: only about 40% of requested references were complete, with frequent errors such as wrong authors, dead links, or fabricated papers.
  • A Merck Manuals survey, reported Thursday, found nine in ten people who use AI for health say they re-check the information, even as polls show many turn to chatbots for quick advice instead of waiting to see a doctor.