Particle News: Audit Finds Health Chatbots Frequently Mislead on Medical Advice

Overview

The BMJ Open audit, published Monday, found that 49.6% of 250 chatbot answers to health questions were problematic: about 30% of all answers were somewhat problematic and about 20% were highly problematic.
When asked for options better than chemotherapy, the chatbots often gave brief warnings and then listed unproven therapies such as herbal medicine, acupuncture and so‑called cancer diets, a pattern the authors called false balance.
Requests for scientific sources yielded unreliable citations, with a median reference completeness of about 40% and no model producing a fully accurate reference list.
Performance differed by model and topic, with Grok scoring worst overall and stem cell, nutrition and athletic performance questions drawing the most errors. The only refusals among the 250 responses were two treatment queries declined by Meta AI.
Polling shows about one in four U.S. adults now use AI for health advice. The study used adversarial prompts, which can overstate error rates, but clinicians and researchers warn that the results call for stronger guardrails and clinician oversight.