Overview
- A BMJ Open study by the Lundquist Institute rated 49.6% of 250 chatbot responses to health prompts as problematic: 30% were somewhat problematic and 19.6% highly problematic.
- When asked about cancer care, some bots listed unproven alternatives to chemotherapy and even named clinics, presenting both sides in a way that can mislead patients.
- The audit found frequent fabricated citations (only about 40% reference completeness), a near-universal willingness to answer (just two refusals), and the worst scores for xAI’s Grok.
- Microsoft’s analysis of more than 500,000 Copilot chats showed that many conversations concerned personal symptoms and emotional health, with personal queries rising at night and on mobile devices.
- Recent polls report that roughly one-quarter to one-third of U.S. adults use AI for health advice, including people facing cost or access barriers, prompting researchers to call for guardrails and user education.