Overview
- Oumi, analyzing 4,326 queries for The New York Times using the SimpleQA benchmark, found Google’s AI Overviews were 85% accurate with Gemini 2 and 91% with Gemini 3.
- Extrapolating the implied error rates (9–15%) to roughly five trillion searches per year suggests tens of millions of wrong answers every hour.
- Google called the methodology flawed, pointing to issues in the OpenAI-built benchmark and noting that its ranking and safety systems screen out low-quality sources.
- Separate reporting cites a Google internal test that found Gemini 3 outputs were incorrect 28% of the time, though Google says AI Overviews perform better than the raw model.
- Oumi also reported weak grounding in the citations: 37% of Gemini 2 responses and 56% of Gemini 3 responses contained unsubstantiated links, with frequent references to Facebook and Reddit.