Particle News: Aletheia Debuts as Gemini-Based AI Math Research Agent With Claimed Research Results

Overview

Aletheia iteratively generates, verifies, and revises natural‑language proofs, drawing on an advanced Gemini Deep Think variant, a novel inference‑time scaling law, and integrated tools.
The authors cite three milestones: an AI‑only paper on eigenweights (Feng26), a human–AI collaboration on bounds for independent sets (LeeSeo26), and a semi‑autonomous run on 700 Erdős problems with four autonomous solutions.
Secondary reports state Aletheia scored 91.9% on the IMO‑Proofbench Advanced benchmark, surpassing the standalone advanced Gemini Deep Think while using less compute.
The announcement builds on prior Gemini results reaching International Mathematical Olympiad gold‑medal competence, shifting focus from contest problems to longer‑horizon research proofs.
The paper proposes standards to label autonomy and novelty in AI‑assisted mathematics, and researchers highlight recurring risks including confirmation bias, technical hallucinations, and alignment‑related friction.
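
For readers who want a concrete picture of the generate, verify, and revise cycle mentioned in the overview, the following is a minimal illustrative sketch in Python. It is not Aletheia's implementation: the `solve` function, the `Verdict` type, the `max_rounds` budget, and the toy stand-in functions are all hypothetical names introduced here, and a real system would call a language model and external tools where the stubs sit.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Outcome of one verification pass (hypothetical structure)."""
    accepted: bool
    feedback: str = ""


def solve(
    problem: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], Verdict],
    revise: Callable[[str, str, str], str],
    max_rounds: int = 5,
) -> Optional[str]:
    """Return the first proof the verifier accepts, or None if the budget runs out."""
    proof = generate(problem)
    for round_idx in range(max_rounds):
        verdict = verify(problem, proof)
        if verdict.accepted:
            return proof
        # Only revise if another verification round remains to check the result.
        if round_idx + 1 < max_rounds:
            proof = revise(problem, proof, verdict.feedback)
    return None


# Toy stand-ins so the loop can be run without any model (assumptions, not real APIs).
def toy_generate(problem: str) -> str:
    return f"Claim: {problem}. Proof: omitted."


def toy_verify(problem: str, proof: str) -> Verdict:
    if "omitted" in proof:
        return Verdict(False, "proof body is missing")
    return Verdict(True)


def toy_revise(problem: str, proof: str, feedback: str) -> str:
    return proof.replace("omitted", "by induction on n")


if __name__ == "__main__":
    print(solve("P(n) holds", toy_generate, toy_verify, toy_revise))
```

In this sketch the verifier's feedback is passed back into the revision step, and the loop gives up after a fixed number of rounds rather than returning an unverified proof. Both are design choices made for the illustration, not details reported about Aletheia.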