Overview
- MIT CSAIL’s 2025 AI Agent Index examined 30 prominent systems across chat, browser, and enterprise categories after a year of surging interest and deployments.
- Thirteen of the 30 agents operate at frontier autonomy levels, with browser-based agents showing especially high independence in executing multi-step tasks.
- Twenty-one of the 30 agents provide no disclosure that they are automated; many mimic human traffic with Chrome-like user-agent (UA) strings and residential IP addresses, and only seven publish stable UA strings or IP ranges.
- Safety transparency is limited: half cite only general safety frameworks, roughly a third cite none, twenty-three report no third-party testing, and just four publish agent-specific system cards (ChatGPT Agent, OpenAI Codex, Claude Code, Gemini 2.5).
- Most agents are wrappers around models from a few major providers; documentation of robots.txt and CAPTCHA handling is often missing; and recent industry moves toward standards have not closed these transparency gaps.
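The disclosure practices the Index found largely absent are straightforward to implement. A minimal sketch of two of them, using Python's standard-library `urllib.robotparser` (the agent name, version, and URL below are hypothetical examples, not drawn from any indexed system):

```python
from urllib import robotparser

# A stable, self-identifying UA string an agent operator could publish,
# instead of mimicking a Chrome browser. Name and URL are hypothetical.
AGENT_UA = "ExampleAgent/1.0 (+https://example.com/agent-info)"

# A site's robots.txt might single out the agent by its published name.
ROBOTS_TXT = """\
User-agent: ExampleAgent
Disallow: /private/
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Check permission before fetching, rather than ignoring robots.txt.
print(rp.can_fetch(AGENT_UA, "https://example.com/public/page"))   # True
print(rp.can_fetch(AGENT_UA, "https://example.com/private/data"))  # False
```

Publishing the UA string (and, ideally, an IP range) lets site operators identify, rate-limit, or block the agent's traffic; checking `can_fetch` before each request respects the site's stated crawling policy.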