Overview
- Anthropic said Tuesday it is investigating why Claude Code sessions are burning through token quotas far faster than usual.
- Developers report simple prompts consuming large chunks of a session, including claims that a greeting can cost about 2% and that some Max plans exhaust in under an hour.
- Recent policy shifts reduced session allowances during peak hours for a small share of users, and a short promotion that temporarily doubled off‑peak limits has ended, complicating what people perceive as a change.
- A redditor who reverse‑engineered the app alleges two bugs break prompt caching and inflate token use by 10–20x, which Anthropic staff say they are reviewing but have not confirmed; some users say rolling back to an older version helps.
- Prompt caching matters because missing the five‑minute cache forces full reprocessing, while a one‑hour cache write costs 2x input tokens and cache reads cost 0.1x, so short breaks or silent retries in automated workflows can drain quotas and push users to try rivals.