Particle.news

Google Narrows Gemini Quota Rules After Users Report Rapid Drains

Google will retain its compute-based quota system while introducing per-prompt caps, free Flash‑Lite use, and clearer usage reporting.

Overview

  • Google moved Gemini to a compute-based quota system at I/O in mid-May and on Friday announced targeted fixes after subscribers reported that heavy requests could exhaust multi-hour allowances.
  • The compute-based system charges quota by processing cost, so simple text prompts use far less quota than video generation, complex coding, or Deep Research tasks.
  • Google capped how much quota a single prompt can consume on Gemini 3.1 Pro, so that one task cannot wipe out an entire allowance, and said failed requests will not count against user quotas.
  • The company made 3.1 Flash‑Lite prompts free, patched a bug that let one or two Omni video generations drain accounts, and doubled Omni generation allotments for AI Ultra subscribers.
  • Google pledged clearer usage breakdowns, notifications for heavy tasks, and future pay-as-you-go top-up credits so users can see what uses quota and buy more compute when needed.
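
The accounting rules described above (cost-based charging, a per-prompt cap, and no charge for failed requests) can be illustrated with a minimal sketch. This is not Google's implementation: the class name `QuotaAccount`, the unit of "compute units", and every number below are hypothetical and chosen only to make the rules concrete.

```python
from dataclasses import dataclass


@dataclass
class QuotaAccount:
    """Toy model of a compute-based quota (all names and values are hypothetical)."""

    allowance: float        # compute units available in the current window
    per_prompt_cap: float   # most that a single prompt may deduct
    used: float = 0.0

    @property
    def remaining(self) -> float:
        return self.allowance - self.used

    def charge(self, cost: float, succeeded: bool = True) -> float:
        """Deduct quota for one request and return the amount actually charged."""
        if not succeeded:
            # Failed requests do not count against the quota.
            return 0.0
        # Charge by processing cost, but never more than the per-prompt cap.
        # Clamping to the remaining balance is a simplification of this sketch.
        charged = min(cost, self.per_prompt_cap, self.remaining)
        self.used += charged
        return charged


account = QuotaAccount(allowance=100.0, per_prompt_cap=25.0)
account.charge(1.0)                      # simple text prompt: small deduction
account.charge(400.0)                    # heavy task: deduction limited by the cap
account.charge(400.0, succeeded=False)   # failed request: nothing deducted
account.charge(0.0)                      # a free model is modeled here as zero cost
print(account.remaining)
```

Without the cap, the second request in this sketch would have consumed the whole allowance in one step, which is the situation subscribers reported.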