Particle.news

Google DeepMind Releases Gemma 4, Open AI Models Built for Phones to Workstations

An Apache 2.0 license enables unrestricted commercial use with local deployment options.

Overview

  • Gemma 4, which Google announced Thursday, arrives in four sizes—E2B, E4B, a 26B Mixture-of-Experts, and a 31B dense model—designed to run on everything from smartphones to high‑end GPUs.
  • Google is publishing model weights under Apache 2.0, allowing developers to use, modify, and redistribute the models for commercial projects without extra terms beyond attribution.
  • The E2B and E4B variants run fully offline with near‑zero latency on devices such as Android phones, Raspberry Pi, and NVIDIA Jetson Orin Nano, following work with the Pixel team, Qualcomm, and MediaTek.
  • Capabilities include multi‑step reasoning, function calling with structured JSON output for agent tasks, offline code generation, native image and video input (with audio input on the edge models), support for over 140 languages, and context windows of up to 256K tokens.
  • Google cites Arena AI results placing the 31B model at No. 3 and the 26B at No. 6 among open models, claiming wins over much larger systems. The company is distributing access through Google AI Studio and AI Edge Gallery, with weights on Hugging Face, Kaggle, and Ollama, after reporting 400M Gemma downloads and 100,000 community variants.
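The function-calling capability described above pairs a tool schema with structured JSON output from the model. A minimal sketch of the application side, assuming the model has been prompted to emit a JSON object with `name` and `arguments` fields (the tool name and response format here are illustrative, not part of any official Gemma API):

```python
import json

# Hypothetical tool; the name and signature are assumptions for illustration.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse a structured JSON function call and invoke the matching tool."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]          # look up the requested tool
    return fn(**call["arguments"])    # call it with the model's arguments

# A model emitting structured output might return a string like this:
raw = '{"name": "get_weather", "arguments": {"city": "Oslo"}}'
print(dispatch(raw))  # Sunny in Oslo
```

In a real agent loop, the tool's return value would be appended to the conversation and sent back to the model for the next reasoning step.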