Particle.news

Apple Rolls Out 20B On‑Device AI and Private Cloud GPU Partnership

Apple’s sparse 20‑billion‑parameter on‑device model runs from NAND flash on high‑end Apple silicon, while the top cloud tier uses NVIDIA GPUs in Google Cloud under confidential compute.

Overview

  • Apple unveiled a five‑model AFM 3 family that includes two on‑device models and three cloud models, and company leaders clarified the technical details after the WWDC keynote.
  • AFM 3 Core Advanced is a natively multimodal, 20‑billion‑parameter on‑device model that keeps its full weights in NAND flash and, using Instruction‑Following Pruning, activates only 1–4 billion parameters per request to fit within DRAM limits.
  • Apple said AFM Cloud Pro will run on NVIDIA GPUs hosted in Google Cloud under confidential‑compute protections, backed by an auditable hardware ledger that lets third parties verify that cloud providers cannot read Apple’s servers.
  • Apple executives said the AFMs were custom‑built for Apple silicon and trained on proprietary data, and that Google’s Gemini outputs were used for refinement and distillation; the deployed systems do not run Gemini models or Google client code.
  • The rollout ties advanced local AI to high‑end chips: AFM 3 Core Advanced requires top‑tier silicon and more memory, creating a two‑tier user experience that could push buyers toward newer Pro devices and change how Apple balances privacy, latency, and feature access.
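
To make the NAND-versus-DRAM trade-off in the on-device bullet concrete, here is a rough back-of-the-envelope sketch. The 20-billion total and 1–4 billion active parameter counts come from the summary above; the 4-bit weight size is purely an assumption for illustration, since the summary does not state how the weights are quantized, and the function name is hypothetical.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# ASSUMPTION: 4 bits per weight. The article gives no quantization figure.
ASSUMED_BITS_PER_WEIGHT = 4

# Parameter counts taken from the summary above.
full_model_gb = weight_footprint_gb(20, ASSUMED_BITS_PER_WEIGHT)   # held in NAND
active_low_gb = weight_footprint_gb(1, ASSUMED_BITS_PER_WEIGHT)    # smallest active set
active_high_gb = weight_footprint_gb(4, ASSUMED_BITS_PER_WEIGHT)   # largest active set

# Share of the full model that is active for a single request.
active_fraction_low = 1 / 20
active_fraction_high = 4 / 20

print(f"Full weights in NAND: ~{full_model_gb:.1f} GB")
print(f"Active weights in DRAM: ~{active_low_gb:.1f} to {active_high_gb:.1f} GB")
print(f"Active share per request: {active_fraction_low:.0%} to {active_fraction_high:.0%}")
```

Under that assumed bit width, only 5% to 20% of the weights would need to sit in DRAM for any one request. Actual figures depend on the real quantization scheme and on activation and cache memory, neither of which the summary describes.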