Particle News: Apple Reportedly Will Fold Google’s Gemini Into a Hybrid Siri

Overview

Multiple reports say Apple plans a Gemini-infused Siri that runs smaller models on iPhones and uses larger Gemini instances in the cloud for harder requests.
Apple is reportedly distilling Google’s large Gemini models into much smaller, quantized variants so they can run within iPhone memory and Neural Engine limits.
For the heaviest inference, Apple would route queries to Google Cloud servers running Nvidia hardware and use confidential computing to protect data during processing.
Apple is said to be exploring acquisitions, including talks with Liquid AI, to speed development of on-device model techniques and close performance gaps.
The change marks a shift from Apple’s prior local-only privacy pitch and could affect Siri’s speed and accuracy, as well as how Apple explains its data handling at WWDC 2026.
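
As a rough illustration of the hybrid design described in these reports, the Python sketch below estimates how quantization shrinks a model's weight footprint and routes a request either to a small on-device model or to a larger cloud model. Nothing here reflects Apple's or Google's actual implementation: the function names, the 3-billion-parameter figure, and the routing thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes).

    Counts weights only; activations and the KV cache are ignored.
    """
    return num_params * bits_per_weight / 8 / 1e9


@dataclass
class Request:
    prompt: str
    needs_long_reasoning: bool = False  # hypothetical flag set by an upstream classifier


# Hypothetical cutoff: prompts longer than this are treated as "harder" requests.
MAX_ON_DEVICE_PROMPT_CHARS = 500


def route(request: Request) -> str:
    """Return "on_device" for simple requests and "cloud" for heavier ones."""
    if request.needs_long_reasoning:
        return "cloud"
    if len(request.prompt) > MAX_ON_DEVICE_PROMPT_CHARS:
        return "cloud"
    return "on_device"


if __name__ == "__main__":
    params = 3e9  # hypothetical on-device model size
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_footprint_gb(params, bits):.2f} GB")
    print(route(Request("Set a timer for ten minutes")))
    print(route(Request("Plan a two-week trip", needs_long_reasoning=True)))
```

Under these assumptions, a 3-billion-parameter model needs about 6 GB for weights at 16 bits and about 1.5 GB at 4 bits, which is the kind of reduction that would help a model fit within phone memory limits. A real router would likely rely on richer signals than prompt length, but the split between a cheap local path and a heavier cloud path is the same idea.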