Particle.news

Microsoft Lets RTX 30‑Series GPUs Run Local Windows Language Models

Developers can now use an experimental GPU path to run Microsoft’s small on‑device model locally, which could broaden access to on‑device AI.

Overview

  • Microsoft updated the Windows App SDK documentation on June 10–11 to add an experimental “Language Model APIs on GPU” option that lets the APIs run on non‑Copilot+ Windows 11 PCs with NVIDIA GeForce RTX 30‑series or newer GPUs that have at least 6 GB of VRAM.
  • The on‑device model used by these APIs is called Phi Silica, and apps can download it via Windows Update, so the model runs locally on the machine rather than in the cloud.
  • This GPU route is exposed at the developer/API layer, so apps must be built or updated to call the Language Model APIs, and users may need Windows Insider or developer settings to test the experimental SDK.
  • Key consumer Copilot+ features such as Windows Recall and Click to Do remain tied to NPU‑equipped Copilot+ machines and are not yet available to GPU‑only PCs.
  • The change widens the pool of PCs that can run local text and prompt features. If Microsoft expands support beyond this experimental stage, that could speed developer adoption of local AI, shift OEM positioning, and increase the use of on‑device models for privacy and responsiveness.