Particle.news

Microsoft Releases Phi‑4‑Reasoning‑Vision‑15B on Foundry and Hugging Face

The 15B‑parameter model adds prompt‑controlled multi‑step reasoning to high‑resolution visual perception.

Overview

  • Developers can access the model now through Microsoft Foundry and Hugging Face with sample notebooks, API examples, a model card, and a technical paper.
  • The system offers three selectable thinking modes (hybrid, think, and nothink), letting teams trade deeper reasoning against lower latency at runtime.
  • Targeted uses include computer‑use agents that interpret UI screens and return actionable coordinates, as well as visual math and science problems and the understanding of documents, charts, and tables.
  • Microsoft published internal benchmark results spanning multimodal reasoning, mathematics, and GUI tasks, presented as comparative analysis rather than formal leaderboard claims.
  • Safety alignment draws on public safety datasets and internally generated refusal examples under Microsoft’s Responsible AI principles, with documented limitations and deployment guidance.
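
The trade-off between reasoning depth and latency described in the overview can be illustrated with a small sketch. Only the three mode names come from the announcement. The `choose_mode` helper, the 1000 ms threshold, and the reading of hybrid as a general-purpose default are hypothetical assumptions for illustration, and the sketch does not show how a mode is actually passed to the model; the model card is the place to check that.

```python
from enum import Enum


class ThinkingMode(str, Enum):
    # Mode names as reported in the announcement. How a mode is communicated
    # to the model (prompt text, template flag, etc.) is NOT shown here.
    HYBRID = "hybrid"
    THINK = "think"
    NOTHINK = "nothink"


# Assumed cutoff below which extended reasoning is considered too slow.
FAST_PATH_BUDGET_MS = 1000


def choose_mode(latency_budget_ms: int, needs_multistep: bool) -> ThinkingMode:
    """Hypothetical policy mapping a latency budget and task type to a mode."""
    if latency_budget_ms < FAST_PATH_BUDGET_MS:
        # Tight budget: skip extended reasoning regardless of task type.
        return ThinkingMode.NOTHINK
    if needs_multistep:
        # Enough time and a task that benefits from step-by-step reasoning.
        return ThinkingMode.THINK
    # Otherwise fall back to the assumed general-purpose default.
    return ThinkingMode.HYBRID
```

Under these assumptions, a UI-grounding call with a tight budget would resolve to `nothink`, while a visual math question with a generous budget would resolve to `think`.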