Particle.news

Microsoft Releases Phi-4-Reasoning-Vision-15B, a 15B-Parameter Model With Controllable Visual Reasoning

Developers can toggle thinking modes to balance latency with multi-step reasoning depth.

Overview

  • Microsoft made the model available today on Microsoft Foundry and Hugging Face, alongside sample notebooks, inference code, a technical paper, and a model card.
  • It fuses high-resolution visual perception with selective, task-aware reasoning to interpret images, diagrams, documents, and UI screens.
  • Highlighted applications include computer-use agents for GUI automation, diagram-based math and scientific reasoning, and document, chart, and table understanding.
  • Reasoning behavior can be set to hybrid, think, or nothink via prompt tokens, letting teams trade speed for deeper analysis at inference time.
  • Microsoft published internal benchmark comparisons across multimodal, math, OCR, and computer-use tasks and outlined safety training signals and deployment guidance aligned to its Responsible AI principles.
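
The three reasoning modes named above (hybrid, think, nothink) are selected through the prompt. A minimal Python sketch of that idea follows. The marker strings and the `build_prompt` helper are hypothetical placeholders invented for illustration, not the model's real control tokens, and treating hybrid as "no marker" is also an assumption. The actual token syntax is documented in Microsoft's model card.

```python
from typing import Literal

Mode = Literal["hybrid", "think", "nothink"]

# Hypothetical placeholder markers, NOT the model's real control tokens.
# Assumption: hybrid mode needs no marker and lets the model decide.
MODE_MARKERS = {
    "hybrid": "",
    "think": "<think-placeholder>",
    "nothink": "<nothink-placeholder>",
}


def build_prompt(question: str, mode: Mode = "hybrid") -> str:
    """Prefix the question with the marker for the requested reasoning mode."""
    if mode not in MODE_MARKERS:
        raise ValueError(f"unknown mode: {mode!r}")
    return MODE_MARKERS[mode] + question


if __name__ == "__main__":
    # A latency-sensitive UI task might skip reasoning; a math diagram might not.
    print(build_prompt("Which button submits the form?", mode="nothink"))
    print(build_prompt("Find the area of the shaded region.", mode="think"))
```

In a setup like this, an application could choose the mode per request, for example nothink for quick GUI actions and think for multi-step math, which is the speed-versus-depth trade-off the summary describes.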