Particle.news

OpenAI Launches GPT-Realtime-2 With Touted 'GPT-5-Class' Voice Reasoning, Plus New Translate and Whisper Models

The update focuses on smarter, longer voice interactions for developer-built agents.

Overview

  • OpenAI introduced three speech models for real-time apps: GPT-Realtime-2, Realtime-Translate, and Realtime-Whisper. The company describes Realtime-2 as having “GPT-5-class reasoning.”
  • Realtime-2 expands the context window to 128,000 tokens and posts an 11% performance gain over version 1.5 to support longer, more complex conversations.
  • New voice-agent controls include short preambles like “let me check that,” parallel tool calls during a chat, and selectable reasoning effort from minimal to xhigh.
  • Realtime-2 pricing is unchanged at $32 per 1 million audio input tokens and $64 per 1 million output tokens. Translate costs $0.034 per minute and Whisper costs $0.017 per minute.
  • Microsoft said these models are rolling out in Foundry to power live translation, low-latency transcription, and voice assistants that reason through multi-step tasks.
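
To make the pricing figures above concrete, here is a minimal cost-estimate sketch in Python. The rates come from the pricing bullet. The function names and the example token and minute counts are hypothetical, chosen only for illustration. The summary does not say how audio duration maps to tokens, so token counts are taken as given inputs.

```python
# Rates quoted in the pricing bullet (USD).
REALTIME2_INPUT_PER_MILLION = 32.0    # per 1 million audio input tokens
REALTIME2_OUTPUT_PER_MILLION = 64.0   # per 1 million output tokens
TRANSLATE_PER_MINUTE = 0.034
WHISPER_PER_MINUTE = 0.017


def realtime2_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate Realtime-2 cost in USD from token counts (hypothetical helper)."""
    return (
        input_tokens * REALTIME2_INPUT_PER_MILLION
        + output_tokens * REALTIME2_OUTPUT_PER_MILLION
    ) / 1_000_000


def per_minute_cost(minutes: float, rate_per_minute: float) -> float:
    """Estimate cost in USD for a model billed per minute (hypothetical helper)."""
    return minutes * rate_per_minute


if __name__ == "__main__":
    # Illustrative usage numbers only; not taken from the article.
    print(f"Realtime-2: ${realtime2_cost(1_000_000, 500_000):.2f}")
    print(f"Translate, 60 min: ${per_minute_cost(60, TRANSLATE_PER_MINUTE):.2f}")
    print(f"Whisper, 60 min: ${per_minute_cost(60, WHISPER_PER_MINUTE):.2f}")
```

At these rates, an hour of audio would come to about $2.04 with Translate and about $1.02 with Whisper, while 1 million input tokens plus 500,000 output tokens on Realtime-2 would come to $64.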