Particle News: Thinking Machines Unveils Real‑Time Interaction Model That Listens While It Speaks

Overview

Thinking Machines announced a research preview of “interaction models” that keep listening, seeing, and speaking in one continuous flow across audio, video, and text.
The design uses 200‑millisecond micro‑turns for rapid back‑and‑forth and hands slower planning and tool use to a separate background model.
The lab introduced TML‑Interaction‑Small, a mixture‑of‑experts system with 276 billion total parameters, of which 12 billion are active per step.
It reported large gains on new timing and temporal tests versus OpenAI’s GPT Realtime‑2 minimal.
Claims include 64.7% accuracy on a time‑aware speech test called TimeSpeak and 35.4% on temporal action counting, though the article does not provide independent verification.
Serving changes include streaming 200‑millisecond chunks via SGLang and a training‑to‑inference “bitwise” match for deterministic outputs.
The lab says longer sessions and scale remain open work for 2026.
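
To make the 200‑millisecond figure concrete, here is a minimal sketch of cutting an audio stream into micro‑turn‑sized chunks. It is an illustration only, not the lab's serving code and not SGLang's API. The 16 kHz sample rate and the names samples_per_chunk and micro_turns are assumptions made for this example; the article does not state them.

```python
from typing import Iterator, List

CHUNK_MS = 200            # micro-turn length reported in the article
SAMPLE_RATE_HZ = 16_000   # ASSUMPTION: sample rate is not given in the article


def samples_per_chunk(sample_rate_hz: int = SAMPLE_RATE_HZ,
                      chunk_ms: int = CHUNK_MS) -> int:
    """Number of audio samples that fit in one chunk of chunk_ms milliseconds."""
    return sample_rate_hz * chunk_ms // 1000


def micro_turns(samples: List[int],
                sample_rate_hz: int = SAMPLE_RATE_HZ,
                chunk_ms: int = CHUNK_MS) -> Iterator[List[int]]:
    """Yield consecutive chunks; the final chunk may be shorter than the rest."""
    size = samples_per_chunk(sample_rate_hz, chunk_ms)
    for start in range(0, len(samples), size):
        yield samples[start:start + size]


# One second of silent audio at the assumed rate, split into micro-turns.
one_second = [0] * SAMPLE_RATE_HZ
chunks = list(micro_turns(one_second))
print(len(chunks), len(chunks[0]))  # 5 chunks of 3200 samples each
```

Under these assumptions, a model working at this granularity would receive five input chunks per second, which is the cadence that lets it react while the speaker is still talking.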