Particle.news

Thinking Machines Unveils Real‑Time Interaction Model That Listens While It Speaks

The research preview challenges voice systems that layer latency tricks on turn‑based chatbots.

Overview

  • Thinking Machines announced a research preview of “interaction models” that keep listening, seeing, and speaking in one continuous flow across audio, video, and text.
  • The design uses 200‑millisecond micro‑turns for rapid back‑and‑forth and hands slower planning and tool use to a separate background model.
  • The lab introduced TML‑Interaction‑Small, a mixture‑of‑experts system with 276 billion total parameters, of which 12 billion are active per step. It reported large gains on new timing and temporal tests versus OpenAI’s GPT Realtime‑2 minimal.
  • Claims include 64.7% accuracy on a time‑aware speech test called TimeSpeak and 35.4% on temporal action counting, though the article does not provide independent verification.
  • Serving changes include streaming 200‑millisecond chunks via SGLang and a training‑to‑inference “bitwise” match for deterministic outputs, with the lab saying longer sessions and scale remain open work for 2026.
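
To make two of the figures above concrete, here is a minimal Python sketch. It is not the lab's code: the function names, the 16 kHz sample rate, and the use of a plain list of samples are assumptions made for illustration. It shows how a stream could be cut into 200-millisecond chunks, and what share of a 276-billion-parameter mixture-of-experts model 12 billion active parameters represents.

```python
from typing import Iterator, Sequence

CHUNK_MS = 200  # micro-turn length reported in the overview


def micro_turns(
    samples: Sequence[int],
    sample_rate_hz: int,
    chunk_ms: int = CHUNK_MS,
) -> Iterator[Sequence[int]]:
    """Yield consecutive chunk_ms-long slices of `samples` (hypothetical helper).

    The final slice may be shorter if the stream does not divide evenly.
    """
    chunk_len = sample_rate_hz * chunk_ms // 1000
    if chunk_len <= 0:
        raise ValueError("sample rate and chunk length must give at least one sample")
    for start in range(0, len(samples), chunk_len):
        yield samples[start:start + chunk_len]


def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of parameters used per step in a mixture-of-experts model."""
    return active_params_b / total_params_b


if __name__ == "__main__":
    # Assumed input: one second of silent audio at 16 kHz.
    one_second = [0] * 16_000
    chunks = list(micro_turns(one_second, sample_rate_hz=16_000))
    print(len(chunks), "chunks of", len(chunks[0]), "samples")
    print(f"active share: {active_fraction(276, 12):.1%}")
```

Under these assumptions, one second of audio becomes five chunks of 3,200 samples each, and roughly 4.3% of the model's parameters are active on any given step.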