Thinking Machines announces new interaction models

What's new? Thinking Machines previewed an AI system for real-time, native interaction across audio, video, and text.


Thinking Machines has introduced a research preview of its interaction models, a new AI system built to enable real-time, native collaboration across audio, video, and text without relying on external scaffolding. Unlike conventional turn-based models, this system continuously processes and responds to multiple input modalities, supporting features such as simultaneous speech, time-awareness, and real-time dialog management.

The architecture uses a multi-stream, micro-turn design that processes inputs and generates outputs in 200 ms increments, maintaining a continuous two-way exchange with the user. It pairs a time-aware interaction model with an asynchronous background model that handles extended reasoning and tool use, folding background results back into the conversation as they complete.
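Thinking Machines has not published implementation details, but the micro-turn idea can be illustrated with a minimal asyncio sketch: a loop that emits output on a fixed 200 ms cadence while slow work runs off the hot path and is merged into a later turn. Everything here is an assumption for illustration; the names (`interaction_loop`, `background_reasoner`, the `think:` trigger) are hypothetical and do not reflect the actual system.

```python
import asyncio

MICRO_TURN_S = 0.2  # 200 ms micro-turn cadence described in the announcement


async def background_reasoner(task: str) -> str:
    """Hypothetical long-running reasoning or tool-use job run off the hot path."""
    await asyncio.sleep(1.5)  # stand-in for extended reasoning or a tool call
    return f"[background result for: {task}]"


async def interaction_loop(incoming: asyncio.Queue, outgoing: asyncio.Queue) -> None:
    """Emit a response every micro-turn; merge background results when ready."""
    pending: set[asyncio.Task] = set()
    while True:
        # Drain whatever frames (audio/video/text stand-ins) arrived this turn.
        frames = []
        try:
            while True:
                frames.append(incoming.get_nowait())
        except asyncio.QueueEmpty:
            pass

        # Hand heavy requests to the background model (hypothetical trigger).
        for frame in frames:
            if isinstance(frame, str) and frame.startswith("think:"):
                pending.add(asyncio.create_task(background_reasoner(frame[6:])))

        # Fold in any background results that finished since the last turn.
        done = {t for t in pending if t.done()}
        pending -= done
        for task in done:
            await outgoing.put(task.result())

        # Emit this micro-turn's response (here: a trivial acknowledgement).
        await outgoing.put(f"ack {len(frames)} frame(s)" if frames else "<listening>")
        await asyncio.sleep(MICRO_TURN_S)


async def main() -> None:
    incoming: asyncio.Queue = asyncio.Queue()
    outgoing: asyncio.Queue = asyncio.Queue()
    loop_task = asyncio.create_task(interaction_loop(incoming, outgoing))

    await incoming.put("hello")
    await incoming.put("think:plan a trip")
    for _ in range(12):  # observe roughly 2.4 s of micro-turns
        print(await outgoing.get())

    loop_task.cancel()


if __name__ == "__main__":
    asyncio.run(main())
```

The point of the sketch is the cadence: the real-time stream never blocks on slow reasoning, and the background result simply appears in a later micro-turn, which is the behavior the announcement attributes to the two-model design.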

The preview is currently available to a limited group of researchers, with broader access planned for later this year. The primary audience includes AI researchers, developers, and organizations exploring advanced human-AI collaboration. Early benchmarks show the TML-Interaction-Small model outperforming existing models in both intelligence and interaction quality, with low latency and robust multimodal responsiveness.

Industry observers have noted that this approach sets a new expectation for naturalistic, real-time AI collaboration. Thinking Machines, the company behind the release, is focused on AI models that prioritize collaborative processes, address the limitations of turn-based interfaces, and keep humans in the loop during complex tasks.

Source