Hume AI tests Octave 2 Multilingual text-to-speech model

Hume AI is internally testing Octave 2 Multilingual, a text-to-speech model offering low-latency voice synthesis across 10+ languages for real-time audio.


Hume AI is preparing to launch Octave 2 Multilingual, the next step in its text-to-speech lineup after the release of the original Octave model. Octave 2 supports more than 10 languages, expanding far beyond the first model’s focus on emotionally expressive English voices. It is described as producing expressive, natural voices with low latency, making it suitable for use cases that require fast, real-time voice generation, such as live translation, voicebots, and conversational interfaces.
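
For developers weighing integration, the sketch below shows what a multilingual request against Hume's text-to-speech API could look like. It is a rough sketch only: the endpoint URL, header, request fields, and response shape are assumptions drawn from the currently available Octave API, and any Octave 2-specific parameters remain undocumented.

```python
import base64
import os

import requests

# Minimal sketch, assuming Hume's current REST-style TTS endpoint and request
# shape; the URL, header, field names, and response format are assumptions
# based on the existing Octave API, and Octave 2-specific options are not yet
# documented.
API_KEY = os.environ["HUME_API_KEY"]  # assumed to be set in the environment

payload = {
    # Hypothetical body: one English and one Russian utterance, illustrating
    # the kind of language switching described in the article.
    "utterances": [
        {
            "text": "Welcome to the live translation demo.",
            "description": "calm, neutral announcer",
        },
        {
            "text": "Добро пожаловать на демонстрацию живого перевода.",
            "description": "calm, neutral announcer",
        },
    ],
}

response = requests.post(
    "https://api.hume.ai/v0/tts",  # assumed endpoint; may change for Octave 2
    headers={"X-Hume-Api-Key": API_KEY},
    json=payload,
    timeout=30,
)
response.raise_for_status()

# Assumed response shape: a list of generations with base64-encoded audio.
for i, generation in enumerate(response.json().get("generations", [])):
    with open(f"utterance_{i}.wav", "wb") as out:
        out.write(base64.b64decode(generation["audio"]))
```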

Example - "Robot & Russian hacker dialogue"

audio-thumbnail
2025Sep29 dc995e9e 5379 48d7 a62a 81593300395e535155 0
0:00
/30.439875

The new model is positioned to benefit a wide range of users, from developers building multilingual apps and real-time translation tools to creators producing podcasts or audiobooks in multiple languages. One key advance is the ability to switch between languages while delivering speech that sounds convincingly human, even in languages with phonetics far removed from English, such as Russian. In early side-by-side comparisons, Octave 2 reportedly produces noticeably more natural audio than its predecessor, to the point of being hard to distinguish from a human speaker.

Comparison: Octave 2 (audio sample, ~15 seconds) vs. Octave 1 (audio sample, ~17 seconds)

The Octave 2 Multilingual model is not yet publicly available but has surfaced in early internal and hidden tests, suggesting a public release is imminent. This fits into Hume AI’s broader product direction, where the company focuses on emotion-rich and context-aware AI voices. If Octave 2’s fast response times and language flexibility hold up at scale, it could quickly attract both commercial and research interest, especially as demand rises for tools that handle real-time, multilingual audio. The discovery of its new features came from testing and observing differences in generated outputs, without public documentation or announcements yet from Hume AI.

As the rollout nears, developers and early adopters should watch closely for further updates and public demos.