xAI has announced the Grok Voice Agent API, opening programmatic access to Grok’s real-time voice capabilities for developers building voice-first applications. The release targets teams working on conversational agents, assistants, and companion-style products that require low-latency speech input and output, with configuration managed through the xAI console.
In a post on X dated December 17, 2025, xAI (@xai) wrote: "The Grok Voice Agent API leads the industry in cost-efficiency. Developers are billed at a simple flat rate of $0.05 per minute."
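To put the quoted flat rate in concrete terms, here is a rough cost estimate in Python. It assumes billing scales linearly with connected time; the announcement does not say how partial minutes are rounded, so treat the function (a hypothetical helper, not part of any xAI SDK) as a back-of-the-envelope sketch.

```python
from decimal import Decimal

# Flat rate quoted in xAI's announcement: $0.05 per minute.
RATE_PER_MINUTE = Decimal("0.05")


def estimate_cost(total_seconds: int) -> Decimal:
    """Estimate spend in USD for a given amount of voice time.

    Assumption: cost is proportional to duration. How xAI actually
    rounds partial minutes is not stated in the announcement.
    """
    minutes = Decimal(total_seconds) / Decimal(60)
    return (minutes * RATE_PER_MINUTE).quantize(Decimal("0.01"))


if __name__ == "__main__":
    # A 10-minute support call and one hour of conversation.
    print(estimate_cost(600))
    print(estimate_cost(3600))
```

Under that assumption, a 10-minute call comes to $0.50 and an hour of conversation to $3.00.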
The API exposes multiple Grok voices already familiar from Grok voice mode, including Sal, Rex, Eve, and Leo, alongside companion personas such as Mika and Valentin. Developers can control voice selection, system instructions, and behavioral parameters, while also toggling search capabilities that let Grok query the public web or X data during conversations. This positions the API for use cases ranging from customer support and social companions to research assistants that speak and listen in real time.
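The configurable pieces described above (voice, system instructions, search toggles) could be modeled on the client side roughly as follows. This is an illustrative Python sketch only: the class name, field names, and JSON shape are hypothetical and are not taken from xAI's documentation. The voice names are the ones listed in this article.

```python
import json
from dataclasses import asdict, dataclass

# Voice and persona names as listed in the article. The field names below are
# hypothetical and do not reflect xAI's actual request schema.
KNOWN_VOICES = {"Sal", "Rex", "Eve", "Leo", "Mika", "Valentin"}


@dataclass
class VoiceAgentConfig:
    voice: str                # which voice or persona speaks
    instructions: str         # system instructions for the agent
    web_search: bool = False  # allow queries against the public web
    x_search: bool = False    # allow queries against X data

    def __post_init__(self) -> None:
        if self.voice not in KNOWN_VOICES:
            raise ValueError(f"unknown voice: {self.voice!r}")

    def to_json(self) -> str:
        """Serialize the settings, e.g. for a session setup message."""
        return json.dumps(asdict(self), sort_keys=True)


if __name__ == "__main__":
    config = VoiceAgentConfig(
        voice="Eve",
        instructions="You are a concise customer support agent.",
        web_search=True,
    )
    print(config.to_json())
```

Validating the voice name locally is a design choice for the sketch; a real integration would defer to whatever the xAI console and API accept.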

Under the hood, the Grok Voice Agent API is designed around streaming audio, enabling near real-time speech recognition and synthesis rather than batch transcription or playback. The console interface suggests tight coupling with other Grok services, and early indicators point to future expansion, including broader file handling and media generation endpoints that could unify voice, text, and multimodal workflows under a single API surface.
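The difference between streaming and batch handling can be illustrated with a small Python sketch that slices raw audio into short frames, each of which could be sent as soon as it is captured instead of waiting for the whole recording. The audio format (16 kHz, 16-bit mono PCM) and the 20 ms frame size are assumptions chosen for illustration; the article does not specify what the Grok Voice Agent API expects, and the function name is hypothetical.

```python
from typing import Iterator


def iter_audio_frames(
    pcm: bytes,
    sample_rate: int = 16000,   # assumed: 16 kHz sampling
    bytes_per_sample: int = 2,  # assumed: 16-bit mono PCM
    frame_ms: int = 20,         # assumed: 20 ms per frame
) -> Iterator[bytes]:
    """Yield fixed-duration slices of raw PCM audio, in order.

    In a streaming client each frame would be transmitted as it is produced;
    a batch client would instead upload the full buffer once at the end.
    """
    frame_bytes = sample_rate * bytes_per_sample * frame_ms // 1000
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


if __name__ == "__main__":
    one_second_of_silence = bytes(16000 * 2)
    frames = list(iter_audio_frames(one_second_of_silence))
    print(len(frames), len(frames[0]))
```

With these assumed settings, one second of audio becomes 50 frames of 640 bytes, so the first audio can reach the server after about 20 ms of capture rather than after the speaker finishes.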
For xAI, the launch marks a step toward making Grok a developer platform rather than only a consumer feature inside X. By productizing voices and companions through an API, the company is positioning Grok as a competitor to established voice AI stacks, while leveraging its distinctive data sources and persona-driven approach to differentiate in a crowded voice agent market.