xAI tests Arena Mode with Parallel Agents for Grok Build

xAI’s Grok Build, the company’s vibe coding solution first teased by TestingCatalog in early January, is shaping up to be far more ambitious than initially expected. While the local CLI agent was already known, new findings reveal that the remote version is progressing in parallel and will arrive with a suite of features that push Grok Build closer to a full-fledged IDE than to a simple coding assistant.

BREAKING 🚨: xAI is working on Parallel Agents mode and Arena Mode for the upcoming Grok Build.

With Parallel Agents, users will be able to spawn up to 8 coding agents in parallel, while in Arena mode, we will likely see a tournament-style evaluation. pic.twitter.com/324TDKn3Pm
— TestingCatalog News 🗞 (@testingcatalog) February 15, 2026

The most notable addition is Parallel Agents, a feature that lets users send a single prompt to multiple AI agents simultaneously. The interface exposes two models, Grok Code 1 Fast and Grok 4 Fast, and allows up to four agents per model, meaning users could run eight agents at once.

Once triggered, a dedicated coding session opens where all agent responses are visible side by side, alongside a context usage tracker. This multi-agent approach aligns directly with Elon Musk’s stated vision of Grok spawning “hundreds of specialized coding agents all working together.”
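To make the fan-out concrete, here is a minimal sketch of sending one prompt to two models with four agents each, for eight concurrent runs. Only the model names and the four-per-model cap come from the findings above; `run_agent`, `fan_out`, and the response format are hypothetical stand-ins and say nothing about how xAI actually implements the feature.

```python
import asyncio

# Model names and the per-model cap are taken from the article;
# every function and field below is a hypothetical illustration.
MODELS = ["Grok Code 1 Fast", "Grok 4 Fast"]
AGENTS_PER_MODEL = 4


async def run_agent(model: str, agent_id: int, prompt: str) -> dict:
    """Stand-in for a real agent call; it only echoes its inputs."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return {
        "model": model,
        "agent": agent_id,
        "response": f"[{model} #{agent_id}] {prompt}",
    }


async def fan_out(prompt: str) -> list:
    """Send one prompt to every agent concurrently and collect all responses."""
    tasks = [
        run_agent(model, agent_id, prompt)
        for model in MODELS
        for agent_id in range(1, AGENTS_PER_MODEL + 1)
    ]
    # gather returns results in the same order the tasks were listed
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(fan_out("Add a dark mode toggle"))
    print(len(results), "responses to show side by side")
```

In a side-by-side view like the one described, each returned entry would map to one pane in the session.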

Separately from parallel agents, there are traces of an arena mode buried in the code. Unlike the parallel view, which simply displays multiple outputs for the user to compare manually, this mode appears designed to have agents collaborate or compete to surface the best response, potentially scoring and ranking outputs automatically. This closely mirrors the tournament-style framework already present in Google’s Gemini Enterprise, where an Idea Generation agent ranks results through a structured competition process. If implemented, arena mode would mean xAI is not just letting users see multiple responses but actively building an evaluation layer on top of its multi-agent system.
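One way such an evaluation layer could work is a round-robin tournament, in which every pair of responses is compared once and responses are ranked by wins. The sketch below is purely illustrative: the format, the `prefers_first` judge, and its placeholder criterion (shorter response wins) are assumptions, since the traces found so far do not reveal how arena mode scores anything.

```python
from itertools import combinations


def prefers_first(a: str, b: str) -> bool:
    """Hypothetical judge. Placeholder criterion: the shorter response wins."""
    return len(a) <= len(b)


def rank_round_robin(responses: list) -> list:
    """Compare every pair once, count wins, and rank by wins (most first).

    Returns (response_index, wins) tuples. Ties keep their original order
    because Python's sort is stable.
    """
    wins = [0] * len(responses)
    for i, j in combinations(range(len(responses)), 2):
        winner = i if prefers_first(responses[i], responses[j]) else j
        wins[winner] += 1
    return sorted(enumerate(wins), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    candidates = ["a long draft answer", "short", "medium reply"]
    for index, score in rank_round_robin(candidates):
        print(score, candidates[index])
```

A real system would replace the placeholder judge with something meaningful, such as test results or a model-based grader; the ranking loop itself would stay the same.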

Beyond agents, the UI is getting a substantial overhaul. Dictation support leans into the vibe coding philosophy. A new set of navigation tabs, Edits, Files, Plans, Search, and Web Page, transforms the interface into something resembling a browser-based IDE, with live code previews and codebase navigation. A Share button and a Comments feature round out the collaboration story. On the integrations side, a GitHub app connection is now visible in settings, though it remains nonfunctional.

According to Elon Musk, Grok 4.20 is expected to arrive next week. However, there seems to be no expectation that it will push state-of-the-art (SOTA) benchmark scores higher.

“Grok 4.20 is finally out next week. Will be a significant improvement over 4.1.” https://t.co/3z47LnEfuW pic.twitter.com/jtp1hYEWKT
— TestingCatalog News 🗞 (@testingcatalog) February 15, 2026

There’s also a hidden internal Grok page called “Vibe” serving as a model override tool for xAI staff. With Grok 4.20 training reportedly delayed to mid-February due to infrastructure issues, the timeline for these features remains uncertain, but the groundwork is clearly being laid.