Z.ai released GLM-5 on February 11, 2026, positioning it as a flagship open-weight model built for complex systems engineering and long-horizon agent work. The rollout targets developers and teams that need an LLM to plan, execute, and iterate across large codebases and multi-step tool workflows, not just generate snippets.
GLM-5 is framed as a shift from “vibe coding” toward agentic engineering, where the model is expected to handle end-to-end project construction, refactors, deep debugging, and longer task chains with tighter goal consistency. It also keeps a large context window for sustained work across many files, specs, and intermediate artifacts.
> GLM-5 from @Zai_org just climbed to #1 among open models in Text Arena!
>
> ▫️ #1 open model on par with claude-sonnet-4.5 & gpt-5.1-high
> ▫️ #11 overall; scoring 1452, +11 pts over GLM-4.7
>
> Test it out in the Code Arena and keep voting, we’ll see how GLM-5 performs for agentic coding… https://t.co/GEwxRiz2wq
>
> — Arena.ai (@arena), February 11, 2026
On the technical side, GLM-5 scales up from the prior generation with a mixture-of-experts design at roughly 744B parameters total and 40B active parameters, and it increases pre-training data from 23T to 28.5T tokens. It also integrates DeepSeek Sparse Attention to reduce serving cost while keeping long-context capacity, and it uses an asynchronous reinforcement-learning setup (slime) to raise post-training throughput for more frequent iteration.
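Those figures imply that only a small fraction of the weights fire on any given token, which is where the serving-cost advantage comes from. A back-of-the-envelope sketch using only the 744B/40B totals reported above (expert counts and routing details are not disclosed, so nothing beyond those two numbers is assumed):

```python
# Rough MoE arithmetic from the disclosed GLM-5 figures:
# ~744B total parameters, ~40B active per token.
TOTAL_PARAMS = 744e9
ACTIVE_PARAMS = 40e9

# Fraction of the model's weights engaged for each token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Per-token forward-pass compute scales roughly as 2 * params FLOPs;
# for an MoE, only the active parameters contribute.
dense_flops_per_token = 2 * TOTAL_PARAMS
moe_flops_per_token = 2 * ACTIVE_PARAMS
compute_saving = dense_flops_per_token / moe_flops_per_token

print(f"active fraction: {active_fraction:.1%}")    # ~5.4%
print(f"saving vs. dense 744B: {compute_saving:.1f}x")  # ~18.6x
```

The point of the sketch: per-token compute tracks the 40B active path, not the 744B total, which is why a model this large can still be served economically (before even counting the DeepSeek Sparse Attention savings on long contexts).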
Benchmark disclosures put GLM-5 in the top tier of open-weight models for reasoning, coding, and tool-based tasks, with results described as approaching Claude Opus 4.5 on software-engineering workloads. Reported scores include:
- 77.8 on SWE-bench Verified
- 56.2 on Terminal-Bench 2.0
These scores sit alongside strong results on web-retrieval and multi-tool planning benchmarks such as BrowseComp and MCP-Atlas.
The weights are published under a permissive license, and the model is also offered through Z.ai’s chat and API stack. Deployment guidance is oriented to production inference via common serving frameworks such as vLLM and SGLang, with support called out for domestically produced accelerators, including Huawei Ascend and other local silicon named in the company’s rollout messaging.
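For teams following that guidance, standing up an endpoint typically reduces to a single launch command. A hypothetical sketch with vLLM, noting that the Hugging Face repo ID and the sizing flags here are assumptions for illustration, not values confirmed in the release:

```shell
# Hypothetical repo ID and sizing flags; substitute the published weights
# and values that fit your hardware. vLLM serves an OpenAI-compatible
# API on port 8000 by default.
vllm serve zai-org/GLM-5 \
  --tensor-parallel-size 8 \
  --max-model-len 131072
```

Once running, any OpenAI-compatible client pointed at the local server (the `/v1/chat/completions` route) can drive the model, which is what makes drop-in use inside existing agent frameworks straightforward.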
Z.ai, the company behind the GLM family, has been iterating rapidly on coding-first and agent-first releases, with GLM-4.7 arriving in late 2025 and earlier GLM-4.5 and multimodal variants forming the base of its current platform lineup. GLM-5 is the clearest signal yet that the company wants its open-weight flagship to compete in real software delivery settings, where long context, tool calling, structured outputs, and sustained execution matter as much as raw benchmark performance.