OpenAI has rolled out GPT-5.3-Codex, positioning Codex as more than a code-writing agent and closer to a general computer-use agent for developers and other professionals. OpenAI says early versions helped debug its own training, manage deployment, and diagnose evaluations, effectively accelerating the model’s path to release. The company is highlighting benchmark gains across coding and agentic tasks, including new highs on SWE-Bench Pro (56.8%) and Terminal-Bench 2.0 (77.3%), plus stronger OSWorld-Verified results (64.7%), while using fewer tokens than prior models.
BREAKING 🚨: GPT-5.3-CODEX IS ROLLING OUT ON CODEX CLI AND DESKTOP APP!
— TestingCatalog News 🗞 (@testingcatalog) February 5, 2026
COMPETITION AT SCALE 🔥 pic.twitter.com/MKss047eBo
GPT-5.3-Codex is aimed at end-to-end software work, not just code generation: debugging, deploying, and monitoring, as well as writing PRDs, tests, and metrics, editing copy, and handling research tasks. OpenAI also claims it is more reliable on underspecified web requests, producing fuller default sites and UI elements without extra prompting. Codex is now described as more “interactive” inside the Codex app, with more frequent progress updates and optional in-progress steering via a follow-up setting.
BREAKING 🚨: GPT‑5.3‑CODEX WAS USED TO SUPPORT CREATING ITSELF, ACCORDING TO OPENAI'S BLOG!
— TestingCatalog News 🗞 (@testingcatalog) February 5, 2026
It achieves SOTA score of 57% at SWE Bench Pro and 76% on TerminalBench.
"With GPT‑5.3-Codex, Codex goes from an agent that can write and review code to an agent that can do nearly… https://t.co/BWqjYi6Y5t pic.twitter.com/Tlz14JmzQG
Availability is tied to paid ChatGPT plans wherever Codex runs, including the app, CLI, IDE extension, and web, with API access described as coming later. OpenAI says Codex runs 25% faster for users due to inference and infrastructure changes, and that GPT-5.3-Codex was trained and served on NVIDIA GB200 NVL72 systems. On security, OpenAI classifies it as “High capability” for cybersecurity tasks under its Preparedness Framework. The company is also launching a Trusted Access for Cyber pilot, expanding its Aardvark security agent beta, and committing $10M in API credits to support defensive work.