OpenAI unveils Agent Mode in ChatGPT for complex tasks

OpenAI unveiled ChatGPT Agent on July 17, 2025, rolling it out first to Pro, Plus, and Team users, with Enterprise and Education access promised in the coming weeks; availability in the EEA and Switzerland is still pending. A new “Agent Mode” switch inside ChatGPT lets users hand over end-to-end tasks, such as compiling a competitive-analysis slide deck or planning a dinner party, from a single prompt.

ChatGPT can now do work for you using its own computer.

Introducing ChatGPT agent—a unified agentic system combining Operator’s action-taking remote browser, deep research’s web synthesis, and ChatGPT’s conversational strengths. pic.twitter.com/7uN2Nc6nBQ
— OpenAI (@OpenAI) July 17, 2025

The agent fuses Operator’s web-action toolkit with Deep Research’s analytical depth. Running inside its own virtual computer, it can choose among a visual browser, text browser, terminal, and direct API connectors to Gmail, GitHub, calendars, and more. It clicks, scrolls, logs in under user supervision, runs code, manipulates files, and returns editable spreadsheets or slides while pausing for confirmations and allowing real-time interruption.

Benchmark results show why OpenAI calls this its most capable model yet: 41.6% pass@1 on Humanity’s Last Exam, 27.4% on FrontierMath, 68.9% on BrowseComp, and more than double the score of Copilot in Excel on SpreadsheetBench. Internal evaluations found its output equal to or better than that of human analysts on roughly half of complex finance and data-science tasks.

Risk controls match the broader reach. The agent seeks consent before costly actions, confines sensitive inputs to a secure takeover browser, and resists prompt-injection with live monitoring. A single settings toggle clears all browsing data; high-impact steps like sending email stay in “watch mode” for mandatory user review.
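
The consent-before-action pattern described above can be illustrated with a short sketch. This is not OpenAI's implementation: the `Action` type, the `HIGH_IMPACT` set, and the `run_with_consent` function are all hypothetical names chosen for illustration, and the example only models the idea that high-impact steps wait for explicit user approval while routine steps proceed.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# Hypothetical set of action kinds treated as high-impact.
# This is an assumption for the sketch, not OpenAI's actual list.
HIGH_IMPACT = {"send_email", "purchase"}


@dataclass
class Action:
    kind: str          # e.g. "browse", "send_email"
    description: str   # human-readable summary shown to the user


def run_with_consent(
    actions: Iterable[Action],
    confirm: Callable[[Action], bool],
) -> List[Tuple[str, str]]:
    """Process actions in order, asking `confirm` before high-impact ones.

    Returns (action kind, status) pairs, where status is "done" or
    "skipped". Actual execution is omitted; only the gating is modeled.
    """
    log = []
    for action in actions:
        if action.kind in HIGH_IMPACT and not confirm(action):
            # The user declined, so the step is not carried out.
            log.append((action.kind, "skipped"))
            continue
        log.append((action.kind, "done"))
    return log
```

In this sketch, passing a `confirm` callback that always returns `False` lets browsing steps through but skips any email or purchase step, which mirrors the mandatory-review behavior the article attributes to the agent.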

Pilot users in consulting and design report substantial time saved creating decks and summarising inboxes, though multi-step checkout flows can still stall, underscoring the value of the pause-and-resume controls. Competitors are chasing similar goals, yet none currently mix browsing, coding, and document editing inside one conversational workflow.

With this launch, the San Francisco-based lab advances its strategy of rapid, public product cycles rooted in system-card transparency and bug-bounty feedback, aiming to establish ChatGPT as a backbone for modern knowledge work while iterating toward broader geographic availability.
