OpenAI unveils Agent Mode in ChatGPT for complex task handling

Key points: Agent Mode will work as a mix between Deep Research and Operator, with access to connectors, a web browser, and a terminal. Users will be able to schedule Agent Mode tasks. Agent Mode won't be available in the EU initially.

OpenAI unveiled ChatGPT Agent on July 17, 2025, rolling it out first to Pro, Plus, and Team users, with Enterprise and Education access promised in the coming weeks; availability in the EEA and Switzerland is still pending. A new “Agent Mode” switch inside ChatGPT lets users hand over end-to-end tasks, such as compiling a competitive slide deck or booking a dinner party, from a single prompt.

The agent fuses Operator’s web-action toolkit with Deep Research’s analytical depth. Running inside its own virtual computer, it can choose among a visual browser, text browser, terminal, and direct API connectors to Gmail, GitHub, calendars, and more. It clicks, scrolls, logs in under user supervision, runs code, manipulates files, and returns editable spreadsheets or slides while pausing for confirmations and allowing real-time interruption.

Performance numbers show why OpenAI calls this its most capable model yet: 41.6% pass@1 on Humanity’s Last Exam, 27.4% on FrontierMath, 68.9% on BrowseComp, and more than double the score of Copilot in Excel on SpreadsheetBench. In internal evaluations, its output was rated equal to or better than that of human analysts on roughly half of complex finance and data-science tasks.

Risk controls match the broader reach. The agent seeks consent before costly actions, confines sensitive inputs to a secure takeover browser, and resists prompt-injection with live monitoring. A single settings toggle clears all browsing data; high-impact steps like sending email stay in “watch mode” for mandatory user review.

Pilot users in consulting and design report substantial time saved creating decks and summarising inboxes, though multi-step checkout flows can still stall, underscoring the value of the pause-and-resume controls. Competitors are chasing similar goals, yet none currently mix browsing, coding, and document editing inside one conversational workflow.

With this launch, the San Francisco-based lab advances its strategy of rapid, public product cycles rooted in system-card transparency and bug-bounty feedback, aiming to establish ChatGPT as a backbone for modern knowledge work while iterating toward broader geographic availability.
