With Google I/O about to kick off, fresh signals from inside the desktop Gemini build point to a sweeping upgrade for the recently launched Mac client, which has lagged behind the web version. The initial release was deliberately pared back, but the next wave appears ready to close that gap.

A Gemini Live mode is being prepared as a floating desktop overlay, allowing Gemini to observe what's happening on screen and respond in real time via a voice model. This positions Google directly against ChatGPT's macOS companion mode and Anthropic's screen-aware Claude experiments.

A second addition, internally framed as Stream to Cursor, appears to plug into the Magic Pointer concept previewed at The Android Show. Rather than waiting for a prompt, the cursor itself would read the context around whatever element it hovers over and surface relevant suggestions, blurring the line between pointing device and agent trigger.

Video generation is also being threaded into the desktop client through what is internally labeled "Veo4 Omni". The naming hints at a single omni-modal output system that rolls up under the broader Gemini Omni umbrella.

The most consequential thread is Gemini Spark on desktop. Users would be able to point Spark at local folders and let the agent edit, analyze, move, and rename files within them, with support for skills and connector access to Google Drive and the broader Google services layer. That would extend Spark from a proactive web assistant to a local file-system agent, territory currently being pursued by OpenAI's Codex desktop work and Anthropic's Claude Code.

Taken together, these signals suggest Google is preparing the desktop app to host its full agentic stack rather than serve as a thin wrapper around the chat window. With I/O opening tomorrow, much of this should surface on stage!
Credits: Anonymous Contributor