Google to unveil Imagen 4, Imagen 4 Ultra and Veo 3 models at I/O 2025

Google appears to be gearing up for a major reveal of its next-generation image and video generation models, with new variants of Veo and Imagen slated for release later this month, possibly timed for the annual I/O developer conference on May 20. Preview identifiers such as veo-3.0-generate-preview, imagen-4.0-generate-preview-05-20, and imagen-4.0-ultra-generate-exp-05-20 have surfaced, suggesting both a technical leap and a staged rollout strategy that mirrors past Google Labs test launches.

These versions signal a continuation of Google’s dual-track approach: Veo focuses on video generation and Imagen on photorealistic and stylized image synthesis. The “preview,” “exp,” and “ultra” tags in the identifiers imply multiple tiers or variants in capability, potentially aligned with user needs across creative, commercial, and research contexts. The naming also suggests that Imagen 4 and Veo 3 are nearing release, with Imagen 3.5 also expected to land in Google Labs alongside Veo 3 for early public testing.

While details on the new models' capabilities remain limited, the transition from earlier models like Imagen 3 and Veo 2 suggests a continued emphasis on higher fidelity, coherence sustained over longer generated sequences, and perhaps deeper multimodal integration. Google's recent efforts to integrate AI-generated media into products like NotebookLM and Gemini hint at a broader vision, one where users can fluidly move from text to visuals to video using the same model backbone.

With the I/O event often serving as a launchpad for high-profile AI releases, all signs point to these model updates playing a central role in Google’s 2025 generative media roadmap. Whether these tools will be available to the public at launch or gated behind Labs invites or enterprise tiers is still unclear. But given the pace of competitive developments from OpenAI, Anthropic, and xAI, Google is clearly preparing to reassert its position at the forefront of generative video and image synthesis.