Google rolls out Gemini Omni AI for video generation and editing

What's new? Gemini Omni Flash is a multimodal AI model that takes text, image, video, and audio inputs. It supports video editing and embeds a SynthID watermark.

Google has officially introduced Gemini Omni, a multimodal AI model that combines reasoning with creative generation and accepts video, image, audio, and text inputs. The launch begins with Gemini Omni Flash, which is available immediately to all Google AI Plus, Pro, and Ultra subscribers globally through the Gemini app and Google Flow. Users of YouTube Shorts and the YouTube Create app can also access it at no cost, and developer and enterprise access via API is expected within weeks.

Gemini Omni generates high-quality, context-aware videos from natural language instructions or reference media. Users can edit videos conversationally, and the model maintains scene continuity and character consistency across multiple editing steps. Its improved grasp of physics allows for more realistic visual effects and scene changes, supporting both creative storytelling and technical explainers. Gemini Omni also embeds a SynthID digital watermark in all outputs for verification and transparency.

The launch expands Google's multimodal AI services to a broad audience, including content creators, educators, and enterprise users seeking advanced video and media generation tools. By offering Gemini Omni Flash across its AI subscription tiers and popular creator platforms, Google aims to compete directly with other AI generation tools, drawing on its existing user base and its work on responsible AI development.
