Gemma 4, Google's latest open model family, has just been released with a focus on on-device performance, targeting researchers, developers, and organizations that deploy AI across diverse hardware environments. The models are available under the Apache 2.0 license, giving anyone open access and the commercial freedom to deploy or fine-tune them.
Build autonomous agents that plan, navigate apps, and execute multi-step tasks – like searching databases or triggering APIs – with native tool use.

With up to 256K context, it can analyze full codebases and retain complex action histories without losing focus. pic.twitter.com/mYhqC8peVF

— Google DeepMind (@GoogleDeepMind) April 2, 2026
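The tool-use loop behind that kind of agent can be approximated without any vendor SDK. The sketch below is a minimal, hypothetical harness: the model is prompted to answer either with a JSON tool call or with plain text, the host executes the named tool, and the result is fed back for the next turn. The `generate_reply` callable, the `search_database` tool, and the JSON schema are illustrative assumptions, not part of the Gemma 4 release.

```python
import json

def search_database(query: str) -> str:
    """Illustrative stand-in for a real lookup (SQL, vector search, an internal API)."""
    return f"3 records matched '{query}'"

TOOLS = {"search_database": search_database}

SYSTEM_PROMPT = (
    "You may call a tool by replying with JSON of the form "
    '{"tool": "<name>", "arguments": {...}}. '
    "Reply in plain text once you have the final answer."
)

def run_agent(user_goal: str, generate_reply, max_steps: int = 5) -> str:
    """Drive a simple plan/act loop; generate_reply(messages) wraps whatever
    inference call you use for the model, local or hosted."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_goal},
    ]
    for _ in range(max_steps):
        reply = generate_reply(messages)                   # model turn
        messages.append({"role": "assistant", "content": reply})
        try:
            call = json.loads(reply)                       # did it request a tool?
        except json.JSONDecodeError:
            return reply                                   # plain text: final answer
        result = TOOLS[call["tool"]](**call["arguments"])  # execute the tool
        messages.append({"role": "user", "content": f"Tool result: {result}"})
    return "Stopped after max_steps without a final answer."
```

The loop is deliberately model-agnostic; whether Gemma 4 ships a dedicated tool-calling format or relies on prompted JSON like this is not specified in the announcement.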
The release introduces several model sizes, each optimized for a specific use case and hardware tier:
- The E2B and E4B models are designed for edge and mobile devices, running multimodal tasks such as speech recognition, vision, and code generation directly on phones, Raspberry Pi boards, and other low-power hardware, with offline processing and low latency (a minimal loading sketch follows this list).
- Larger 26B and 31B models are tailored for high-performance tasks on personal computers and workstations, with options for both speed-focused and quality-focused deployments.
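As a rough illustration of what on-device use looks like, the following sketch loads one of the smaller checkpoints with Hugging Face transformers and runs a single chat turn. The model id `google/gemma-4-e2b-it` is an assumed name for the E2B instruction-tuned variant; substitute the identifier published with the release.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "google/gemma-4-e2b-it"  # assumed name, not confirmed by the release notes

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the footprint small on edge hardware
    device_map="auto",           # falls back to CPU on devices without a GPU
)

messages = [{"role": "user", "content": "Summarize this sensor log in one sentence: ..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```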
The models support multi-step reasoning, agentic workflows, more than 140 languages, and context windows of up to 256K tokens, surpassing many previous open models in versatility and multilingual reach.
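To give a sense of what a 256K-token window allows, the sketch below packs a repository's source files into a single prompt while staying under that budget. Only the context length comes from the announcement; the model id, the `.py`-only file selection, and the reserved headroom are assumptions for illustration.

```python
from pathlib import Path
from transformers import AutoTokenizer

MODEL_ID = "google/gemma-4-e2b-it"  # assumed identifier, not confirmed by the release
CONTEXT_LIMIT = 256_000             # advertised context window, in tokens

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

def pack_repo(root: str) -> str:
    """Concatenate .py files under `root` until the token budget is nearly spent,
    leaving headroom for the question and the model's answer."""
    budget = CONTEXT_LIMIT - 4_096  # reserve space for the query and generation
    parts, used = [], 0
    for path in sorted(Path(root).rglob("*.py")):
        chunk = f"\n# File: {path}\n{path.read_text(errors='ignore')}"
        n_tokens = len(tokenizer.encode(chunk))
        if used + n_tokens > budget:
            break
        parts.append(chunk)
        used += n_tokens
    return "".join(parts)

codebase = pack_repo("my_project/")
question = "Which modules handle authentication, and how do they interact?"
prompt = f"{codebase}\n\n{question}"
```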
Gemma 4 is the result of close collaborations with mobile hardware leaders and research institutions, aiming to provide a foundation for both commercial and exploratory projects. By releasing these models openly and with broad hardware compatibility, Google is positioning itself as a key player in the democratization of advanced AI capabilities, allowing developers worldwide to integrate state-of-the-art reasoning and multimodal understanding into their applications without restrictive barriers.