AI Edge Gallery is an experimental Android application that allows users to run advanced generative AI models directly on their devices, with an iOS version expected soon. The app became publicly available on May 20, 2025, via the Google AI Edge GitHub repository. It is designed for developers, researchers, and tech enthusiasts who want to experiment with AI models locally, without relying on cloud servers or an internet connection after the initial model setup.
The AI Edge Gallery supports a range of open-source models, including Gemma 3, Gemma 3n, and Qwen 2.5, and enables users to chat, process images, and explore single-prompt responses in the Prompt Lab. Models vary in size from lightweight (~500 MB) to more comprehensive (~4 GB), and users can import their own LiteRT-format models for testing. The app is optimized for devices running Android 10 or later with at least 6 GB of RAM and a modern chipset, and it uses the LiteRT runtime for efficient on-device inference.
In a post dated May 27, 2025, Paul Couvert (@itsPaulAi) described the first step, downloading Google AI Edge Gallery:
- Access the official Google AI Edge Gallery GitHub repo.
- Go to the "Releases" section, then download and install the .apk file (Android).
- The iOS version is coming soon.
Link: https://t.co/KKUgxtyyuV
Key features include:
- Image-based Q&A
- Text summarization
- Code generation
- Multi-turn conversations
- Real-time performance metrics such as time to first token (TTFT) and decoding speed
All data processing occurs locally, ensuring privacy and security.
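To make the performance metrics in the feature list concrete, here is a minimal Python sketch of how TTFT and decoding speed could be derived from token timestamps. It is an illustration only, not the app's actual code: the names `GenerationMetrics` and `compute_metrics` are hypothetical, and the definition of decoding speed used here (tokens produced after the first one, divided by the time elapsed since the first token) is one common convention that the app may or may not follow.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class GenerationMetrics:
    """Hypothetical container for the two metrics discussed above."""
    ttft_s: float               # time to first token, in seconds
    decode_tokens_per_s: float  # tokens per second after the first token


def compute_metrics(request_time_s: float,
                    token_times_s: Sequence[float]) -> GenerationMetrics:
    """Derive TTFT and decoding speed from timestamps (all in seconds).

    request_time_s: when the prompt was submitted.
    token_times_s:  when each output token arrived, in order.
    """
    if not token_times_s:
        raise ValueError("at least one token timestamp is required")

    # TTFT: delay between submitting the prompt and seeing the first token.
    ttft = token_times_s[0] - request_time_s

    # Decoding speed (assumed definition): tokens emitted after the first
    # one, divided by the time between the first and the last token.
    decode_window = token_times_s[-1] - token_times_s[0]
    decoded_tokens = len(token_times_s) - 1
    speed = decoded_tokens / decode_window if decode_window > 0 else 0.0

    return GenerationMetrics(ttft_s=ttft, decode_tokens_per_s=speed)


# Example: prompt sent at t=0.0 s, five tokens arriving every 0.25 s
# starting at t=0.5 s.
metrics = compute_metrics(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
print(metrics)  # GenerationMetrics(ttft_s=0.5, decode_tokens_per_s=4.0)
```

Separating the two numbers is useful because they reflect different phases of inference: TTFT is dominated by processing the prompt, while decoding speed reflects how quickly the model produces each subsequent token.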
AI Edge Gallery is positioned as a practical demonstration of on-device generative AI and the LLM Inference API. The project is open source under the Apache 2.0 license, and Google encourages community feedback and contributions. Early reactions from the developer and AI communities highlight the app’s potential for privacy-focused, offline AI experimentation, though some note current limitations such as model size constraints and the absence of voice interaction. Future updates are expected to bring iOS support, real-time voice features, and enhanced hardware acceleration.