New Gemini 1.5 models now available with 64% lower token prices and faster speeds

Google has announced the release of two updated production-ready Gemini models, Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, alongside an improved experimental model, Gemini-1.5-Flash-8B-Exp-0924. These models are designed for general performance across a wide range of text, code, and multimodal tasks, such as synthesizing information from large PDFs, answering questions about extensive code repositories, and creating content from hour-long videos.

Key improvements in these models include:

  1. A 7% increase in MMLU-Pro benchmark performance
  2. A 20% improvement in MATH and HiddenMath benchmarks
  3. Better performance in vision and code use cases (roughly 2-7% improvement)

The models also offer more concise responses, with default output lengths 5-20% shorter than previous models, making them easier to use and more cost-efficient.

Additionally, Google is reducing pricing for Gemini 1.5 Pro:

  1. A 64% price reduction on input tokens
  2. A 52% price reduction on output tokens
  3. A 64% price reduction on incremental cached tokens for prompts less than 128K tokens

These changes take effect October 1st, 2024, and apply to prompts under 128K tokens. The paid-tier rate limits for 1.5 Flash and 1.5 Pro are also being increased to 2,000 RPM and 1,000 RPM, respectively.
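As a rough illustration of how the reductions above translate into per-request cost, here is a short sketch. The old per-million-token prices used below are assumptions for illustration, not official figures:

```python
# Illustrative only: the old prices below are assumed, not Google's official rates.
OLD_INPUT_PRICE = 3.50    # $ per 1M input tokens (assumed)
OLD_OUTPUT_PRICE = 10.50  # $ per 1M output tokens (assumed)

INPUT_CUT = 0.64   # 64% reduction on input tokens
OUTPUT_CUT = 0.52  # 52% reduction on output tokens

def new_price(old: float, reduction: float) -> float:
    """Apply a fractional price reduction to a per-million-token price."""
    return old * (1 - reduction)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the reduced prices."""
    return (input_tokens / 1e6) * new_price(OLD_INPUT_PRICE, INPUT_CUT) \
         + (output_tokens / 1e6) * new_price(OLD_OUTPUT_PRICE, OUTPUT_CUT)
```

Note that because input and output tokens are discounted at different rates, the effective saving for any given workload depends on its input/output ratio.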

The models are available for free via Google AI Studio and the Gemini API; for larger organizations and Google Cloud customers, they are also available on Vertex AI. Developers can use them to build a variety of applications, leveraging Gemini 1.5 Pro's 2-million-token context window and multimodal capabilities.
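A minimal sketch of calling one of the updated models through the Gemini API's REST `generateContent` endpoint, using only the Python standard library. The prompt text and the `GEMINI_API_KEY` environment variable are assumptions for this example:

```python
import json
import os
import urllib.request

# The updated production model announced above.
MODEL = "gemini-1.5-pro-002"
URL = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent"

def build_request(prompt: str) -> bytes:
    """Build the JSON body for a single-turn generateContent call."""
    return json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode()

def generate(prompt: str) -> str:
    """Call the Gemini API and return the first candidate's text.

    Assumes an AI Studio API key in the GEMINI_API_KEY env var.
    """
    req = urllib.request.Request(
        f"{URL}?key={os.environ['GEMINI_API_KEY']}",
        data=build_request(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["candidates"][0]["content"]["parts"][0]["text"]

if __name__ == "__main__":
    print(generate("In one paragraph, what does a 2-million-token context window enable?"))
```

Google's official SDKs (such as `google-generativeai` for Python) wrap this same endpoint with a higher-level interface.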

Source: "Updated production-ready Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more" (Google Developers Blog). The increased accessibility and cost-efficiency enable developers to build a wider range of applications using Google’s advanced AI technology.

Google emphasizes its focus on building safe and reliable models, citing improvements in the models' ability to follow user instructions while balancing safety. The company will continue to offer a suite of safety filters that developers can apply to its models, letting them choose the configuration best suited to their use case.
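These safety filters are configured per request. A minimal sketch of attaching one such setting to a Gemini API `generateContent` request body; the category and threshold chosen here are illustrative, not recommendations:

```python
import json

def with_safety(prompt: str) -> bytes:
    """Build a generateContent body that relaxes one safety filter category.

    Field names follow the Gemini REST API; the specific category and
    threshold below are illustrative choices only.
    """
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        "safetySettings": [
            {
                "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
                "threshold": "BLOCK_ONLY_HIGH",
            }
        ],
    }
    return json.dumps(body).encode()
```

Categories not listed in `safetySettings` keep their default thresholds, so developers can tune only the filters relevant to their use case.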