https://developers.googleblog.com/en/updated-production-ready-gemini-models-reduced-15-pro-pricing-increased-rate-limits-and-more/

Updated production-ready Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more

SEP 24, 2024

Logan Kilpatrick, Senior Product Manager, Gemini API and Google AI Studio
Shrestha Basu Mallick, Group Product Manager, Gemini API

Today, we're releasing two updated
production-ready Gemini models, Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, along with:

* >50% reduced price on 1.5 Pro (both input and output, for prompts <128K tokens)
* 2x higher rate limits on 1.5 Flash and ~3x higher on 1.5 Pro
* 2x faster output and 3x lower latency
* Updated default filter settings

These new models build on our latest experimental model releases and include meaningful improvements to the Gemini 1.5 models released at Google I/O in May. Developers can access our latest models for free via Google AI Studio and the Gemini API. For larger organizations and Google Cloud customers, the models are also available on Vertex AI.

Improved overall quality, with larger gains in math, long context, and vision

The Gemini 1.5 series models are designed for general performance across a wide range of text, code, and multimodal tasks. For example, Gemini models can be used to synthesize information from 1,000-page PDFs, answer questions about repos containing more than 10,000 lines of code, take in hour-long videos and create useful content from them, and more.

With the latest updates, 1.5 Pro and Flash are now better, faster, and more cost-efficient to build with in production. We see a ~7% increase in MMLU-Pro, a more challenging version of the popular MMLU benchmark. On the MATH and HiddenMath (an internal holdout set of competition math problems) benchmarks, both models have made a considerable ~20% improvement. For vision and code use cases, both models also perform better (by ~2-7%) across evals measuring visual understanding and Python code generation.

[Image: A table showcasing benchmark data, demonstrating improved performance for the latest Gemini models, Gemini 1.5 Pro and Gemini 1.5 Flash. The table highlights advancements in various capabilities including reasoning, code, and math.]

We also improved the overall helpfulness of model responses, while continuing to uphold our content safety policies and standards.
This means fewer refusals (less "punting") and more helpful responses across many topics.

In response to developer feedback, both models now have a more concise style, which is intended to make them easier to use and to reduce costs. For use cases like summarization, question answering, and extraction, the default output length of the updated models is ~5-20% shorter than that of previous models. For chat-based products where users might prefer longer responses by default, you can read our prompting strategies guide to learn more about how to make the models more verbose and conversational.

For more details on migrating to the latest versions of Gemini 1.5 Pro and 1.5 Flash, check out the Gemini API models page.

Gemini 1.5 Pro

We continue to be blown away by the creative and useful applications of Gemini 1.5 Pro's 2-million-token context window and multimodal capabilities. From video understanding to processing 1,000-page PDFs, there are so many new use cases still to be built. Today we are announcing a 64% price reduction on input tokens, a 52% price reduction on output tokens, and a 64% price reduction on incremental cached tokens for our strongest 1.5 series model, Gemini 1.5 Pro, effective October 1st, 2024, on prompts shorter than 128K tokens. Coupled with context caching, this continues to drive down the cost of building with Gemini.

[Image: A pricing table for the Gemini 1.5 Flash model, outlining the cost per one million tokens for input and output.]

Increased rate limits

To make it even easier for developers to build with Gemini, we are increasing the paid tier rate limits to 2,000 requests per minute (RPM) for 1.5 Flash and 1,000 RPM for 1.5 Pro, up from 1,000 and 360, respectively. In the coming weeks, we expect to continue to increase the Gemini API rate limits so developers can build more with Gemini.
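To illustrate how a client might stay under the paid tier limits quoted above, here is a minimal sketch of a client-side sliding-window throttle. It is not part of the Gemini API or any Google SDK: the class name `SlidingWindowLimiter`, the `PAID_TIER_RPM` table, and its keys are hypothetical names chosen for illustration, and the server remains the authority on what is actually allowed.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side throttle: at most `max_requests` per `window_s` seconds."""

    def __init__(self, max_requests, window_s=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window_s = window_s
        self._clock = clock    # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._sent = deque()   # timestamps of requests still inside the window

    def acquire(self):
        """Block until one more request fits in the window, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._sent and now - self._sent[0] >= self.window_s:
                self._sent.popleft()
            if len(self._sent) < self.max_requests:
                self._sent.append(now)
                return
            # Wait until the oldest recorded request leaves the window.
            self._sleep(self.window_s - (now - self._sent[0]))


# Paid tier limits quoted in the post, in requests per minute.
# The dictionary keys are illustrative labels, not required identifiers.
PAID_TIER_RPM = {"gemini-1.5-flash-002": 2000, "gemini-1.5-pro-002": 1000}

if __name__ == "__main__":
    limiter = SlidingWindowLimiter(PAID_TIER_RPM["gemini-1.5-pro-002"])
    limiter.acquire()  # call once before each API request
    print("request slot acquired")
```

Calling `acquire()` before each request keeps a single process under the configured limit. Multiple processes sharing one API key would need a shared counter, and a retry path for rate-limit errors returned by the server is still advisable.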
2x faster output and 3x lower latency

Along with core improvements to our latest models, over the last few weeks we have driven down the latency of 1.5 Flash and significantly increased its output tokens per second, enabling new use cases with our most powerful models.

[Image: Side-by-side graphs charting the latency of Google's Gemini model over time, showing improvements.]

Updated filter settings

Since the first launch of Gemini in December of 2023, building a safe and reliable model has been a key focus. With the latest versions of Gemini (-002 models), we've made improvements to the model's ability to follow user instructions while balancing safety. We will continue to offer a suite of safety filters that developers may apply to Google's models. For the models released today, the filters will not be applied by default so that developers can determine the configuration best suited for their use case.

Gemini 1.5 Flash-8B Experimental updates

We are releasing a further improved version of the Gemini 1.5 model we announced in August, called "Gemini-1.5-Flash-8B-Exp-0924". This improved version includes significant performance increases across both text and multimodal use cases. It is available now via Google AI Studio and the Gemini API.

The overwhelmingly positive feedback developers have shared about 1.5 Flash-8B has been incredible to see, and we will continue to shape our experimental to production release pipeline based on developer feedback.

We're excited about these updates and can't wait to see what you'll build with the new Gemini models! And for Gemini Advanced users, you will soon be able to access a chat optimized version of Gemini 1.5 Pro-002.
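Because the post says safety filters are not applied by default for the -002 models, developers who want filtering need to request it explicitly. The sketch below only builds a request body and sends nothing. The helper name `build_request` is hypothetical, and the `safetySettings` field with the category and threshold strings shown reflects our understanding of the Gemini API REST format; check the current API reference before relying on it.

```python
import json

# Harm categories as we understand the Gemini API to name them (an assumption
# to verify against the API reference; the list may change over time).
HARM_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]


def build_request(prompt, threshold="BLOCK_MEDIUM_AND_ABOVE"):
    """Hypothetical helper: build a request body that opts in to safety filters.

    One threshold is applied to every category for simplicity; a real
    application would likely choose a threshold per category.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "safetySettings": [
            {"category": category, "threshold": threshold}
            for category in HARM_CATEGORIES
        ],
    }


if __name__ == "__main__":
    # Print the JSON body that would be posted to the model endpoint.
    print(json.dumps(build_request("Summarize this report."), indent=2))
```

Keeping the settings in one helper makes the filter configuration an explicit, reviewable choice rather than something inherited from a model default.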