https://ai.google.dev/gemini-api/docs/caching
Google AI for Developers
Context caching guide

Last updated 2024-05-13 UTC. Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License.

The Gemini API context caching feature is designed to reduce the cost of requests that contain repeat content with high input token counts.

When to use context caching

Context caching is particularly well suited to scenarios where a substantial initial context is referenced repeatedly by shorter requests. Consider using context caching for use cases such as:

* Chatbots with extensive system instructions
* Repetitive analysis of lengthy video files
* Recurring queries against large document sets
* Frequent code repository analysis or bug fixing

Cost-efficiency through caching

Context caching is a paid feature designed to reduce overall operational costs. Billing is based on the following factors:

1. Cache token count: The number of input tokens cached, billed at a reduced rate when included in subsequent prompts.
2. Storage duration: The amount of time cached tokens are stored, billed hourly.
3. Other factors: Other charges apply, such as for non-cached input tokens and output tokens.

For the most up-to-date pricing details, refer to the Gemini API pricing page.

Get started with context caching soon

We'll be launching context caching soon, along with technical documentation and SDK support.
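To make the three billing factors listed under "Cost-efficiency through caching" concrete, here is a minimal sketch of a cost comparison with and without a cache. Everything in it is an assumption made for illustration: the `Rates` fields, the `estimate_cost` helper, the per-million-token pricing unit, and above all the numeric rates, which are invented and are not Gemini API prices. It also ignores any charge for creating the cache and treats storage time as a plain number of hours, since the guide does not specify either detail.

```python
from dataclasses import dataclass

M = 1_000_000  # the hypothetical rates below are quoted per one million tokens


@dataclass(frozen=True)
class Rates:
    """Hypothetical USD prices; NOT real Gemini API pricing."""
    input_per_m: float         # non-cached input tokens
    cached_input_per_m: float  # cached tokens included in a prompt (reduced rate)
    output_per_m: float        # output tokens
    storage_per_m_hour: float  # cached tokens kept in storage, per hour


def estimate_cost(rates: Rates, *, context_tokens: int, requests: int,
                  prompt_tokens: int, output_tokens: int,
                  storage_hours: float, use_cache: bool) -> float:
    """Estimate total cost for `requests` calls that share one large context."""
    # Factor 3: non-cached input and output tokens are charged either way.
    prompt_cost = requests * prompt_tokens / M * rates.input_per_m
    output_cost = requests * output_tokens / M * rates.output_per_m

    if not use_cache:
        # Without a cache, the shared context is resent at the full input rate.
        context_cost = requests * context_tokens / M * rates.input_per_m
        return context_cost + prompt_cost + output_cost

    # Factor 1: cached tokens are billed at the reduced rate on each request.
    cached_cost = requests * context_tokens / M * rates.cached_input_per_m
    # Factor 2: storage is billed for the time the cached tokens are kept.
    storage_cost = context_tokens / M * storage_hours * rates.storage_per_m_hour
    return cached_cost + storage_cost + prompt_cost + output_cost


if __name__ == "__main__":
    # Invented rates, chosen only to keep the arithmetic easy to follow.
    example = Rates(input_per_m=4.0, cached_input_per_m=1.0,
                    output_per_m=8.0, storage_per_m_hour=2.0)
    workload = dict(context_tokens=1_000_000, requests=10,
                    prompt_tokens=1_000, output_tokens=500, storage_hours=2)
    print("without cache:", estimate_cost(example, use_cache=False, **workload))
    print("with cache:   ", estimate_cost(example, use_cache=True, **workload))
```

Under these made-up rates the cache pays off because the shared context is large and reused across many requests. With few requests or a long storage period, the storage charge can outweigh the savings, so substitute the real figures from the Gemini API pricing page before drawing conclusions.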