https://ai.google.dev/gemini-api/docs/caching

Google AI for Developers


Context caching guide


The Gemini API context caching feature is designed to reduce the cost
of requests that repeatedly send the same content with high input
token counts.

When to use context caching

Context caching is particularly well suited to scenarios where a
substantial initial context is referenced repeatedly by shorter
requests. Consider using context caching for use cases such as:

  * Chatbots with extensive system instructions
  * Repetitive analysis of lengthy video files
  * Recurring queries against large document sets
  * Frequent code repository analysis or bug fixing

Cost-efficiency through caching

Context caching is a paid feature designed to reduce overall
operational costs. Billing is based on the following factors:

 1. Cache token count: The number of input tokens cached. Cached
    tokens are billed at a reduced rate when they are included in
    subsequent prompts.
 2. Storage duration: The amount of time cached tokens are stored,
    billed hourly.
 3. Other factors: Other charges apply, such as for non-cached input
    tokens and output tokens.

For the most up-to-date pricing details, refer to the Gemini API
pricing page.
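
The billing factors above can be sketched as a rough cost comparison.
This is a minimal illustration only: the Rates class, both functions,
and every price in it are hypothetical placeholders, not actual Gemini
API pricing or SDK calls. It models only the three factors listed and
assumes no other charges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical USD prices per million tokens (placeholders only)."""
    input_rate: float    # non-cached input tokens
    cached_rate: float   # cached input tokens included in a prompt
    output_rate: float   # output tokens
    storage_rate: float  # cached tokens stored, per hour


def cost_with_cache(rates, cached_tokens, prompt_tokens, output_tokens,
                    requests, hours_stored):
    """Estimate cost when the shared context is served from the cache."""
    per_request = (cached_tokens * rates.cached_rate
                   + prompt_tokens * rates.input_rate
                   + output_tokens * rates.output_rate)
    storage = cached_tokens * rates.storage_rate * hours_stored
    return (requests * per_request + storage) / 1_000_000


def cost_without_cache(rates, cached_tokens, prompt_tokens, output_tokens,
                       requests):
    """Estimate cost when the shared context is resent with every request."""
    per_request = ((cached_tokens + prompt_tokens) * rates.input_rate
                   + output_tokens * rates.output_rate)
    return requests * per_request / 1_000_000


if __name__ == "__main__":
    # Made-up rates and workload, chosen only to show the calculation.
    rates = Rates(input_rate=1.0, cached_rate=0.25,
                  output_rate=2.0, storage_rate=1.0)
    cached = cost_with_cache(rates, 100_000, 200, 300,
                             requests=50, hours_stored=2)
    uncached = cost_without_cache(rates, 100_000, 200, 300, requests=50)
    print(f"with cache: ${cached:.2f}  without cache: ${uncached:.2f}")
```

Under these assumptions, caching pays off when the per-request savings
on the shared context outweigh the hourly storage charge, which is why
the feature suits large contexts that are queried often.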

Get started with context caching soon

We'll be launching context caching soon, along with technical
documentation and SDK support.


Except as otherwise noted, the content of this page is licensed under
the Creative Commons Attribution 4.0 License, and code samples are
licensed under the Apache 2.0 License. For details, see the Google
Developers Site Policies. Java is a registered trademark of Oracle
and/or its affiliates.

Last updated 2024-05-13 UTC.
