https://ai.google.dev/gemma/docs/paligemma

PaliGemma

PaliGemma is a lightweight open vision-language model (VLM) inspired by PaLI-3, and based on open components like the SigLIP vision model and the Gemma language model.
PaliGemma takes both images and text as inputs and can answer questions about images with detail and context. This lets it analyze images more deeply and support tasks such as captioning images and short videos, detecting objects, and reading text embedded within images.

There are two sets of PaliGemma models, a general purpose set and a research-oriented set:

* PaliGemma - General purpose pretrained models that can be fine-tuned on a variety of tasks.
* PaliGemma-FT - Research-oriented models that are fine-tuned on specific research datasets.

Important: Most PaliGemma models require tuning in order to produce useful results, except for the paligemma-3b-mix variant. Make sure you fine-tune these models and test their output before deploying them to end users.

Key benefits include:

* Multimodal comprehension - Simultaneously understands both images and text.
* Versatile base model - Can be fine-tuned on a wide range of vision-language tasks.
* Off-the-shelf exploration - Comes with a checkpoint fine-tuned on a mixture of tasks for immediate research use.

Learn more

* View the model card - PaliGemma's model card contains detailed information about the model, its implementation, evaluation, usage and limitations, and more.
* View on Kaggle - View more code, Colab notebooks, information, and discussions about PaliGemma on Kaggle.
* Run in Colab - Run a working example for fine-tuning PaliGemma with JAX in Colab.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Last updated 2024-05-14 UTC.
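To make the object detection use case mentioned on this page more concrete, here is a minimal sketch of how a detection-style text response might be turned into pixel boxes. This page does not describe the output format, so everything about it here is an assumption to verify against the model card: that each detection is written as four `<locNNNN>` tokens in the order y_min, x_min, y_max, x_max, that the values are binned to 1024 steps, and that detections are separated by `;`. The helper name `parse_detections` and the sample string are hypothetical and are not real model output.

```python
import re

# ASSUMPTION (not stated on this page): each detection is four <locNNNN>
# tokens in the order y_min, x_min, y_max, x_max, followed by a label,
# and multiple detections are separated by ";".
_DETECTION = re.compile(
    r"<loc(\d{4})><loc(\d{4})><loc(\d{4})><loc(\d{4})>\s*([^;<]+)"
)

LOC_BINS = 1024  # ASSUMPTION: coordinates are normalized into 1024 bins


def parse_detections(text, image_width, image_height):
    """Convert detection-style text (assumed format) into pixel-space boxes."""
    boxes = []
    for y0, x0, y1, x1, label in _DETECTION.findall(text):
        boxes.append({
            "label": label.strip(),
            # Scale the normalized bin index back to pixel coordinates.
            "x_min": int(x0) / LOC_BINS * image_width,
            "y_min": int(y0) / LOC_BINS * image_height,
            "x_max": int(x1) / LOC_BINS * image_width,
            "y_max": int(y1) / LOC_BINS * image_height,
        })
    return boxes


# Hypothetical response text, written by hand for illustration only.
sample = (
    "<loc0256><loc0128><loc0768><loc0512> cat ; "
    "<loc0000><loc0512><loc1023><loc1023> dog"
)
boxes = parse_detections(sample, image_width=1024, image_height=512)
for box in boxes:
    print(box)
```

Keeping the parsing separate from model inference means the same helper can be exercised on saved responses while evaluating a fine-tuned checkpoint, which fits the advice above to test the output before deploying to end users.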