https://blog.google/technology/google-deepmind/gemini-computer-use-model/

Introducing the Gemini 2.5 Computer Use model

Oct 07, 2025


Available in preview via the API, our Computer Use model is a
specialized model built on Gemini 2.5 Pro's capabilities to power
agents that can interact with user interfaces.

Google DeepMind

Earlier this year, we mentioned that we're bringing computer use
capabilities to developers via the Gemini API. Today, we are
releasing the Gemini 2.5 Computer Use model, our new specialized
model built on Gemini 2.5 Pro's visual understanding and reasoning
capabilities that powers agents capable of interacting with user
interfaces (UIs). It outperforms leading alternatives on multiple web
and mobile control benchmarks, all with lower latency. Developers can
access these capabilities via the Gemini API in Google AI Studio and
Vertex AI.

While AI models can interface with software through structured APIs,
many digital tasks still require direct interaction with graphical
user interfaces, for example, filling and submitting forms. To
complete these tasks, agents must navigate web pages and applications
just as humans do: by clicking, typing and scrolling. The ability to
natively fill out forms, manipulate interactive elements like
dropdowns and filters, and operate behind logins is a crucial next
step in building powerful, general-purpose agents.

How it works

The model's core capabilities are exposed through the new
`computer_use` tool in the Gemini API and should be operated within a
loop. Inputs to the tool are the user request, a screenshot of the
environment, and a history of recent actions. The input can also
exclude functions from the full list of supported UI actions or
specify additional custom functions to include.
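
As a rough illustration of excluding supported actions and adding
custom ones, the sketch below computes the set of actions an agent
would expose. It is plain Python, not the Gemini API: the action names
(`click`, `type_text`, `scroll`, `drag_and_drop`, `open_app`) and the
helper `build_action_list` are hypothetical and chosen only for the
example.

```python
# Hypothetical action names; the real list of supported UI actions is
# defined by the Gemini API documentation, not by this sketch.
SUPPORTED_ACTIONS = ["click", "type_text", "scroll", "drag_and_drop"]


def build_action_list(supported, excluded=(), custom=()):
    """Return the actions to expose: supported minus excluded, plus custom."""
    excluded_set = set(excluded)
    unknown = excluded_set - set(supported)
    if unknown:
        # Excluding an action that is not supported is likely a typo.
        raise ValueError(f"cannot exclude unsupported actions: {sorted(unknown)}")
    actions = [name for name in supported if name not in excluded_set]
    for name in custom:
        if name not in actions:
            actions.append(name)
    return actions


actions = build_action_list(
    SUPPORTED_ACTIONS,
    excluded=["drag_and_drop"],  # e.g. a UI where dragging is never needed
    custom=["open_app"],         # e.g. a developer-defined mobile action
)
print(actions)
```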

Figure: Gemini 2.5 Computer Use model flow. Diagram of the AI agent
loop: the initial task leads to a screenshot and context, which are
sent to the model, which returns a response to the computer
environment to execute an action.

The model then analyzes these inputs and generates a response,
typically a function call representing one of the UI actions such as
clicking or typing. This response may also contain a request for an
end user confirmation, which is required for certain actions such as
making a purchase. The client-side code then executes the received
action.

After the action is executed, a new screenshot of the GUI and the
current URL are sent back to the Computer Use model as a function
response restarting the loop. This iterative process continues until
the task is complete, an error occurs or the interaction is
terminated by a safety response or user decision.
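
The loop described above can be sketched in a few lines of Python.
This is a minimal sketch under stated assumptions, not the Gemini API
client: `Action`, `Observation`, `run_agent_loop` and the stubbed
model and browser in `demo` are all hypothetical names. In a real
agent, `propose_action` would call the model and `execute` would drive
a browser. Termination by a safety response is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One UI action proposed by the model (names are illustrative)."""
    name: str
    args: dict = field(default_factory=dict)
    needs_confirmation: bool = False


@dataclass
class Observation:
    """What the client sends back after each action."""
    screenshot: bytes
    url: str


def run_agent_loop(task, propose_action, execute, confirm,
                   first_observation, max_steps=20):
    """Run the propose/execute loop and return (status, executed actions)."""
    history = []
    observation = first_observation
    for _ in range(max_steps):
        # The model sees the request, the latest screenshot and URL,
        # and the history of actions taken so far.
        action = propose_action(task, observation, history)
        if action is None:
            return "complete", history
        if action.needs_confirmation and not confirm(action):
            return "stopped_by_user", history
        try:
            # Client-side code performs the action and captures a new
            # screenshot and URL, which restart the loop.
            observation = execute(action)
        except Exception:
            return "error", history
        history.append(action)
    return "step_limit", history


def demo(approve=True):
    """Drive the loop with a scripted stand-in for the model."""
    script = [
        Action("click", {"x": 10, "y": 20}),
        Action("type_text", {"text": "hello"}),
        Action("submit_order", needs_confirmation=True),
    ]

    def propose_action(task, observation, history):
        return script[len(history)] if len(history) < len(script) else None

    def execute(action):
        return Observation(b"", f"https://example.test/after-{action.name}")

    return run_agent_loop(
        "demo task", propose_action, execute,
        confirm=lambda action: approve,
        first_observation=Observation(b"", "https://example.test/"),
    )


status, history = demo()
print(status, [a.name for a in history])
```

The `max_steps` cap is a design choice of this sketch rather than
something the post specifies; it keeps a confused agent from looping
indefinitely.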

The Gemini 2.5 Computer Use model is primarily optimized for web
browsers, but also demonstrates strong promise for mobile UI control
tasks. It is not yet optimized for desktop OS-level control.

Check out a few demos below to see the model in action (shown here at
3X speed).

Prompt: "From https://tinyurl.com/pet-care-signup, get all details
for any pet with a California residency and add them as a guest in my
spa CRM at https://pet-luxe-spa.web.app/. Then, set up a follow up
visit appointment with the specialist Anima Lavar for October 10th
anytime after 8am. The reason for the visit is the same as their
requested treatment."

Prompt: "My art club brainstormed tasks ahead of our fair. The board
is chaotic and I need your help organizing the tasks into some
categories I created. Go to sticky-note-jam.web.app and ensure notes
are clearly in the right sections. Drag them there if not."

How it performs

The Gemini 2.5 Computer Use model demonstrates strong performance on
multiple web and mobile control benchmarks. The table below includes
results from self-reported numbers, evaluations run by Browserbase
and evaluations we ran ourselves. Evaluation details are available in
the Gemini 2.5 Computer Use evaluation info and in Browserbase's blog
post. Unless otherwise indicated, scores shown are for computer use
tools exposed via API.

Figure: Gemini 2.5 Computer Use outperforms leading alternatives on
multiple benchmarks. The benchmark performance table shows Gemini 2.5
Computer Use leading on the Online-Mind2Web, WebVoyager, and
AndroidWorld benchmarks.

The model offers leading quality for browser control at the lowest
latency, as measured by performance on the Browserbase harness for
Online-Mind2Web.

Figure: Gemini 2.5 Computer Use delivers high accuracy while
maintaining low latency. In the latency vs. quality scatterplot,
Gemini 2.5 Computer Use is lowest in latency and highest in accuracy
(70%+ accuracy, ~225 sec latency).

How we approached safety

We believe that the only way to build agents that will benefit
everyone is to be responsible from the start. AI agents that control
computers introduce unique risks, including intentional misuse by
users, unexpected model behavior, and prompt injections and scams in
the web environment. Thus, it is critical to implement safety
guardrails with care.

We have trained safety features directly into the model to address
these three key risks (described in the Gemini 2.5 Computer Use
System Card).

We also provide developers with safety controls that let them prevent
the model from auto-completing potentially high-risk or harmful
actions. Examples of these actions include harming a system's
integrity, compromising security, bypassing CAPTCHAs, or controlling
medical devices. The controls are:

  * Per-step safety service: An out-of-model, inference-time safety
    service that assesses each action the model proposes before it's
    executed.
  * System instructions: Developers can further specify that the
    agent either refuses or asks for user confirmation before it
    takes specific kinds of high-stakes actions. (Example in
    documentation).
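
To make the refuse-or-confirm idea concrete, here is a minimal
client-side sketch that reviews each proposed action before it runs.
It is an assumption-laden illustration, not Google's per-step safety
service or its system-instruction mechanism: the action names, the
`Decision` enum, and the `review_action` and `may_execute` helpers are
all hypothetical.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"
    BLOCK = "block"


# Hypothetical policy: the action names are invented for this example.
BLOCKED_ACTIONS = {"solve_captcha"}
CONFIRM_ACTIONS = {"purchase", "send_payment"}


def review_action(name, blocked=BLOCKED_ACTIONS, needs_confirmation=CONFIRM_ACTIONS):
    """Classify a proposed action before the client executes it."""
    if name in blocked:
        return Decision.BLOCK
    if name in needs_confirmation:
        return Decision.CONFIRM
    return Decision.ALLOW


def may_execute(name, ask_user):
    """Return True only if the action is allowed or the user approves it."""
    decision = review_action(name)
    if decision is Decision.BLOCK:
        return False
    if decision is Decision.CONFIRM:
        # ask_user is any callable that shows the action to the end
        # user and returns True or False.
        return bool(ask_user(name))
    return True


print(may_execute("click", ask_user=lambda name: False))
print(may_execute("purchase", ask_user=lambda name: False))
```

Checking blocked actions first means a user approval can never
override a refusal, which is the ordering this sketch assumes a
developer would want for high-stakes actions.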

Additional recommendations for developers on safety measures and best
practices can be found in our documentation. While these safeguards
are designed to reduce risk, we urge all developers to thoroughly
test their systems before launch.

How early testers have used it

Google teams have already deployed the model to production for use
cases including UI testing, which can make software development
significantly faster. Versions of this model have also been powering
Project Mariner, the Firebase Testing Agent, and some agentic
capabilities in AI Mode in Search.

Users from our early access program have also been testing the model
to power personal assistants, workflow automation, and UI testing,
and have seen strong results.

How to get started

Starting today, the model is available in public preview, accessible
via the Gemini API on Google AI Studio and Vertex AI.

  * Try it now: In a demo environment hosted by Browserbase.
  * Start building: Dive into our reference and documentation (see
    Vertex AI docs for enterprise use) to learn how to build your own
    agent loop locally with Playwright or in a cloud VM with
    Browserbase.
  * Join the community: We're excited to see what you build. Share
    feedback and help guide our roadmap in our Developer Forum.

