Gemini 3.0 Computer Use: Google's FULLY FREE Browser Use AI Agent! Automate ANYTHING! (Ranked #1)

By WorldofAI

Summary

Topics Covered

  • Gemini 3.0 Tops Computer Use Benchmarks
  • Agents Automate CRM End-to-End
  • AI Reorganizes Whiteboards Autonomously
  • Anti-Gravity IDE Enables Live Automation
  • Extract Events into Tables Live

Full Transcript

Just a month ago, Google introduced its very own computer use model, built on top of Gemini 2.5 Pro. This specialized model leverages Gemini's advanced visual understanding and reasoning capabilities to power agents that can directly interact with user interfaces, both on the web and on mobile. On benchmarks for web and mobile control it already outperformed competing models, all while running at low latency on Gemini 2.5 Pro. But Google has since taken things further with the launch of two new models in the Gemini 3.0 series, drastically improving the computer use agent, and the results are honestly insane. Gemini 3.0 is exceptionally strong at computer use tasks as well as UI automation. For example, Gemini 3.0 Flash scores 81.2% on MMMU-Pro, a leading multimodal understanding benchmark, and 69.1% on a screen-understanding benchmark, putting it ahead of many top-tier proprietary models. On the Stagehand evaluation, it ranks first overall, proving to be both the most accurate model and the fastest when measured by speed per task.

Just take a look at the computer use agent in action. The agent starts off by navigating to an intake form and visually understanding the layout of the CRM dashboard page. It reads the form fields, extracts all the relevant pet and owner details, and applies logical filtering to identify only the pets with California residency.
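The filter-and-map step described here can be sketched in plain Python. This is a toy illustration with made-up records and field names, not the demo's actual CRM schema:

```python
# Hypothetical intake-form records; names and fields are invented for illustration.
intake = [
    {"pet": "Biscuit", "owner": "Ana Ruiz", "state": "CA", "treatment": "dental cleaning"},
    {"pet": "Mochi",   "owner": "Ken Sato", "state": "OR", "treatment": "vaccination"},
    {"pet": "Pepper",  "owner": "Lee Wong", "state": "CA", "treatment": "checkup"},
]

def california_residents(records):
    """Filtering step: keep only the pets with California residency."""
    return [r for r in records if r["state"] == "CA"]

def to_crm_profile(record):
    """Mapping step: rename intake fields to (made-up) CRM field names."""
    return {
        "guest_name": record["pet"],
        "primary_contact": record["owner"],
        "requested_service": record["treatment"],
    }

profiles = [to_crm_profile(r) for r in california_residents(intake)]
print([p["guest_name"] for p in profiles])  # ['Biscuit', 'Pepper']
```

The point of the demo, of course, is that the agent does this visually through the UI rather than with structured data like this.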

Next, it opens up the system and logs in just like a human would. It maps the extracted information to the correct CRM fields, creates the new guest profiles, and verifies that each record is successfully added, and you can see it doing this quite fast on a local computer with Gemini 3.0 powering it. Once the guests have been created, the agent moves to the scheduling interface, selects the correct specialist, finds the available time slots, and schedules a follow-up meeting using the same treatment details originally requested. All of this happens end to end through the user interface, with no APIs and no custom integrations, and it's something you can access completely for free.

But before we get started, allow me to introduce today's video sponsor, Zapier. As we wrap up this year, the biggest advantage you can give yourself for 2026 isn't working harder; it's building better systems. One Zapier workflow I'm setting up right now is my 2026 planning

and execution system. Every form submission, email click, or inbound request gets captured automatically, enriched, and qualified, and Zapier orchestrates what happens next. For example, inside Slack, Zapier can automatically create support tickets, route them to the right team, and even resolve them using AI agents, end to end, without manual handoffs, all thanks to Zapier workflows. This is what AI orchestration actually looks like. With 8,000-plus integrations, Zapier connects the tools you already use and runs logic reliably at scale. So, if you want to start 2026 with systems that actually run your work for you, build it with Zapier. Use the links in the description below and start automating today.

Here is another demo, where the agent opens up a shared digital whiteboard and visually scans all the sticky notes. It understands the text on each note, reasons about the task each one represents, and matches it to the correct category, sorting the notes

in three different categories: promotion, setup, and volunteers. If any notes are out of place, the agent physically drags and drops them into the correct spot, reorganizing the board in real time. By the end, you can see it has done a great job of keeping the workspace clean, structured, and ready to use, all done autonomously thanks to the computer use agent.
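A toy stand-in for this categorization step can be written with simple keyword rules. The real agent uses the model's semantic reasoning rather than anything this naive, and the keywords and notes below are invented for illustration; only the three category names come from the demo:

```python
# Keyword rules standing in for the model's semantic matching (illustrative only).
CATEGORIES = {
    "promotion":  ["flyer", "social", "poster", "announce"],
    "setup":      ["tables", "chairs", "stage", "equipment"],
    "volunteers": ["signup", "shift", "recruit", "greeter"],
}

def categorize(note: str) -> str:
    """Return the first category whose keywords appear in the note, else 'uncategorized'."""
    text = note.lower()
    for category, keywords in CATEGORIES.items():
        if any(k in text for k in keywords):
            return category
    return "uncategorized"

board = ["Design the event poster", "Recruit greeters for the door", "Rent tables and chairs"]
print({note: categorize(note) for note in board})
```

The drag-and-drop reorganization is then just applying this mapping through the UI instead of in code.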

So, to access these models, you can do so in a couple of ways. One is Browserbase, an AI browser automation framework: Google has a partnership under which the model is hosted within the Browserbase framework, and you can use Gemini 3.0 Flash through it completely for free to automate web-based tasks. You can also deploy this locally and access it through your API key. Another way is through Google AI Studio. And lastly, you can use it through Google's Antigravity. This is Google's IDE,

which uses the computer use agent powered by Gemini 3.0 Flash. You can have it powered by Gemini 3.0 Pro instead, but it works better with Flash, bringing faster, more accurate UI automation directly into the editor. To showcase this new computer use model in action, you can have it review a pull request on GitHub, and you'll see how fast and accurate this model is. In this case, it is asked to find the most recent open, non-draft PR in Browserbase's Stagehand project on GitHub and to make sure the PR's validation checks have passed. You can see that all of the actions are performed and logged: clicking into one of the PRs (something like "add focus map to preserve semantic elements"), performing tool calls to access a certain section, going through the checks, and validating that the PR's validation run has passed. Once it has confirmed that, you can see it completes the task quite quickly, thanks to Gemini 3.0 Flash powering the computer use model.
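The selection logic behind that request, the most recent open non-draft PR whose checks passed, is easy to sketch. The PR data below is made up for illustration and only loosely shaped like GitHub's pull-request objects; it is not taken from the real Stagehand repo:

```python
from datetime import datetime

# Hypothetical PR metadata, loosely shaped like GitHub API pull-request objects.
prs = [
    {"number": 101, "draft": True,  "state": "open",   "created_at": "2025-11-20T10:00:00Z", "checks_passed": True},
    {"number": 102, "draft": False, "state": "open",   "created_at": "2025-11-22T09:30:00Z", "checks_passed": True},
    {"number": 103, "draft": False, "state": "closed", "created_at": "2025-11-25T08:00:00Z", "checks_passed": False},
]

def newest_open_non_draft(pull_requests):
    """Selection step: most recently created PR that is open and not a draft."""
    candidates = [p for p in pull_requests if p["state"] == "open" and not p["draft"]]
    return max(
        candidates,
        key=lambda p: datetime.fromisoformat(p["created_at"].replace("Z", "+00:00")),
    )

pr = newest_open_non_draft(prs)
print(pr["number"], pr["checks_passed"])  # 102 True
```

The agent does the equivalent by clicking through the PR list and the checks tab visually, with no API calls.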

Here's another prompt, where I tell it to go over to my YouTube channel and find me the most popular video on it. Once you send in the prompt, it deploys the computer use agent and navigates to my channel really quickly. This is something I ran with the previous computer use model, and it took a long time just to get to this stage; here, it took only about ten seconds. It then navigates to the Videos tab to find the most popular video, and it is smart enough to click the Popular tab, whereas the other models weren't. You can see it has found that the Gemini 3.5 video is the most popular one on the channel.

Now, this framework powered by Gemini 3.0 computer use can also be deployed locally with the Browserbase framework. That is one option, or you can use Stagehand, their open-source tool, which supports the Gemini computer use model as well. Another way to access it is simply heading over to Google AI Studio and using it through Build mode or through the studio itself, where you can invoke its computer-use

capabilities whenever you're working on tasks that require them. Another way to use the computer use model with Gemini 3.0 Flash is within Antigravity, Google's free IDE. You can send any sort of prompt to the agent manager, and it will show a live preview of what it's actually doing. In this case, I'm having it extract contents off a website such as Wikipedia, and you can allow it to open things like a browser. It shows a live preview of what it's doing, heading over to the Wikipedia page, and it highlights where it is actually clicking. You can communicate with it live, in action, to check whether it's doing the correct thing; if it isn't, you can tell it to work a certain way, for example, to focus on clicking or automating a certain section of the screen.
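Under the hood, every one of these computer-use agents runs the same observe, decide, act loop: look at the screen, ask the model for the next UI action, execute it, repeat. Here is a stubbed sketch of that control flow; the model and the browser are replaced by hand-written stand-ins, so none of this is the actual Gemini API:

```python
# Stubbed observe -> decide -> act loop; illustrates control flow only.
def model_decide(goal, page_state):
    """Stand-in for the model: returns the next UI action from a fixed script."""
    script = {
        "home":    {"action": "click", "target": "Popular tab"},
        "popular": {"action": "read",  "target": "top result"},
    }
    return script.get(page_state, {"action": "done"})

def execute(action, page_state):
    """Stand-in for the browser: applies the action and returns the new page state."""
    transitions = {("home", "click"): "popular", ("popular", "read"): "result"}
    return transitions.get((page_state, action["action"]), page_state)

def run_agent(goal, page_state="home", max_steps=5):
    trace = []
    for _ in range(max_steps):
        action = model_decide(goal, page_state)
        if action["action"] == "done":
            break
        trace.append(action["action"])
        page_state = execute(action, page_state)
    return trace, page_state

print(run_agent("find most popular video"))  # (['click', 'read'], 'result')
```

In the real system, `page_state` is a screenshot, `model_decide` is a Gemini 3.0 call, and `execute` drives a real browser; the loop structure is the same.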

So, here's a task I'm going to complete within Antigravity's agent manager: I tell it to go over to any public university website and find all upcoming AI-related events happening in the next 60 days. For each event, it should collect the title, date, time, location, and virtual link. Essentially, once the information is extracted, it will organize the details into a clean table, sorted by date as well. This way, it sorts through all the events on different dates and orders them starting from the one happening soonest, and you can watch it live as it scrapes and finds all the content. What's nice is that it shows what it's talking about in a live preview, so you can view a screenshot of where it's at and confirm whether

it should continue with that step. Currently, for example, the task requires multi-page navigation, and you can see that, thanks to the computer use agent, it's able to navigate and use its capabilities to find the correct content we're looking for. It also uses semantic reasoning to decide what counts as an AI-related event, and it can even handle PDF event pages as well as calendars. Now, what's crazy is that it has saved all the events in JSON format, and not only that, it has created an HTML page to showcase all the events; I believe it is currently debugging the page to load the correct JSON data directly within the HTML file. And there we go. It looks like it is still working on it, but it has now saved all of the most recent events happening at that university, like the Jarvis challenge, which is probably the closest one, on January 5th, 2026; right now, it is the 27th of 2025.
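The filter-and-sort step the agent performs on the scraped events can be sketched like this. The locations, links, and the two extra events below are invented placeholders; only the Jarvis challenge, its January 5th, 2026 date, and the "27th" reference date come from the demo:

```python
from datetime import date, timedelta

today = date(2025, 11, 27)  # reference date mentioned in the video
# Hypothetical scraped events; only "Jarvis challenge" and its date are from the demo.
events = [
    {"title": "Jarvis challenge", "date": date(2026, 1, 5),   "location": "Main Hall", "link": "https://example.edu/jarvis"},
    {"title": "AI ethics panel",  "date": date(2026, 3, 15),  "location": "Online",    "link": "https://example.edu/ethics"},
    {"title": "ML reading group", "date": date(2025, 12, 10), "location": "Room 204",  "link": "https://example.edu/ml"},
]

def upcoming(events, today, window_days=60):
    """Keep events within the next `window_days`, sorted soonest first."""
    cutoff = today + timedelta(days=window_days)
    return sorted(
        (e for e in events if today <= e["date"] <= cutoff),
        key=lambda e: e["date"],
    )

for e in upcoming(events, today):
    print(f"{e['date']}  {e['title']:<16} {e['location']}")
```

With a 60-day window from November 27th, the March panel falls outside the cutoff, and the remaining events come out ordered from the one happening soonest, which is exactly the table the agent builds.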

But you can see the quality of output from this computer use agent built directly into Antigravity. So, you have a lot of ways to access all of these different tools completely for free and get the best automation out of them. Or you can consider joining our private Discord, where you can access multiple subscriptions to different AI tools for free on a monthly basis, plus daily AI news, exclusive content, and a lot more. If you like this video and would love to support the channel, you can consider donating through the Super Thanks option below. But that's basically it, guys, for today's video on the new Gemini computer use. I'll leave all these links in the description below so you can easily get started. With that, thank you guys so much for watching. Make sure you subscribe to the second channel if you haven't already, join the newsletter, join the Discord, follow me on Twitter, and lastly, make sure you subscribe, turn on the notification bell, like this video, and take a look at our previous videos, because there is a lot of content you will truly benefit from. With that thought, guys, have an amazing day, spread positivity, and I'll see you really shortly.
