Proactive Agents – Kath Korevec, Google Labs

By AI Engineer

Full Transcript

[music]

I'm so excited to be here. I love New York and I love meeting everybody here. I'm Kath Korevec. I'm from Google Labs, where I work on a little team called ADA, and I'm going to be talking about some of the stuff we've been doing on a project called Jules. So, a few months ago in my household, our dishwasher broke. And while it was being repaired, my husband decided that he was going to do all the dishes. He told me he was going to do this, but

every single night I found myself reminding him to do the dishes. And you can imagine that got old pretty fast. I realized that even though I wasn't physically washing the dishes, I was still carrying the mental load, and I know a lot of you can probably relate to this. I was keeping track of whether or not the task was done, following up, making sure that things kept moving. And I realized in that moment that that's exactly where we are with asynchronous

agents today. They can handle some of the work, but as developers we're still the ones carrying that mental load and monitoring them. So here's the truth: humans are serial processors, not parallel ones. We can juggle multiple goals, but we execute them in sequence, not all at once. When you manually kick off a task in Jules, you're usually waiting to be able to move on. And it's that pause, that gap in attention, where we really lose momentum. This

is actually backed up by science: humans think we're multitaskers, but we're actually just switching between many tasks very rapidly. And switching between tasks comes with a huge cost. It can cost up to 40% of your productive time; that's like half a day lost to switching contexts and reloading. So if humans are unitaskers, what's the solution here with agents? For async agents to succeed, developers can't be expected to babysit them.

We've all seen that post on Twitter of 16 different Claude Code tasks running in parallel on 16 different terminals across three huge monitors. And when I first saw it, I thought, God forbid that is the DevX of the future. I don't want to manage work. I don't want to manage my agents. I want to be a coder. I want to build. So we need collaborators in our system that we can trust: agents that really

understand context, can anticipate our needs, and know when to step in. And I think we're finally reaching the point with models where they're getting better and better at executing end to end, as long as they clearly understand what our goals are. That's where trust becomes the unlock: you can trust the system to know what's missing, to fill in the gaps, and to keep progress moving forward while you

focus on what matters most. And essentially, we want Jules to do the dishes without being asked. So most AI developer tools today are fundamentally reactive. You open up your CLI or your IDE and you ask the agent to do something and it responds, or it waits for you to start typing and then it autocompletes a suggestion. There's a benefit to this model: it's very efficient, and it only uses compute when you explicitly ask for it. But the real question I'm asking myself is, is this

how I want to manage AI? Think about the future and imagine a world where compute is not a limiting factor anymore. Instead of a single reactive assistant waiting for instructions, you could have dozens of small proactive agents working with you in parallel, quietly looking for patterns, noticing friction, and taking on the boring tasks that you don't want to do before you even ask. They can do things like fixing the authentication bugs you've been avoiding, updating configs, flagging

potential errors, and preparing migrations, and all of this can happen in the background, triggered by things in my natural workflow. So, I really think there are four essential ingredients that make up proactive systems today. First, there's observation: the agent has to continually understand what is happening, what your code changes are, what your patterns are, what your workflow is, and so on, to get context about your entire project.

And then there's personalization, and this one's difficult. It has to learn how you work, what you care about, what you tend to ignore, what your preferences are, the code that you absolutely never want it to touch. Then it has to be timely as well. If it comes in too soon, it's going to interrupt you, and if it's too late, then the moment is lost. And it also has to work seamlessly across your workflow. It has to insert itself into the spaces where you already naturally work,

in your terminal, in your repository, in your IDE, not forcing you to go somewhere else, to some application that's secret or that you forgot about. So bringing all these tools together, you can imagine, is not trivial. [laughter] So is running this presentation. You want to be able to ask your agent to understand your workflow, anticipate your needs, and then intervene at exactly the right moment without breaking your flow. And that's when it really starts to feel like magic.
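
As a thought experiment, here's a minimal sketch of how those four ingredients could compose into a single agent loop. Everything here is hypothetical (the ProactiveAgent class, the stubbed observe and timing checks); it illustrates the pattern, not how Jules is implemented.

```python
# Hypothetical sketch: the four ingredients of a proactive system
# (observation, personalization, timeliness, workflow integration)
# composed into one loop. Names and logic are invented for illustration.
from dataclasses import dataclass

@dataclass
class Suggestion:
    title: str
    rationale: str
    confidence: float  # 0.0 to 1.0

class ProactiveAgent:
    def __init__(self, preferences: dict):
        # Personalization: per-user preferences learned over time.
        self.preferences = preferences

    def observe(self) -> list[Suggestion]:
        # Observation: watch diffs, tests, TODOs, etc. (stubbed here).
        return [Suggestion("Add missing tests for auth module",
                           "auth.py changed with no test updates", 0.9)]

    def is_good_moment(self) -> bool:
        # Timeliness: surface work at a natural pause, e.g. after a
        # commit, never mid-keystroke (stubbed here).
        return True

    def surface(self, s: Suggestion) -> None:
        # Workflow integration: post where the user already works
        # (terminal, repo, IDE); printing stands in for that.
        print(f"[{s.confidence:.0%}] {s.title}: {s.rationale}")

    def run_once(self) -> None:
        for s in self.observe():
            if s.title in self.preferences.get("ignored", []):
                continue  # skip what this user tends to ignore
            if self.is_good_moment():
                self.surface(s)

ProactiveAgent(preferences={"ignored": []}).run_once()
```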

The interesting thing is that these proactive systems are all around us today. One of my favorite examples is Google Nest, where you put it in your house, you install it, you configure it, and then it starts to learn your habits as you leave the house, as you come back, as you go to sleep, as you wake up in the morning. And then pretty soon, you don't have to think about climate control in your house anymore, because it's learned what

your habits are. Another one is your own body. Your heart rate elevates as you go for a run or start to work out, or it anticipates that you're about to fall, and it reacts before you consciously think, I'm going to put my hand out. So when you look at it like that, proactivity for AI is actually not that futuristic. It's very familiar and it is very human, and that's exactly the point. What we're building is tools that behave more like a good collaborator and less

like command line utilities. So we're already doing this in a tool called Jules, which is a proactive, asynchronous, autonomous coding agent from Google Labs. And we're doing this in three levels of proactivity. Level one is where collaboration really starts to emerge, and this is how Jules works today, where it can detect things like missing tests, unused dependencies, and unsafe patterns, and then it starts to automatically fix those things as it's doing the other

tasks that you've asked it to do. This is sort of like an attentive sous chef in your workflow: it's keeping the kitchen clean, the knives sharp, the pantry stocked, so that you can focus on what comes next. And that's the beginning of proactive software. At level two, the agent becomes more contextually aware of the entire project. It observes how you work and the code you write. If you're a back-end engineer, maybe you need help with React. If you're a designer, maybe it'll

help write the database schema. And then it learns what your frameworks are and what your deployment style is, and so on. This is the kitchen manager, the person in your workflow keeping the rhythm and anticipating what you need next. And then comes level three. This is what we're working on pretty hard right now going into December, and I'll show you a little bit of what we're going to be shipping in

December in a minute. But level three is where things start to converge around that context. It's where the agent starts to understand not just context, but also consequence: how these choices are actually affecting the users of your products, the performance, and the outcomes. And at that level, we have Jules. We also have an agent called Stitch, which is a design agent, and another one we're building called Insights, which is a data agent, and they're all coming together to build

this collective intelligence across your application. Jules can see what's breaking in the software, Stitch understands how users are interacting with it, and Insights connects behaviors to real-world signals like analytics, telemetry, and conversion rates. Together, they can propose improvements across the boundaries of how the whole system works, doing things like performance fixes to improve UX and design changes to prevent regressions, all of it organized based

on live data. So the trick here is that the human stays firmly in the loop. You're observing what the agents are doing, you're refining when you need to intervene, and you're redirecting them when they've been misdirected. So level three isn't really about autonomy anymore; it's actually about alignment to your project: agents and humans collaborating together across the full life cycle of your project. So right now, Jules is focused on this

code awareness piece. It understands the environment, the frameworks, and the project structures, and we're moving toward more of that system awareness. So, among the things we're introducing in Jules now: we've added something called memory, which I'm sure a lot of you are familiar with. It's the ability for Jules to write its own memories; you can edit them and interact with them, it can edit them too, and it builds up this memory, context, and knowledge of your project as you work with it.
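
As a rough illustration of that idea (and only an illustration; this is not Jules's actual memory format or storage), an agent memory can be as simple as an editable notes file that gets loaded back into each task's context:

```python
# Hypothetical sketch of an editable agent memory: a plain text file of
# notes the agent appends to and the user can open and edit directly.
from pathlib import Path

MEMORY_FILE = Path("agent_memory.md")  # assumed location, not Jules's real path

def remember(note: str) -> None:
    """Append a memory; the user can edit or delete lines later."""
    with MEMORY_FILE.open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")

def recall() -> str:
    """Load memories so they can be prepended to a task's context."""
    return MEMORY_FILE.read_text(encoding="utf-8") if MEMORY_FILE.exists() else ""

remember("This repo uses pnpm, not npm.")
remember("Never touch files under legacy/.")
print(recall())
```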

We've added a critic agent, which works adversarially with Jules to make sure the code is high quality, but then also does a full code review. And then we've added verification, where Jules will write a Playwright script, take a screenshot, and then put that back into the trajectory for you to validate.
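
For a sense of what that kind of verification step can look like, here's a minimal Playwright sketch; the URL and output path are placeholders, and this is not the script Jules actually generates:

```python
# Minimal Playwright verification sketch: load the app and capture a
# screenshot that can be attached back to the task for human review.
# The URL and file name are placeholders, not anything Jules emits.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("http://localhost:3000")  # assumed local dev server
    page.screenshot(path="verification.png", full_page=True)
    browser.close()
```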

We're also adding a to-do bot that will look through your code and your repository, pick up on anything where you've said "this is a to-do I want to get to in the future," and start to proactively work on those things with that context. We're also adding things like best practices, where Jules will understand best practices and start to suggest them, and environment setup: we have an environment agent that we use internally for running evals, and we're extending that externally to better understand how your environments work and to set those up for you.
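
To make the to-do idea concrete, a bare-bones scanner might just walk the repository collecting TODO comments to turn into candidate tasks. This is an invented sketch, not how Jules's to-do bot is built:

```python
# Illustrative TODO scanner: walk a repo and collect TODO/FIXME comments
# that a proactive agent could turn into candidate tasks.
import re
from pathlib import Path

TODO_RE = re.compile(r"(?:#|//)\s*(TODO|FIXME)[:\s](.+)", re.IGNORECASE)

def find_todos(root: str = ".") -> list[tuple[str, int, str]]:
    hits = []
    for path in Path(root).rglob("*.py"):  # extend the glob for other languages
        try:
            for lineno, line in enumerate(path.read_text(encoding="utf-8").splitlines(), 1):
                m = TODO_RE.search(line)
                if m:
                    hits.append((str(path), lineno, m.group(2).strip()))
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
    return hits

for file, line, text in find_todos():
    print(f"{file}:{line}: {text}")
```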

And then we're also adding something called just-in-time context. It's like a Jules cheat sheet: if it's doing something very specific and gets stuck, it can immediately look at that cheat sheet instead of reaching out to you. So, this is all moving Jules very close to being that proactive teammate, not just a reactive assistant.

Okay, so this morning I was talking to my team back in San Francisco and I was thinking, okay, I'm going to do a live demo, but the live demo gods did not align with me this morning; we still have CLs being pushed to staging right now. So, I'm going to walk you through a little bit of this instead. And if you know Jed, he's going to be talking tomorrow, I think, and we're going to affectionately try to fix Jed's code here. So this is a view of proactivity, and this is Jules, where you prompt it, and the first thing that you do when you configure and enable proactivity is Jules will

index your entire codebase. It'll index your directory and start looking for things that it can do, and then those will show up on the screen. So right here we're looking at this repository, ADK Python. It's indexed the repository and it's found a bunch of to-dos, it's found a bunch of best practices that it can update, and it's giving me some signal about what it's finding. And you can see the signal is high confidence, medium confidence, and low.
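
In code terms, that triage might look something like the sketch below; the thresholds are invented, and the green/purple/yellow grouping just mirrors what's on screen in the demo:

```python
# Hypothetical confidence bucketing for proactive suggestions.
# Thresholds are invented; the colors mirror the demo's UI grouping.
def bucket(confidence: float) -> str:
    if confidence >= 0.8:
        return "high (green)"
    if confidence >= 0.5:
        return "medium (purple)"
    return "low (yellow)"

for task, score in [("Fix flaky auth test", 0.92),
                    ("Update deprecated API call", 0.61),
                    ("Rewrite legacy module", 0.30)]:
    print(f"{bucket(score):15s}  {task}")
```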

And so it's actually telling me what it thinks it can achieve based on what's in my code and what it wants to do: it has high confidence in green, medium in purple, and low in yellow way down at the bottom. [clears throat] And so I can go through this and manually click these and say I want to start them. I don't have to think about the prompt, I don't have to look at the code; it's a lot less cognitive load here. We're working on something to just

start these automatically, and that's coming in the future. But I can also delete these. I can say, "Hey, this one isn't for me, isn't good." And once it gets started on a task, I can drill into it and see a little bit more. I can peek into the code that it's suggesting it work on, I can find the location of that code, and it also gives me some rationale about why it wants to work on that code and what it's doing,

etc. And so it's giving me a lot more context and helping me trust that it knows what to do here. Okay, so that's proactivity. That's coming in December, and hopefully we'll be able to give it to everybody here. We're very excited about it. And I want to tell you a little story about something my husband and I were working on, just to wrap things up. We tinker a bunch with hardware, and we live on this slow street in the middle of San Francisco, in the Haight-Ashbury

district. And so on Halloween we get a lot of people walking by our house, and we were trying to take advantage of that with our Halloween decorations. So we built this six-foot animatronic head that sits in front of our house, this old Victorian house, and he sculpted it out of foam, epoxy, and fiberglass. Our kids lovingly called it the bald head. It's based, if you ever saw Pee-wee Herman from the '80s, it's based

off of the head from Pee-wee's Big Adventure. So while my husband was doing this, I was spending my time working with Jules on updating the firmware, controlling the stepper motors, and working on the LEDs and the sensors. And for me, the fun part is really getting creative with what the LEDs are doing. So I wanted to focus on that, the LED animations, but I ended up spending most of my time actually fixing bugs and swapping libraries and doing things like

that. So what I would do is prompt Jules, wait 10 minutes, and then repeat. And I found that process very, very tedious. What I actually wanted was for Jules to do the research. I wanted it to handle the ugly parts: researching how to fix a bug, doing the debugging itself. And I wanted it to do this so that I could focus on the creative parts. I wanted the eyes to move and follow people as they walk down the street, and to have lasers coming out

of its eyes and stuff. Like I mentioned, it was Halloween; it was very scary. But I couldn't really do as much of that, and I ended up not shipping as much as I wanted to with this animatronic bald head. And so it's that gap that we actually want to close with Jules: the space between tool friction and creative freedom that we're trying to unlock with these kinds of proactive agents. So what I really want you to take

away from this, and I give this advice to the folks on the Jules team a lot, is that the products we build today won't be the products we have in the future. And I think a lot of us know that, but in reality I want everybody in this room, and everyone building and working with AI, to be able to take those big steps. I think the patterns that we rely on today, Git, your IDEs, even the code, how we think about the code itself, might not

exist a year from now, might not exist six months from now. And that's the exciting part for me: we get to invent the future right now. We get to describe and decide how software is made and built, all the people in this room. So my challenge to you is to not be afraid to question the old ways of how you're building software, because the future is coming faster than any of us know. It's probably already here, and the cool thing is we get to build it together. Thank

you. [music]
