Demis Hassabis & Josh Woodward tell us why Gemini 3.0 puts Google in front of the A.I. race
By Hard Fork
Summary
## Key takeaways

- **Gemini 3's custom interfaces**: Gemini 3 builds custom interfaces for questions, like an interactive tutorial on Vincent van Gogh with images and interactive elements, or a mortgage calculator for homes over a million dollars. [02:23], [02:39]
- **Benchmark leap on a hard exam**: Gemini 3 Pro scores 37.5% on Humanity's Last Exam, a graduate-level interdisciplinary test, up from 21.6% for Gemini 2.5 Pro, beating the previous model across more than a dozen benchmarks. [03:05], [03:26]
- **Inbox-management agent**: The Gemini agent looks through your inbox, understands its contents, proposes replies, organizes emails, and helps users get their inboxes under control in ways they couldn't before. [04:05], [04:17]
- **Enhanced multi-step reasoning**: Gemini 3 excels at reasoning and thinking many steps at once without losing its train of thought, unlike previous models. [08:14], [08:21]
- **AGI timeline unchanged**: Gemini 3 is on track with expected progress toward AGI, still 5 to 10 years away, with one or two more breakthroughs required in consistency, reasoning, memory, and world models. [11:15], [12:14]
- **No AI bubble for Google**: While some AI seed investments look bubbly, Google's integration of AI into multi-billion-user products like Workspace, Android, and YouTube offers immediate returns, plus massive potential in greenfield areas like robotics and drug discovery. [21:34], [22:03]
Topics Covered
- Gemini Builds Custom Interfaces
- Reasoning Excels at Multi-Step Thinking
- AGI Timeline Stays 5-10 Years
- AI Bubble Hits Overvalued Startups
Full Transcript
Well, Casey, we have a special emergency podcast episode today about the launch of Gemini 3.
>> Yes, Kevin, hotly awaited, much discussed among AI nerds here in Silicon Valley. We are finally about to get our hands on the genuine article.
>> Yeah. So, normally we wouldn't break our Friday publication schedule to publish a special episode just about a new model coming out from one of the big AI companies. They're releasing models all the time. But there are a couple reasons that we thought it was worth doing this this week, to talk about this model, Gemini 3, in particular. The first is that we got some time with Demis Hassabis and Josh Woodward, two of the leading AI executives at Google. Demis, of course, is the CEO of Google DeepMind, which is their in-house AI lab. And Josh Woodward is the VP of the Gemini team and some other stuff there at Google. So we were excited to talk to them and ask them about this big new model release. But I think there are a couple other reasons we were interested in doing this as well.
>> Yeah. I mean, one big thing, Kevin, is just that maybe more than other model releases, this one seems to have the attention of Google's competitors. We're hearing a lot of whispers from folks who work at other AI labs that it seems like Gemini 3 has managed to figure some things out in a way that may be bad for their businesses. And I think around the AI industry, there's sort of this feeling that Google kind of struggled in AI for a couple years there. They had the launch of Bard and the first versions of Gemini, which had some issues. And I think they were seen as sort of catching up to the state of the art. And now I think the question is: is this them returning to the top of the AI leaderboard? Is this them taking their crown back? So we'll get into all that with Demis and Josh. But let's just talk, Casey, about what we know about Gemini 3. They held a briefing early this week and told us a little bit about the new model and what it can do. So what did we learn about Gemini 3?
>> Yeah. Well, so in terms of what it can do, which is always the most interesting to me, Google shared a few different things. One, in addition to saying all the things you would expect, like it's better at coding and it's better at vibe coding, it's also going to do some new things around generating interfaces for you when you ask it a question. So nowadays, you ask most chatbots a question, and they'll spit back an answer in text, maybe show you an image. According to the Google folks, Gemini 3 is just going to start building custom interfaces for you. So they showed an example where somebody wanted to learn about Vincent van Gogh, the painter, and Gemini 3 just sort of coded up an interactive tutorial that had all sorts of images and interactive elements. They showed another example that involved building a mortgage calculator for buying a home over a million dollars, which is the lowest amount of money that anyone at Google can imagine spending on a home. So these are the kinds of things that you can expect to find in Gemini 3, Kevin.
>> Yeah. So I would say the theme of the briefing, and of the materials that Google shared ahead of the Gemini 3 launch, was: this is just kind of better than their last model, Gemini 2.5 Pro, in basically all respects. Some of the benchmarks caught my attention. One was this benchmark test called Humanity's Last Exam, which is a very hard interdisciplinary exam that consists of a bunch of questions at basically a graduate-student or PhD level. Their previous model, Gemini 2.5 Pro, got about a 21.6% on that test. And Gemini 3 Pro gets a 37.5%. That's basically the story of all of these benchmarks. They gave more than a dozen examples of various benchmarks where the new model just beats the old one handily. And, you know, to a lot of people, I think that may not matter. Most people who are using Google's AI products are probably not out there trying to solve novel problems in physics. But their basic pitch for this is just: this is a state-of-the-art model. Anything that you could do with ChatGPT or Claude or even the older versions of Gemini, you can do better with Gemini 3 Pro. They also talked about testing what they're calling the Gemini agent, which is going to be able to do one thing in particular that I've been waiting for somebody to do forever, which is look through your inbox, understand its contents, propose replies, organize emails together, and really help you get your inbox under control in a way that I, personally, have never been able to. So, we basically only saw a few animated GIFs of that, but it will definitely be one of the first things I try when I get my hands on Gemini 3.
>> Yeah. And they are not, we should say, rolling this out to everyone right away. It's going to be available this week for users in the Gemini app and also in AI Mode, which is sort of the tab off to the side of the main Google search engine. It will also be available for developers in various products. But they're not saying when this will come to things like the Gemini integrations in Google Docs or Gmail, these very popular things that are used by billions of people a day. But I thought it was interesting that they have brought this model to Google Search, albeit in this AI Mode that's not the main search bar. That, to me, suggests that they feel like they can serve this model cheaply enough to make it potentially something that billions of people could use, and that it would not melt their servers and incur billions of dollars of costs.
>> Yeah. So far, they say that usage keeps going up for AI Overviews, and every quarter they continue to make more money. So it seems to be working out for them.
>> Not working out for the rest of the web, but it's working out well for Google.
>> Yeah. But I think that's obviously Google's big advantage here over their competitors: they have products that are used by billions of people a day, and they can kind of shove Gemini 3 into those products over time and just get more and more usage, and get more data, and use that to improve their models.
>> Which is why we always tell students, when they ask us for advice: step one, build an illegal monopoly.
>> Yes. And speaking of students, the other notable announcement that Google is making this week is that they are giving all US college students a year of free access to a paid version of Gemini. Which I think is a smart move. I feel a little gross about it. Like, essentially telling students, hey, why don't you use this to maybe do some of your homework? Maybe help you with your exams. We'll give you the first hit for free.
>> Yeah. You know, I was also struck during the briefing that we had this morning that I believe three different people used the phrase "learn anything." This seems like it has become a very prominent plank of Google's messaging: they are presenting Gemini as a learning tool. Which maybe is just sort of a euphemism for a do-your-homework tool. I don't know.
>> Yes. Okay. So that is what we know about Gemini 3. We will be doing our own testing and reviewing of Gemini 3 once it is fully out on Tuesday. But for now, we wanted to just kind of give you the basics and also bring you our interview with Demis Hassabis and Josh Woodward of Google DeepMind. And before we get to that, we should obviously make our AI disclosures. I work for the New York Times Company, which is suing OpenAI and Microsoft over the training of large language models.
>> And my boyfriend works at Anthropic.
>> Demis and Josh, welcome to Hard Fork.
>> Great to be here.
>> So, two years ago, Sundar Pichai told us that Bard, rest in peace, was a souped-up Civic that was in a race with more powerful cars. What kind of car is Gemini 3?
>> That's a good one. Demis, do you want to take it?
>> Well, I hope it's a bit faster than a Honda Civic. You know, I don't really think of it in terms of cars. Maybe it's one of those cool drag racers.
>> Yeah. So, people are really excited about this model. We have been hearing from folks that have been early testing it. Obviously you guys have shown off a lot of the benchmarks. Very impressive. What can Gemini 3 do, on a concrete level, that previous AI models couldn't?
>> Well, I'll jump in. Maybe a couple of things stand out. One, we're starting to see this model really excel at reasoning and being able to think many steps at the same time. Sometimes models in the past would lose their train of thought, lose track. This one's way better at that. The other thing you'll see tomorrow as well is all kinds of new generative interfaces. This is our best model yet at being able to create new types of interfaces. It gives people a really custom sort of design and answer to their questions. And then maybe the third thing I would say is we've put a lot of investment into coding itself. And so a lot of the coding examples, and some new products coming out like Google Antigravity, will also showcase that.
>> There's been some discussion that, for average users, the chat use case can feel solved, that average users of products like Gemini almost can't even think of a question to ask that will generate something that feels meaningfully different from what they were able to get with the last model. To what extent does that feel true to you with Gemini 3? And to what extent do you think average folks are really going to notice a difference?
>> Yeah, one of the things we're seeing in some of the testing, and Demis, feel free to chime in too, is that for us this is a model that's more concise. It's more expressive. It starts to present information in a way that's much easier to understand. And I think for most people that's going to be a big immediate effect. And then I think what starts to get interesting is how these models start to interact with other types of information. So we talk a lot about how students are going to be able to learn with this model, or even how this model can connect, with your permission, to other types of data you might have in other Google products. These are the ways I think we're starting to show it's going beyond just the standard text Q&A back and forth.
>> Yeah, I think I'd add to that just its general reliability on things; you'll notice that when you use it. I think also we worked quite hard on the persona, as we call it internally, the style of it. I think it's more succinct. I think it's more to the point. It's helpful. I feel like it's got a better style about it. I find it more pleasant to brainstorm with and use. And then I think there are various things where there's almost a step change. I feel like it's crossed a sort of threshold of usefulness on things like vibe coding. I've been getting back into my games programming. I'm going to set myself some projects over Christmas on that, because I feel like it's actually got to a point where it's incredibly useful and capable on front end and things like this that perhaps previous versions weren't so good at.
>> Demis, the last time we had you on the show, in May, you said that you think we're 5 to 10 years away from AGI and that there might be a few significant breakthroughs needed between here and there. Has Gemini 3, and observing how good it is, changed any of those timelines? Or does it incorporate any of those breakthroughs that you thought would be necessary?
>> No, I think it's sort of dead on track, if you see what I mean. We're really happy with this progress. I think it's an absolutely amazing model, and it's right on track with what I was expecting and the trajectory we've been on, actually, for the last couple of years since the beginning of Gemini, which I think has been the fastest progress of anybody in the industry. And I think we're going to continue on that trajectory, and we expect that to continue. But on top of that, I still think there'll be one or two more things that are required to really get the consistency across the board that you'd expect from a general intelligence, and also improvements still on reasoning, on memory. And perhaps things like world-model ideas that, as you know, we're working on with SIMA and Genie. They will build on top of Gemini, but extend it in various ways. And I think some of those ideas are going to be required as well to fully solve physical intelligence and things like that. So, both are true. I'm really happy with the progress of Gemini 3. I think people are going to be pretty pleasantly surprised. But it's on track with what we were expecting the progress to be, and I think that means still 5 to 10 years, with perhaps one or two more breakthroughs required.
>> You mentioned Gemini 3's style. There's been a lot of discussion recently about AI companions and the relationships people are developing with them. How do you think about Gemini 3's personality, and what kind of relationship do you want users to have with it?
>> I would say in the app itself, Casey, we're really interested in, we see it on the team a lot as almost like a tool, something you're using to kind of work through and cut through your day. And so whether it's helping on different types of questions you have or helping you create things, that's really where we see it excelling, and the direction we want to see it go. I think if you zoom out, if you look at Gemini or some of our other projects like NotebookLM or Flow, we're really trying to think through how AI can be this superpower, this super tool in your toolbox that you can use, whether it's for writing or researching or creating films or whatnot. And so that's really more where we're focused. I think over time we're really interested, on the team, in being able to track things like: how many tasks did we help you complete in your day? That's a new type of metric that I think we get excited about, and it's sort of the way original Google Search worked. You would come to it, you would try to get an answer or be sent to a page, and move on from there.
>> Well, that all sounds very good and responsible, but I'm wondering about all the viral engagement you're leaving on the table by not making this thing an erotic companion. Big oversight.
>> No comment.
>> Some of your competitors have been very nervous in the days and weeks leading up to Gemini 3. I think they've started hearing the same rumblings that we have about this model being quite good, and maybe the narrative shifting from Google playing catch-up in AI to now being on top of the race, or at least in a leadership position. Do you feel like Google is ahead in the AI race right now?
>> Look, as you guys know very well, it's a ferocious, competitive environment, probably the most competitive there's ever been. So one can never, you know, really the only important thing is your rate of progress from where you are, and that's what we're focusing on, and we're very happy about that. I mean, I don't really see it as, like, we're back in the lead or something like that. We've always pioneered the research part of this. I think it's about getting into our groove in making sure that that's reflected downstream in all of our products, and I think we're really getting into our stride there. I think you saw that actually at the last I/O, I would say. And we're getting better and better at that, with GDM being sort of the engine room of Google. And of course there's the Gemini app, there's NotebookLM, these AI-first products, but there's also powering up all these amazing existing Google products, whether that's Maps, YouTube, Android, Search of course, with AI-first features, and in some cases reimagining things from an AI-first perspective with, often, Gemini under the hood. And that's going amazingly well, and I think we're only midway through that evolution, but it's very exciting to see how much value and excitement our users are getting when they see each of those new features in, for example, Workspace and Gmail and so on. There are almost endless possibilities there. So, we're really excited about that, as well as all of these AI-first products that we're also imagining and prototyping.
>> We had a historian on the show last week who was using an unreleased Google model in AI Studio, and it had sort of blown his mind with how it was able to transcribe these very old documents and reason correctly about, you know, what were the measurements of the sugar in this sort of 1800s fur trade in Canada. Do you think you can tell us, once and for all: was this man using Gemini 3?
>> Not sure about that one.
>> Okay. I will say, though, the model is quite amazing at making these connections, and I don't know if the historian was using photos of old documents or diaries or whatnot.
>> That's what he was doing.
>> It's very good at this. And for someone like me, who has pretty poor handwriting, you could give it a page of notes and it'll take that and run with it, no problem, no sweat.
>> So, you mentioned on this call that you're going to be integrating this into Search, in the AI Mode that is a side tab on the main Google search engine. Does that mean that you found a way to serve this model more efficiently and cheaply than previous models?
>> I think we're always on the cutting edge there. I feel like the thing we do really well, apart from the overall performance of our models and getting better and better at that, is the efficiency of our models, and the distillation techniques and many other techniques that we created and pioneered that we're now putting to use. Obviously it's necessary for us, because we have extreme use cases like AI Overviews and others where we have to serve billions of users. And then of course some of our cloud and enterprise customers really appreciate that cost efficiency too. So we've always tried to be on this Pareto frontier of cost to performance, wherever you want to be on that frontier. If you value performance most, or if you value cost the most, there'll be one of the models in the model family for you. So of course we're only announcing Pro today, but we are also working on the other models in the family for the 3.0 era. So you'll see a lot more about that pretty soon.
>> Yeah. It seems like every time we see the release of a new frontier model, we get to revisit the discussion about scaling laws and whether we are beginning to see diminishing returns, and I can predict a few Twitter accounts that will probably have something to say about this over the next few days. So I thought I would just ask you, before we have that discourse: how are you guys thinking about that in relation to Gemini 3?
>> Yeah, we're very happy with the progress Gemini 3 represents over 2.5. So I would say, actually referencing what we discussed earlier, that the progress is basically what we were expecting and on track, and we're really pleased with it. But that's not to say there aren't some kind of diminishing returns. When people hear "diminishing returns," they think it's either zero or exponential, right? But there's also in between. So there can be diminishing returns: it's not going to exponentially double with every era, but it's still well worth doing, right? An extremely good return on that investment. So, I think we're in that era. And then, as I said, my suspicion is, although we'll see, that still one or two more research breakthroughs are required to get all the way to AGI. But in the meantime, you're obviously going to need as-scaled-as-possible versions of these foundation models, multimodal foundation models, that we're building today and are still seeing great progress on.
>> Which of the many benchmarks that you showed off today do you feel is going to matter most to the average user?
>> Oh, that's a good question. I think most people don't look at the benchmarks as closely as we do, but the benchmarks are always a proxy, right? So, you look at something like cracking 1500 Elo on LMArena. That's great, but what really matters is the user satisfaction in the products, too. And I think what's been encouraging to us is that these are still moving in the same direction. They're good proxies for each other. And so, ultimately, we'll put out all the benchmarks, and we're very proud of them, and they represent amazing progress, but you also have to be able to translate that into product experiences that matter. And so, we try to do both with every one of these releases.
>> Any new dangerous capabilities or safety concerns that come with the increased power of the model?
>> Well, we've taken quite a long time on this model, because it's frontier and, you know, has some new capabilities, and it's very capable, as you can see from the benchmarks. And as Josh said, we make sure not to overindex internally on those benchmarks. They're just a proxy for overall performance, and that's why we care about them across the board, and then ultimately about how our users experience the models. But we spend a lot of time on safety testing, across all the different dimensions, with the safety institutes and also external testers that we work with, as well as, of course, doing a ton of internal testing. So I would say this is our most thoroughly tested model so far.
>> Do you want to mention any of those sort of new capabilities that popped up, whether or not it was for a safety thing? Was there something in there where you thought, "Okay, yeah, we definitely need to make sure we're sending this to a bunch of..."
>> Well, look, it's just making sure... we've worked really hard on things like tool-call usage and function calling and these kinds of things. Obviously, they're super important for coding capabilities, and developers want that and so on, and it's very important in general for reasoning. But it also makes the models more capable for riskier things too, like cyber. So we have to be, you know, doubly cautious as we improve those dimensions for all the good use cases, and we're continually checking on all those kinds of measures so that they can't be misused.
>> Are we in an AI bubble?
>> I think it's too binary a question, I would say. My view on this, and this is strictly my own opinion, is that there are some parts of the AI industry that are probably in a bubble. You know, if you look at seed investment rounds being multi-ten-billion-dollar rounds with basically nothing, it seems, I mean, there are talented teams, but it seems like that might be the first signs of some kind of bubble. On the other hand, I think there's a lot of amazing work and value, at least from our perspective. Not only are there all the new product areas, so the Gemini app, NotebookLM, but, thinking further forward, robotics, gaming. I mean, there are incredible uses of not just Gemini but some of our other models, Genie. You can imagine, with my old games background, I'm itching to think about what could be done there. And drug discovery, which we're doing with Isomorphic, and Waymo. So there are all these new greenfield areas. They're going to take a while to mature into massive multi-hundred-billion-dollar businesses, but I think there's actually potential for half a dozen to a dozen of those that Alphabet will be involved with, which I'm really excited about. But also immediate returns. We've got, of course, the engine room, you know, this is the engine room part of Google, where we're pushing this into all of these incredible multi-billion-user products that people use every day. And we have so many ideas there. It's just about execution. Like, how would you reorganize Workspace around that? Android, YouTube, there's just so much potential there. And I think a lot of that will also bring in near-term revenue and direct returns while we're also investing in the future. Not to speak of, you know, cloud revenue and TPUs and all of that, which I think is also going to be huge. So I feel really good about where we are as Alphabet, whether there's a bubble or not. I think our job is to be winning in both cases, right? If there's no bubble and things carry on, then we're going to take advantage of that opportunity. But if there is some sort of bubble and there's a retrenchment, I think we'll also be best placed to take advantage of that scenario as well.
>> All right. Let's imagine it's Thanksgiving coming up, and it's the Bay Area, and one of our listeners, you know, changes the subject from politics, which is upsetting everyone, to AI, to give people something to be excited about, and someone says, hey, I heard Gemini 3 just came out, what can it actually do? What's the example that you would have our listeners show their friends, whether it's on their phone or their laptop, to say, get a load of this, and save Thanksgiving?
>> Yeah, I don't know if it'll save Thanksgiving, but it could probably provide some laughs, you know. Our imagery models in Gemini are still best in the world. So what I would say is: grab your phone, it can be, you know, iPhone, Android, doesn't matter. Pull it out, take a selfie, put yourself in an image, and edit it. People are still doing that in huge amounts. And it's great fun. And then I think you can show off any other capabilities in the new Gemini 3 alongside it. So, this is what we're seeing: people coming for a lot of these interesting use cases and then starting to try other parts of the app, too.
>> You heard it here. Nano Banana will save Thanksgiving dinner.
>> Gentlemen, thank you. It's great to talk. And thanks for making the time.
>> Thanks. Thanks for having us.
>> Oh, good. Thank you.