LIVE: Nvidia CEO Jensen Huang Keynote Address
By Bloomberg Television
Summary
## Key takeaways
- **Nvidia's New Computing Model: Accelerated Computing**: Nvidia introduced a new computing model, accelerated computing, to address problems that general-purpose computers cannot solve, especially as Moore's Law has slowed down. This new model requires reinventing algorithms and rewriting applications, a process that has taken nearly 30 years. [03:05], [04:23]
- **Nvidia Arc: Revolutionizing 6G and AI**: Nvidia is partnering with Nokia to develop Nvidia Arc, a new product line for 6G telecommunications. This technology integrates AI for RAN (Radio Access Network) to improve spectral efficiency and enables cloud computing at the edge for wireless telecommunications. [14:14], [15:13]
- **Quantum Computing's Next Leap: NVQLink and CUDA-Q**: Nvidia is advancing quantum computing with NVQLink, an interconnect architecture directly connecting quantum processors with GPUs for error correction and AI calibration. This, along with the extended CUDA-Q platform, enables hybrid simulations and a fused quantum-classical accelerated supercomputing platform. [21:47], [24:04]
- **AI as Work, Not Just Tools**: Unlike previous technologies that created tools, AI represents 'work' itself, acting as workers that can utilize tools. This shift allows AI to address new segments of the economy and significantly boost productivity, especially in light of a global labor shortage. [33:08], [34:36]
- **AI Factories: The Future of Manufacturing**: Nvidia is building 'AI factories': specialized, high-throughput systems designed to produce valuable AI tokens cost-effectively. These factories are a fundamental shift from traditional data centers and are essential for scaling AI development and deployment. [37:07], [46:19]
- **Blackwell GPUs Drive AI Growth and Manufacturing**: The new Blackwell GPUs, manufactured in America, are the engine of the AI age, with projected cumulative sales of half a trillion dollars through 2026. This signifies a major advancement in AI infrastructure and a return to large-scale manufacturing in the US. [58:01], [01:00:03]
Topics Covered
- AI is "Work," Not Just a Tool, Reshaping Global Economy.
- AI Factories: Meeting Exponential Compute Demand Beyond Moore's Law.
- Extreme Co-Design: 10x Performance, 10x Lower Cost AI.
- Physical AI: Orchestrating Robots in Digital Twin Factories.
- Robo-Taxis: A Global Computing Platform on Wheels.
Full Transcript
AI factories are rising. Built in
America
for scientists, engineers, and dreamers
across universities,
startups, and industry.
>> I think we want to try to reach new
heights as a civilization,
discovering the nature of the universe.
>> And now, American innovators are
clearing the way for abundance:
saving lives,
shaping vision into reality,
lending us a hand,
and delivering the future.
We will soon power it all with unlimited
clean energy.
We will extend humanity's reach
to the stars.
This is America's next Apollo moment.
Together,
we take the next great leap to boldly go
where no one has gone before.
And here
is where it all begins.
Welcome to the stage, Nvidia founder and
CEO, Jensen Huang.
Washington DC.
Washington DC. Welcome to GTC.
It's hard not to be sentimental and
proud of America. I got to tell you
that. Was that video amazing?
Thank you.
Nvidia's creative team does an amazing
job. Welcome to GTC. We have a lot to
cover with you today. Um, GTC is where
we talk about industry,
science, computing,
the present, and the future. So, I've
got a lot to cover with you today, but
before I start, I want to thank all of
our partners who helped sponsor this
great event. You'll see all of them
around the show. They're here to meet
with you and, uh, it's really great. We
couldn't do what we do without all of
our ecosystem partners. Uh, this is the
Super Bowl of AI, people say. And
therefore, every Super Bowl should have
an amazing pregame show. What do you
guys think about the pregame show and
our all-star athletes and all-star cast? Look
at these guys.
Somehow I turned out the buffest. What
do you guys think?
I don't know if I had something to do
with that.
Nvidia invented a new computing model, the first in 60 years. As you saw in the video, a new computing model rarely comes about; it takes an enormous amount of time and a particular set of conditions. We invented this computing model because we wanted to solve problems that general-purpose computers, normal computers, could not. We also observed that someday the number of transistors would continue to grow, but the performance and power of transistors would slow down: that Moore's Law would not continue forever, limited by the laws of physics. And that moment has now arrived. Dennard scaling, it's called Dennard scaling, stopped nearly a decade ago, and in fact transistor performance and its associated power have slowed tremendously, even as the number of transistors has continued to grow. We made this observation a long time ago: apply parallel computing, add that to a sequential-processing CPU, and we could extend the capabilities of computing well beyond, well beyond. And that moment has really come. We have now seen that inflection point. Accelerated computing's moment has now arrived. However, accelerated computing is a fundamentally different programming model. You can't just take CPU software, written by hand and executed sequentially, put it onto a GPU, and have it run properly. In fact, if you just did that, it would actually run slower. And so you have
to reinvent new algorithms, you have to
create new libraries, you have to in
fact rewrite the application, which is
the reason why it's taken so long. It's
taken us nearly 30 years to get here.
But we did it one domain at a time.
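To make the porting point concrete, here is a minimal sketch (illustrative only, not NVIDIA code) of the same computation written two ways: the hand-coded sequential style that suits a CPU, and the whole-array form that an accelerated library can map onto thousands of parallel threads. NumPy stands in here for a CUDA-X-style library.

```python
# A minimal sketch: the same computation expressed two ways. The
# sequential loop mirrors hand-written CPU code; the vectorized version
# mirrors the rewrite an accelerated library expects. The algorithm,
# not just the hardware, has to change.
import numpy as np

def saxpy_sequential(a, x, y):
    # One element at a time: the hand-coded, CPU style.
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out

def saxpy_parallel(a, x, y):
    # Whole-array expression: the form a GPU library can spread
    # across thousands of threads at once.
    return a * x + y

x = np.arange(1_000_000, dtype=np.float32)
y = np.ones_like(x)
assert np.allclose(saxpy_sequential(2.0, x[:4], y[:4]),
                   saxpy_parallel(2.0, x[:4], y[:4]))
```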
This is the treasure of our company.
Most people talk about the GPU. The GPU is important, but without a programming model that sits on top of it, and without dedication to that programming model, keeping it compatible across generations (we're now on CUDA 13, with CUDA 14 coming up), there would not be hundreds of millions of GPUs running in every single computer, perfectly compatible. If we didn't do that, then developers wouldn't target this computing platform. If we didn't create these libraries, then developers wouldn't know how to use the algorithms and use the architecture to its fullest.
One application after another. I mean, this is really the treasure of our company. cuLitho, computational lithography: it took us nearly seven years to get here with cuLitho, and now TSMC uses it, Samsung uses it, ASML uses it. This is an incredible library for computational lithography, the first step of making a chip. Sparse solvers for CAE
applications.
cuOpt, a numerical optimization library, has broken just about every single record: the traveling salesperson problem, how to connect millions of products with millions of customers in the supply chain. Warp, a Python solver for CUDA simulation. cuDF, a dataframe approach, basically accelerating SQL dataframe databases. Um, and this library is the one that started AI altogether: cuDNN. The library on top of it, called Megatron Core, made it possible for us to simulate and train extremely large language models. The list goes on. MONAI, really, really important: the number one medical-imaging AI framework in the world. Uh, by
way, we're not going to talk a lot about
healthcare today, but be sure to see
Kimberly's keynote. She's going to talk
a lot about the work that we do in
healthcare. And the list goes on. Uh, genomics processing. Aerial, pay attention: we're going to do something really important here today. Um, cuQuantum, quantum computing. These are just representative of 350 different
libraries in our company. And each one
of these libraries redesigned the
algorithm necessary for accelerated
computing. Each one of these libraries
made it possible for all of the
ecosystem partners to take advantage of
accelerated computing. And each one of
these libraries opened new markets for
us. Let's take a look at what CUDA X can
do.
Ready? Go.
[The CUDA-X demo video plays.]
Is that amazing?
Everything you saw was a simulation.
There was no art, no animation. This is
the beauty of mathematics. This is deep
computer science, deep math, and it's
just incredible how beautiful it is.
Every industry was covered, from healthcare and life sciences to manufacturing, robotics, autonomous vehicles, computer graphics, even video games. That first shot that you saw was the first application Nvidia ever ran, and that's where we started, in 1993. And we kept believing in what we were trying to do. It's hard to imagine that you could see that first Virtua Fighter scene come alive, and that same company believed that we would be here today. It's just a really, really
incredible journey. I want to thank all
the NVIDIA employees for everything that
you've done. It's really incredible.
We have a lot of industries to cover
today. I'm going to cover AI,
6G,
quantum,
models,
enterprise computing, robotics, and
factories. Let's get started. We have a
lot to cover, a lot of big announcements
to make, and a lot of new partners that will very much surprise you.
Telecommunications is the backbone, the lifeblood of our economy, our industries, our national security. And yet, since the beginning of wireless, when we defined the technology, we defined the global standards, and we exported American technology all around the world so that the world could build on top of American technology and standards, it has been a long time since that's happened. Wireless technology around the world today is largely deployed on foreign technologies. Our fundamental communication fabric is built on foreign technologies. That has to stop, and we have an opportunity to do that, especially during this fundamental platform shift. As you know, computer technology is at the foundation of literally every single industry. It is the single most important instrument of science. It's the single most important instrument of industry.
And I just said we're going through a
platform shift. That platform shift
should be the once-in-a-lifetime opportunity for us to get back into the game, for us to start innovating with American technology. Today, today we're
announcing that we're going to do that.
We have a big partnership with Nokia.
Nokia is the second largest
telecommunications maker in the world.
It's a three trillion dollar industry.
Infrastructure is hundreds of billions
of dollars. There are millions of base
stations around the world.
If we could partner, we could build on
top of this incredible new technology
fundamentally based on accelerated
computing and AI. and for United States,
for America to be at the center of the
next revolution in 6G. So today we're
announcing that Nvidia has a new product line. It's called Nvidia Arc, the Aerial Radio Network Computer: the Aerial RAN Computer, ARC. Arc is built from three fundamental new technologies: the Grace CPU, the Blackwell GPU, and our Mellanox ConnectX networking, designed for this application. And all of that makes it possible for us to run this library, the CUDA-X library I mentioned earlier, called Aerial. Aerial is essentially a wireless communication system running on top of CUDA-X. We're going to create, for the first time, a software-defined programmable computer that's able to communicate wirelessly and do AI processing at the same time. This is completely revolutionary. We call it Nvidia Arc. And Nokia
is going to work with us to integrate
our technology, rewrite their stack.
This is a company with 7,000 fundamental
essential 5G patents.
Hard to imagine any greater leader in
telecommunications. So, we're going to
partner with Nokia. They're going to
make Nvidia Arc their future base
station. Nvidia Arc is also compatible
with Airscale, the current Nokia base
stations. So what that means is we're
going to take this new technology and
we'll be able to upgrade millions of
base stations around the world with 6G
and AI. Now, 6G and AI are really quite fundamental, in the sense that for the first time we'll be able to use AI technology, AI for RAN, to make radio communications more spectrally efficient: using artificial intelligence and reinforcement learning to adjust the beamforming in real time, in context, depending on the surroundings, the traffic, the mobility, the weather. All of that can be taken into account so that we can improve spectral efficiency. Wireless networks consume about 1.5 to 2% of the world's power, so improving spectral efficiency not only increases the amount of data we can put through wireless networks, it does so without increasing the amount of energy necessary. The other thing we can do, beyond AI for RAN, is AI on RAN. This is a brand new opportunity. Remember, the internet enabled communications, but amazingly smart companies, AWS among them, built a cloud computing system on top of the internet. We are now going to do the same thing on top of the wireless telecommunications network. This new cloud will be an edge industrial robotics cloud. So the first is AI for RAN, to improve radio spectrum efficiency. The second is AI on RAN: essentially cloud computing for wireless telecommunications.
Cloud computing will be able to go right out to the edge, where data centers are not, because we have base stations all over the world. This announcement is really exciting. Justin Hotard, Nokia's CEO, I think he's somewhere in the room. Thank you for your partnership. Thank you for helping the United States bring telecommunications technology back to America. This is
really a fantastic, fantastic
partnership. Thank you very much.
That's the best way to celebrate Nokia.
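As a toy illustration of the AI-for-RAN idea described above, here is a minimal epsilon-greedy loop that learns which of several candidate beam configurations yields the best measured spectral efficiency. Every name and number in it is invented for illustration; real systems condition on traffic, mobility, and weather, as the talk says.

```python
# Toy sketch (not NVIDIA Aerial code): pick among candidate beamforming
# configurations and keep the one with the best observed spectral
# efficiency. The "environment" here is a made-up reward function.
import random

BEAMS = ["narrow", "wide", "steered_left", "steered_right"]
TRUE_EFF = {"narrow": 5.1, "wide": 3.8,            # bits/s/Hz, invented
            "steered_left": 4.4, "steered_right": 6.0}

estimates = {b: 0.0 for b in BEAMS}
counts = {b: 0 for b in BEAMS}

for step in range(2000):
    # Explore 10% of the time, otherwise exploit the best estimate.
    beam = random.choice(BEAMS) if random.random() < 0.1 \
        else max(estimates, key=estimates.get)
    reward = random.gauss(TRUE_EFF[beam], 0.5)      # noisy measurement
    counts[beam] += 1
    estimates[beam] += (reward - estimates[beam]) / counts[beam]  # running mean

print("learned best beam:", max(estimates, key=estimates.get))
```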
Let's talk about quantum computing.
In 1981, quantum physicist Richard Feynman imagined a new type of computer that could simulate nature directly, because nature is quantum. He called it a quantum computer. Forty years later, the industry has made a fundamental breakthrough. Forty years later, just last year: a fundamental breakthrough. It is
now possible to make one
logical qubit. One logical qubit. One logical qubit that's coherent, stable, and error-corrected, at last. Now, that one logical qubit consists of sometimes tens, sometimes hundreds of physical qubits all working together. As you know, qubits, these particles, are incredibly fragile. They can become unstable very easily: any observation, any sampling, any environmental condition causes them to decohere. And so it takes extraordinarily well-controlled environments, and now also a lot of different physical qubits working together, for us to do error correction on what are called auxiliary or syndrome qubits, to error-correct them and infer what the logical qubit's state is.
There are all kinds of different types of quantum computers: superconducting, photonic, trapped-ion, neutral-atom, all kinds of different ways to create a quantum computer. Well, we now realize that it's essential to connect a quantum computer directly to a GPU supercomputer, so that we can do the error correction, so that we can do the artificial-intelligence calibration and control of the quantum computer, and so that we can do simulations collectively, working together: the right algorithms running on the GPUs, the right algorithms running on the QPUs, and the two processors, the two computers, working side by side. This is the future of quantum computing. Let's take a look.
There are many ways to build a quantum computer. Each uses qubits, quantum bits, as its core building block. But no matter the method, all qubits, whether superconducting qubits, trapped ions, neutral atoms, or photons, share the same challenge: they're fragile and extremely sensitive to noise. Today's qubits remain stable for only a few hundred operations, but solving meaningful problems requires trillions of operations. The answer is quantum error correction. Measuring disturbs a qubit, which destroys the information inside it. The trick is to add extra, entangled qubits, so that measuring them gives us enough information to calculate where errors occurred without damaging the qubits we care about. It's brilliant, but it needs beyond-state-of-the-art conventional compute.
That's why we built NVQLink, a new
interconnect architecture that directly
connects quantum processors with NVIDIA
GPUs.
Quantum error correction requires reading out information from qubits, calculating where errors occurred, and sending data back to correct them.
NVQLink is capable of moving terabytes of data to and from quantum hardware, the thousands of times every second needed for quantum error correction.
At its heart is CUDA-Q, our open platform for quantum-GPU computing. Using NVQLink and CUDA-Q, researchers will be able to do more than just error correction: they will also be able to orchestrate quantum devices and AI supercomputers to run quantum-GPU applications.
Quantum computing won't replace classical systems; they will work together, fused into one accelerated quantum supercomputing platform.
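A minimal classical sketch of the error-correction loop the video describes, using a 3-qubit repetition code: read out syndrome parities without measuring the logical value, decode, and correct. This is an illustration under simplifying assumptions, not CUDA-Q or NVQLink code; real deployments use far larger codes and must close this loop in microseconds.

```python
# Encode one logical bit in three physical bits, read out syndromes
# without touching the data directly, decode, and correct. A classical
# toy of the quantum error-correction loop, for illustration only.
import random

def encode(bit):
    return [bit, bit, bit]                 # 1 logical bit -> 3 physical bits

def apply_noise(q, p=0.05):
    return [b ^ (random.random() < p) for b in q]   # independent bit flips

def syndromes(q):
    # Parity checks compare neighbors; they never reveal the logical value.
    return (q[0] ^ q[1], q[1] ^ q[2])

def correct(q):
    s = syndromes(q)
    if s == (1, 0):   q[0] ^= 1            # the decoder flips whichever
    elif s == (1, 1): q[1] ^= 1            # bit the syndrome pattern
    elif s == (0, 1): q[2] ^= 1            # points at
    return q

failures = 0
for _ in range(100_000):
    q = correct(apply_noise(encode(0)))
    failures += (sum(q) >= 2)              # majority disagrees with 0
print("logical error rate:", failures / 100_000)  # far below the 5% physical rate
```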
Wow, this is a really long stage.
You know, CEOs, we don't just sit at our desks typing. This is a physical job. So, today we're announcing NVQLink, and it's made possible by two things. Of course, this interconnect does quantum computer control and calibration, and quantum error correction, as well as connecting the two computers, the QPU and our GPU supercomputers, to do hybrid simulations.
It is also completely scalable. It doesn't just do error correction for today's handful of qubits; it does error correction for tomorrow, when we're going to scale these quantum computers up from the hundreds of qubits we have today to the tens of thousands, hundreds of thousands of qubits of the future. So we now have an architecture that can do control, co-simulation, quantum error correction, and scale into that future.
The industry support has been incredible, starting with the invention of CUDA-Q. Remember, CUDA was designed for GPU-CPU accelerated computing: basically using both processors, the right tool for the right job. Now CUDA-Q has been extended beyond CUDA so that we can support QPUs, have the two processors, QPU and GPU, work together, and have computation move back and forth within just a few microseconds, the essential latency for cooperating with the quantum computer. So now CUDA-Q is an incredible breakthrough, adopted by so many different developers. We are announcing today 17 different quantum computer industry companies supporting NVQLink, and, I'm so excited about this, eight different DOE labs: Berkeley, Brookhaven, Fermilab in Chicago, Lincoln Laboratory, Los Alamos, Oak Ridge, Pacific Northwest, and Sandia. Just about every single DOE lab has engaged with us, working with our ecosystem of quantum computer companies and these quantum controllers, so that we can integrate quantum computing into the future of science.
Well, I have one additional announcement to make. Today, we're
announcing that the Department of Energy
is partnering with NVIDIA to build seven
new AI supercomputers to advance our
nation's science.
I have to give a shout-out to Secretary Chris Wright. He has brought so much energy to the DOE, a surge of energy, a surge of passion, to make sure that America leads science. Again, as I mentioned, computing is the fundamental instrument of science, and we are going through several platform shifts. On the one hand, we're going to accelerated computing; that's why every future supercomputer will be a GPU-based supercomputer. We're going to AI, so that AI and principled solvers work together: principled simulation, principled physics simulation, is not going to go away, but it can be augmented, enhanced, and scaled using surrogate models, AI models, working together. We also know that principled solvers, classical computing, can be enhanced to understand the state of nature using quantum computing. We also know that in the future we have so much signal, so much data we have to sample from the world, that remote sensing is more important than ever, and these laboratories are impossible to experiment in at the scale and speed we need unless they're robotic factories, robotic laboratories. So all of these different technologies are coming into science at exactly the same time. Secretary Wright understands this, and he wants the DOE to take this opportunity to supercharge itself and make sure the United States stays at the forefront of science. I want to thank all of you for that. Thank you.
Let's talk about AI.
What is AI? Most people would say that AI is a chatbot, and that's rightfully so. There's no question that ChatGPT is at the forefront of what people would consider AI. However, just as you saw right now, these scientific supercomputers are not going to run chatbots; they're going to do basic science. The world of AI is much, much more than a chatbot. Of course, the chatbot is extremely important, and AGI is fundamentally critical: deep computer science, incredible computing, great breakthroughs are still essential for AGI. But beyond that, AI is a lot more.
In fact, I'm going to describe AI in a couple of different ways. The first way to think about AI is that it has completely reinvented the computing stack. The way we used to do software was hand coding: software written by hand, running on CPUs. Today, AI is machine learning: data-intensive programming, if you will, trained and learned by the AI, running on GPUs. In order to make that happen, the entire computing stack has changed.
Notice you don't see Windows up here. You don't see a CPU up here. You see a fundamentally different stack, everything from the need for energy. And this is another area where our administration, President Trump, deserves enormous credit: his pro-energy initiative, his recognition that this industry needs energy to grow. It needs energy to advance, and we need energy to win. His recognition of that, and putting the weight of the nation behind pro-energy growth, completely changed the game. If this hadn't happened, we could have been in a bad situation. And I want to thank President Trump for that.
On top of energy are these GPUs, and these GPUs are connected into, built into, infrastructure that I'll show you later. This infrastructure consists of giant data centers, easily many times the size of this room, drawing an enormous amount of energy, which is then transformed, through this new machine called the GPU supercomputer, into numbers. These numbers are called tokens: the language, if you will, the computational unit, the vocabulary of artificial intelligence. You can tokenize almost anything. You can tokenize, of course, the English word. You can tokenize images; that's the reason you're able to recognize images or generate images. Tokenize video, tokenize 3D structures. You can tokenize chemicals and proteins and genes. You can tokenize cells: tokenize almost anything with structure, anything with information content.
Once you can tokenize it, AI can learn that language and its meaning. Once it learns the meaning of that language, it can translate. It can respond, just as you interact with ChatGPT, and it can generate, just as ChatGPT can generate. So for all of the fundamental things that you see ChatGPT do, all you have to do is imagine: what if it was a protein, what if it was a chemical, what if it was a 3D structure like a factory, what if it was a robot, and the token was understanding behavior, tokenizing motion and action? All of those concepts are basically the same, which is why AI is making such extraordinary progress. And on top of these models are applications.
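Before moving up the stack, here is a toy sketch of the tokenization idea just described: anything that can be serialized to bytes becomes a sequence of integer tokens. Real tokenizers (BPE for text, patch embeddings for images, action codebooks for robots) are learned; the one-byte-per-token scheme below is purely illustrative.

```python
# Toy sketch of "you can tokenize almost anything": a word, an
# amino-acid string, or a robot action all become integer sequences
# once serialized, and a sequence model just sees tokens.
def tokenize(payload: bytes) -> list[int]:
    return list(payload)                   # 1 token per byte, ids 0..255

def detokenize(tokens: list[int]) -> bytes:
    return bytes(tokens)

for thing in [b"hello",                    # English text
              b"MKTAYIAKQR",               # a protein fragment
              b'{"action": "grasp", "dx": 0.02}']:  # a robot action
    toks = tokenize(thing)
    assert detokenize(toks) == thing       # lossless round trip
    print(thing, "->", toks[:8], "...")
```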
Transformers: the transformer is not a universal model. It's an incredibly effective model, but there's no one universal model; it's just that AI has universal impact. There are so many different types of models. In the last several years we enjoyed the invention of, and went through the innovation breakthroughs of, multimodality. There are so many different types of models: there are CNN models, convolutional neural network models; there are state-space models; there are graph neural network models; multimodal models, of course; all the different tokenizations and token methods that I just described. You can have models that are spatial in their understanding, optimized for spatial awareness. You can have models that are optimized for long sequences, recognizing subtle information over a long period of time. There are so many different types of models.
On top of these model architectures are applications: the software of the past. And this is a profound observation about artificial intelligence: the software industry of the past was about creating tools. Excel is a tool. Word is a tool. A web browser is a tool. The reason I know these are tools is because you use them. The tools industry, just as with screwdrivers and hammers, is only so large. In the case of IT tools, they could be database tools; these IT tools amount to about a trillion dollars or so. But AI is not a tool.
AI is work.
That is the profound difference. AI is, in fact, workers that can actually use tools. One of the things I'm really excited about is the work that Aravind is doing at Perplexity: Perplexity using web browsers to book vacations or do shopping. Basically, an AI using tools. Cursor is an agentic AI system that we use at NVIDIA. Every single software engineer at NVIDIA uses Cursor. It has improved our productivity tremendously. It's basically a partner for every one of our software engineers, generating code, and it uses a tool: the tool it uses is called VS Code. So Cursor is an agentic AI system that uses VS Code.
Well, consider all of these different industries, whether it's chatbots, or digital biology where we have AI assistant researchers, or a robo-taxi. What is inside a robo-taxi? Of course it's invisible, but obviously there's an AI chauffeur. That chauffeur is doing work, and the tool it uses to do that work is the car. And so everything that we've made up until now, the whole world, everything we've made up until now, are tools: tools for us to use. For the very first time, technology is now able to do work and help us be more productive. The list of opportunities goes on and on, which is the reason why AI addresses a segment of the economy that technology has never addressed. The tools market is a few trillion dollars, and it sits underneath a hundred-trillion-dollar global economy. Now, for the first time, AI is going to engage that hundred-trillion-dollar economy and make it more productive, make it grow faster, make it larger. We have a severe shortage of
labor. Having AI that augments labor is
going to help us grow. Now, what's interesting about this from a technology-industry perspective is that, in addition to AI being a new technology that addresses new segments of the economy, AI is in itself also a new industry. These tokens, as I was explaining earlier, these numbers: after you tokenize all these different modalities of information, there has to be a factory that produces these numbers, unlike in the computer industry and the chip industry of the past. Notice, if you look at the chip industry of the past, chips represent about 5 to 10%, maybe less, 5% or so, of a multi-trillion-dollar IT industry. And the reason is that it doesn't take much computation to use Excel. It doesn't take much computation to use browsers. It doesn't take much computation to use Word. We do the computation. But in this new world, there needs to be a computer that understands context all the time. It can't precompute that, because every time you use the computer for AI, every time you ask the AI to do something, the context is different. So it has to process all of that information.
The environment, for example: in the case of a self-driving car, it has to process the context around the car. Context processing. What is the instruction you're asking the AI to carry out? Then it's got to break the problem down step by step, reason about it, come up with a plan, and execute it. Every single one of those steps requires an enormous number of tokens to be generated, which is the reason why we need a new type of system. I call it an AI factory. It's unlike a data center of the past. It's an AI factory because
this factory produces one thing, unlike the data centers of the past that do everything: store files for all of us, run all kinds of different applications. You could use that data center, like you can use your computer, for all kinds of applications. You could use it to play a game one day, use it to browse the web, use it, you know, to do your accounting. That is a computer of the past: a universal, general-purpose computer. The computer I'm talking about here is a factory. It runs basically one thing: it runs AI, and its purpose is to produce tokens that are as valuable as possible, meaning they have to be smart. And you want to produce these tokens at incredible rates, because when you ask an AI for something, you would like it to respond; and notice, during peak hours, these AIs are now responding slower and slower, because they've got a lot of work to do for a lot of people. So you want to produce valuable tokens at incredible rates, and you want to produce them cost-effectively. Every single word that I used is consistent with a factory: a car factory or any factory. It is absolutely a factory. And these factories, these factories never existed before. And inside these factories are mountains and mountains of chips.
Which brings us to today.
What happened in the last couple years?
And in fact, what happened this last
year? Something fairly profound happened
this year. Actually, if you look at the beginning of the year, everybody had some attitude about AI. That attitude was generally: this is going to be big; it's going to be the future. And somehow, a few months ago, it kicked into turbocharge. And the reason for that is several things.
The first is that, in the last couple of years, we have figured out how to make AI much, much smarter, rather than relying on pre-training alone. Pre-training basically says: let's take all of the information that humans have ever created and give it to the AI to learn from. It's essentially memorization and generalization. It's not unlike going to school back when we were kids: the first stage of learning. Pre-training, like preschool, was never meant to be the end of education. Preschool simply teaches you the basic skills of intelligence, so that you can understand how to learn everything else. Without vocabulary, without an understanding of language, how to communicate, how to think, it's impossible to learn everything else. The next is
post-training. Post-training, after pre-training, is teaching you skills: skills to solve problems, break down problems, reason about them; how to solve math problems, how to code, how to think about these problems step by step, using first-principles reasoning. And then, after that, is where computation really kicks in. As you know, many of us went to school, in my case decades ago, but ever since, I've learned more, thought about more. And the reason for that is that we're constantly grounding ourselves in new knowledge, constantly doing research, and constantly thinking. Thinking is really what intelligence is all about. And so now we have three fundamental technologies: pre-training, which still requires an enormous amount of computation; post-training, which uses even more computation; and now thinking, which puts incredible amounts of computational load on the infrastructure, because it's thinking on our behalf for every single human. So the amount of computation necessary for AI to think, for inference, is really quite extraordinary. Now, I used
to hear people say that inference is easy, that Nvidia should do training: Nvidia is really good at this, so they're going to do training, and inference was easy. How could thinking be easy? Regurgitating memorized content is easy. Regurgitating the multiplication table is easy. Thinking is hard. That is the reason why these three scales, these three new scaling laws, all of them at full steam, have put so much pressure on the amount of computation. Now, another thing has happened from these three scaling laws: we get smarter models, and these smarter models need more compute. But when you get smarter models, you get more intelligence.
People use it...
[A loud rumble sounds in the hall.]
If anything happens, I want to be the first one out. I'm sure it's fine. Probably just lunch. My stomach. Was that me?
And so, where was I? The smarter your models are, the more people use them. It's now more grounded. It's able to reason. It's able to solve problems it never learned how to solve before, because it can do research: go learn about it, come back, break it down, reason about how to answer your question, how to solve your problem, and go off and solve it. The amount of thinking is making the models more intelligent. The more intelligent it is, the more people use it. The more intelligent it is, the more computation is necessary. But here's what happened.
This last year, the AI industry turned the corner, meaning that the AI models are now smart enough: they're worthy, worthy to pay for. Nvidia pays for every license of Cursor, and we gladly do it, because Cursor is helping a several-hundred-thousand-dollar-a-year employee, a software engineer or AI researcher, be many, many times more productive. So, of course, we'd be more than happy to do that. These AI models have become good enough that they are worth paying for: Cursor, ElevenLabs, Synthesia, Abridge, OpenEvidence, the list goes on. Of course OpenAI, of course Claude. These models are now so good that people are paying for them. And because people are paying for them and using more of them, and every time they use more, you need more compute.
We now have two exponentials. One is the exponential compute requirement of the three scaling laws. The second exponential: the smarter it is, the more people use it, and the more people use it, the more computing it needs. Two exponentials, now putting pressure on the world's computational resources, at exactly the time when, as I told you earlier, Moore's Law has largely ended. And so the question is: what do we do? We have these two exponential demands growing, and if we don't find a way to drive the cost down, then this positive feedback system, this circular feedback system, essentially called the virtuous cycle, essential for almost any industry, essential for any platform industry, will stall. It was essential for Nvidia. We have now reached the virtuous cycle of CUDA.
The more applications people create, the more valuable CUDA is. The more valuable CUDA is, the more CUDA computers are purchased. The more CUDA computers are purchased, the more developers want to create applications for it. That virtuous cycle for Nvidia has now been achieved, after 30 years. And 15 years later, we've achieved that for AI as well. AI has now reached the virtuous cycle: the more you use it, because the AI is smart and we pay for it, the more profit is generated; the more profit is generated, the more compute is put on the grid, the more compute is put into AI factories; the more compute, the smarter the AI becomes; the smarter it is, the more people use it, the more applications use it, the more problems we can solve. This virtuous cycle is now spinning. What we need to do is drive the cost down tremendously, so that, one, the user experience is better, so that when you prompt the AI it responds to you much faster; and two, so that we keep this virtuous cycle going by driving its cost down, so that it can get smarter, so that more people use it, and so on. That virtuous cycle is now spinning. But how do we do that, when Moore's Law has really reached its limit? Well, the answer is called extreme co-design.
You can't just design chips and hope that what runs on top of them will go faster. The best you can do in designing chips is add, I don't know, 50% more transistors every couple of years. And you know, we can add more transistors, TSMC's got a lot of transistors, incredible company, we can just keep adding more transistors; however, that's all percentages, not exponentials. We need to compound exponentials to keep this virtuous cycle going. Extreme co-design: Nvidia is the only company in the world that literally starts from a blank sheet of paper and can think about new fundamental architecture, computer architecture, new chips, new systems, new software, new model architectures, and new applications, all at the same time. So many of the people in this room are here because you're different parts of that layer, different parts of that stack, working with NVIDIA.
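The arithmetic behind the percentages-versus-exponentials point, with illustrative rates: roughly 50% more transistors every couple of years compounds far more slowly than an order-of-magnitude generational jump from co-design.

```python
# Illustrative arithmetic only (the rates are for comparison, not
# guarantees): +50% transistors every 2 years versus a 10x jump per
# ~2-year co-designed generation, as claimed for Hopper -> Blackwell.
years = 10
transistor_only = 1.5 ** (years / 2)     # ~7.6x after a decade
codesign = 10.0 ** (years / 2)           # ~100,000x if 10x/generation held
print(f"after {years} years: transistors alone ~{transistor_only:.0f}x, "
      f"compounded co-design ~{codesign:,.0f}x")
```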
We fundamentally rearchitect everything from the ground up, and then, because AI is such a large problem, we scale it up. We created a whole computer, a computer that for the first time has scaled up into an entire rack. That's one computer, one GPU. And then we scale it out by inventing a new AI Ethernet technology we call Spectrum-X Ethernet. Everybody will say Ethernet is Ethernet; well, Spectrum-X Ethernet is hardly ordinary Ethernet. It is designed for AI performance, and that's why it's so successful. And even that's not big enough. We'll fill this entire room with AI supercomputers and GPUs, and that's still not big enough, because the number of applications and the number of users for AI continues to grow exponentially. So we connect multiple of these data centers together, and we call that scale-across: Spectrum-XGS, gigascale Spectrum-X. By
doing so, we do co-design at such an enormous level, such an extreme level, that the performance benefits are shocking. Not 50% better each generation, not 25% better each generation, but much, much more. This is the most extreme co-designed computer we've ever made, and quite frankly the most extreme made in modern times. Since the IBM System/360, I don't think a computer has been reinvented from the ground up like this, ever. This system was incredibly hard to create. I'll show you the benefits in just a second. But essentially, what we've done, essentially what we've done, we've created, otherwise...
Hey Janine, you can come out. You have to meet me, like, halfway.
All right. So, this is kind of like
Captain America shield.
So, NVLink 72, NVLink 72:
if we were to create one giant chip, one
giant GPU, this is what it would look
like. This is the level of wafer scale
processing we would have to do.
It's incredible. All of this, all of
these chips are now put into one giant
rack.
Did I do that or did somebody else do
that? Into that one giant rack.
You know, sometimes I don't feel like
I'm up here by myself.
Just
this one giant rack makes all of these
chips work together as one. It's
actually completely incredible. And I'll
show you the benefits of that. The way
it looks is this. So, thanks Janine.
I I like this. All right, ladies and
gentlemen. Janine Paul.
I got it. In the future next, I'm just
going to go like Thor.
It's like when you're at home and and
you can't reach the remote and you just
go like this and somebody brings it to
you. That's Yeah. Same idea.
It never happens to me. I'm just
dreaming about it. I'm just saying.
Okay. So, anyhow, um, this is basically what we created in the past. This is NVLink 8. Now,
these models are so gigantic. The way we solve it is we turn this model, this gigantic model, into a whole bunch of experts. It's a little bit like a team: these experts are good at certain types of problems, and we collect a whole bunch of experts together. So this giant, multi-trillion-parameter AI model has all these different experts, and we put those experts on GPUs. Now, this is NVLink 72: we can put all of the chips into one giant fabric, and every single expert can talk to every other one. The primary expert can send all of the necessary context and prompts, the bunch of tokens we have to distribute, to all of the experts. Whichever experts are selected to produce the answer then go off and respond, and that happens layer after layer after layer. Sometimes there are eight experts, sometimes 16, sometimes 64, sometimes 256; the point is, there are more and more and more experts. Well, here, with NVLink 72, we have 72 GPUs, and because of that, we can put four experts on one GPU. The most important thing each GPU needs to do is generate tokens, and that is bounded by the amount of bandwidth you have in HBM memory. Here, we have one GPU generating thinking for four experts; over there, because each one of those computers can only hold eight GPUs, we have to put 32 experts into one GPU. So that one GPU has to think for 32 experts, versus this system, where each GPU only has to think for four. And because of that, the speed difference is incredible.
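The expert-placement arithmetic from the talk, as a quick back-of-envelope check: spreading 256 experts across a scale-up domain of 8 GPUs versus 72 GPUs changes how many experts each GPU's HBM bandwidth must serve.

```python
# Back-of-envelope version of the NVLink 8 vs. NVLink 72 argument,
# using the expert counts quoted on stage.
experts = 256

for domain_gpus in (8, 72):
    per_gpu = experts / domain_gpus
    print(f"NVLink {domain_gpus:2d}: ~{per_gpu:.0f} experts per GPU")
# NVLink  8: ~32 experts per GPU  -> each GPU thinks for 32 experts
# NVLink 72: ~4  experts per GPU  -> each GPU thinks for only 4
```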
And this just came out: this is the benchmark done by SemiAnalysis. They do a really, really thorough job, and they benchmarked all of the GPUs that are benchmarkable, and it turns out that's not that many. If you look at the list of GPUs you can actually benchmark, it's like 90% Nvidia. Okay, so we're comparing ourselves to ourselves. But the second-best GPU in the world is the H200, and it runs all the workloads.
Grace Blackwell per GPU is 10 times the
performance.
Now, how do you get 10 times the performance when it's only twice the number of transistors? Well, the answer is extreme co-design. By understanding the nature of future AI models, and by thinking across that entire stack, we can create architectures for the future.
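One way to read that 10x-from-2x claim, with invented factor values (these are not NVIDIA's numbers): co-design compounds several modest gains, across transistors, numerics, the scale-up domain, and software, into one large multiplier.

```python
# Purely illustrative: every factor value below except "transistors"
# is invented. The point is structural: co-design multiplies several
# modest gains rather than relying on any single 10x ingredient.
factors = {
    "transistors":          2.0,   # roughly 2x, per the talk
    "numerics (e.g. FP4)":  1.7,   # invented
    "bigger NVLink domain": 2.0,   # invented
    "software/scheduling":  1.5,   # invented
}
speedup = 1.0
for name, f in factors.items():
    speedup *= f
print(f"compounded: ~{speedup:.0f}x")   # ~10x with no single 10x factor
```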
This is a big deal: it says we can now respond a lot faster. But this is the even bigger deal, this next one. Look at this: it says that the lowest-cost tokens in the world are generated by Grace Blackwell NVLink 72, the most expensive computer.
On the one hand, GB200 is the most expensive computer. On the other hand, its token-generation capability is so great that it produces tokens at the lowest cost, because the tokens per second divided by the total cost of ownership of Grace Blackwell is so good. It is the lowest-cost way to generate tokens. By delivering incredible performance, 10 times the performance, and 10 times lower cost, that virtuous cycle can continue.
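A sketch of that tokens-per-second-over-TCO argument, with hypothetical numbers: a machine can be the most expensive and still produce the cheapest tokens if its throughput outgrows its cost.

```python
# Hypothetical figures, for the shape of the argument only: cost per
# token = total cost of ownership rate / token throughput.
systems = {
    #            TCO $/hour   tokens/second   (both invented)
    "system_a": (50.0,          20_000),
    "system_b": (150.0,        200_000),   # 3x the cost, 10x the throughput
}
for name, (tco_per_hour, tps) in systems.items():
    cost_per_million = tco_per_hour / (tps * 3600) * 1e6
    print(f"{name}: ${cost_per_million:.3f} per million tokens")
# The pricier system_b wins on cost per token.
```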
Which brings me to this next one; I just saw this literally yesterday. This is the CSP capex. People are asking me about capex these days, and this is a good way to look at it. In fact, this is the capex of the top six CSPs: Amazon, CoreWeave, Google, Meta, Microsoft, and Oracle. These CSPs together are
going to invest this much in capex. And
And I would tell you, the timing couldn't be better. The reason is that we now have the Grace Blackwell NVLink 72 in full volume production, with the supply chain everywhere in the world manufacturing it. So we can now deliver to all of them this new architecture, so that the capex is invested in instruments, in computers, that deliver the best TCO. Now, underneath this, there are two things going on. When you look at this, it's actually fairly extraordinary, and it's fairly extraordinary anyhow, but what's happening underneath is this: there are two platform shifts happening at the same time.
One platform shift is going from general-purpose computing to accelerated computing. Remember, accelerated computing, as I mentioned before, does data processing, image processing, computer graphics, computation of all kinds. It runs SQL, it runs Spark; you tell us what you need to have run, and I'm fairly certain we have an amazing library for you. You could be a data center making masks to manufacture semiconductors; we have a great library for you. And so, underneath, irrespective of AI, the world is moving from general-purpose computing to accelerated computing. In fact, many of the CSPs already had services here long before AI; remember, they were invented in the era of machine learning.
Classical machine learning algorithms like XGBoost, algorithms like the dataframes used for recommender systems, collaborative filtering, content filtering: all of those technologies were created in the old days of general-purpose computing. Even those algorithms, even those architectures, are now better with accelerated computing. So even without AI, the world's CSPs are going to invest in acceleration. Nvidia's GPU is the only GPU that can do all of that plus AI. An ASIC might be able to do AI, but it can't do any of the others.
Nvidia can do all of that, which explains why it is so safe to lean into Nvidia's architecture. We have now reached our virtuous cycle, our inflection point. And this is quite extraordinary. I have many partners in the room, and all of you are part of our supply chain. I know how hard you guys are working; I want to thank all of you for how hard you are working. Thank you very much.
Now I'm going to show you why. This is what's going on in our company's business. We're seeing extraordinary growth for Grace Blackwell, for all the reasons I just mentioned. It's driven by two exponentials. We now have visibility.
I think we're probably the first technology company in history to have visibility into half a trillion dollars of cumulative Blackwell, and early ramps of Rubin, through 2026. And as you know, 2025 is not over yet, and 2026 hasn't started. This is how much business is on the books: half a trillion dollars' worth so far. Now, out of that, we've already shipped 6 million Blackwells in the first several quarters, I guess the first three and a half quarters of production. We still have one more quarter to go in 2025, and then we have four more quarters. So over the next five quarters there's $500 billion, half a trillion dollars. That's five times the growth rate of Hopper. That kind of tells you something. This is Hopper's entire life, and this doesn't include China and, um, Asia; this is just the West. Okay? We're excluding China. So Hopper, in its entire life: 4 million GPUs. Blackwell: each one of the Blackwells has two GPUs in one large package, and that's 20 million Blackwell GPUs, into the early parts of Rubin. Incredible
growth. So, I want to thank all of our
supply chain partners. Everybody, I know
how hard you guys are working. I made a
video to celebrate your work. Let's play
it.
The age of AI has begun.
Blackwell is its engine, an engineering
marvel.
In Arizona, it starts as a blank silicon
wafer.
Hundreds of chip processing and
ultraviolet lithography steps build up
each of the 200 billion transistors
layer by layer on a 12-inch wafer. In Indiana, HBM stacks are assembled in parallel: HBM memory dies with 1,024 IOs are fabricated using advanced EUV technology, and through-silicon vias are used in the back end to connect 12 stacks of HBM memory and a base die to produce HBM.
Meanwhile, the wafer is scribed into individual Blackwell dies, tested and sorted, separating the good dies to move forward. The chip-on-wafer-on-substrate process attaches 32 Blackwell dies and 128 HBM stacks on a custom silicon interposer wafer. Metal interconnect traces are etched directly into it, connecting Blackwell GPUs and HBM stacks into each system-in-package unit, locking everything into place. Then the assembly is baked, molded, and cured, creating the GB300 Blackwell Ultra Superchip.
In Texas, robots will work around the
clock to pick and place over 10,000
components onto the Grace Blackwell PCB.
In California, ConnectX-8 SuperNICs for scale-out communications, and BlueField-3 DPUs for offloading and accelerating networking, storage, and security, are carefully assembled into GB300 compute trays.
NVLink is the breakthrough high-speed link that Nvidia invented to connect multiple GPUs and scale up into a massive virtual GPU.
The NVLink switch tray is constructed with NVLink switch chips providing 14.4 terabytes per second of all-to-all bandwidth. NVLink spines form a custom blind-mated backplane with 5,000 copper cables, connecting all 72 Blackwells, 144 GPU dies, into one giant GPU, delivering 130 terabytes per second of all-to-all bandwidth: nearly the global internet's peak traffic.
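A quick consistency check on those numbers, assuming NVLink 5's published figure of roughly 1.8 TB/s of bandwidth per Blackwell GPU:

```python
# 72 Blackwell packages sharing an all-to-all NVLink fabric, at an
# assumed ~1.8 TB/s per GPU (the published NVLink 5 figure), lines up
# with the 130 TB/s quoted for the spine.
gpus = 72
tb_per_gpu = 1.8
print(f"{gpus * tb_per_gpu:.0f} TB/s aggregate")   # ~130 TB/s
```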
Skilled technicians assemble each of these parts into a rack-scale AI supercomputer. In total: 1.2 million components, 2 miles of copper cable, 130 trillion transistors, weighing nearly 2 tons.
From silicon in Arizona and Indiana to
systems in Texas, Blackwell and future
Nvidia AI factory generations will be
built in America,
writing a new chapter in American
history and industry.
America's return to making and
reindustrialization,
reignited by the age of AI.
The age of AI has begun.
Made in America.
Made for the world.
We are manufacturing in America again.
It is incredible. The first thing that President Trump asked me for was: bring manufacturing back. Bring manufacturing back because it's necessary for national security; bring manufacturing back because we want the jobs, and we want that part of the economy. And nine months later, nine months later, we are now manufacturing Blackwell in full production in Arizona.
Blackwell, the GB200 Grace Blackwell NVLink 72: extreme co-design gives us 10x, generationally. It's utterly incredible. Now, the part that's
really incredible is this. This is the first AI supercomputer we made. This is from 2016, when I delivered it to a startup in San Francisco, which turned out to have been OpenAI. This was the computer. And in order to create that computer, we designed one chip. We designed one new chip in order to do co-design. Now, look at all of the chips we have to do. This is what it takes. You're not going to take one chip and make a computer 10 times faster.
That's not going to happen. The way to make computers 10 times faster, so that we can keep increasing performance exponentially and keep driving cost down exponentially, is extreme co-design: working on all of these different chips at the same time. We now have Rubin back home. This is Rubin, the Vera Rubin. Ladies and gentlemen, Rubin. This is our third-generation NVLink 72 rack-scale computer. GB200 was the first generation. All
of our partners around the world, I know
how hard you guys worked. It was
insanely hard. It was insanely hard to
do. Second generation, so much smoother.
And this generation, look at this.
Completely cableless.
Completely cableless. And this is all back in the lab now. This is the next-generation Rubin. While we're shipping GB300s, we're preparing Rubin to be in production, you know, this time next year, maybe slightly earlier. And so, every single year, we are going to come up with the most extreme co-designed system, so that we can keep driving up performance and keep driving down the token-generation cost. Look at this. This is just an incredibly beautiful computer. Now,
so this is amazing. This is 100 petaflops. I know that doesn't mean anything: 100 petaflops. But compared to the DGX-1 I delivered to OpenAI 10 years ago, 9 years ago, it's 100 times the performance, right here, versus that supercomputer. A hundred of those, let's see, a hundred of those would be like 25 of these racks, all replaced by this one thing.
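The rough arithmetic behind that comparison, under assumed figures (a DGX-1-class box treated as about 1 petaflop, and about 4 such boxes per rack; both are assumptions consistent with the 100x claim, not quoted specs):

```python
# Assumed figures only: scale the on-stage 100x claim into box and
# rack counts.
dgx1_petaflops = 1.0       # assumed DGX-1-class throughput
target_petaflops = 100.0
boxes = target_petaflops / dgx1_petaflops
print(f"{boxes:.0f} DGX-1s ~= {boxes / 4:.0f} racks at 4 boxes per rack")
```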
One Vera Rubin. Okay. So this is the compute tray, and this is the Vera Rubin superchip. Okay. And this is the compute tray. Oh, right here. It's incredibly easy to install: just flip these things open, shove it in. Even I could do it. Okay. And this is the Vera Rubin compute tray. If you
decide you want to add a special processor, we've added another processor. It's called a context processor, because the amount of context we give AIs is getting larger and larger. We want it to read a whole bunch of PDFs before it answers a question, read a whole bunch of arXiv papers, watch a whole bunch of videos: go learn all of this before you answer a question for me. All of that context processing can be added. And so you see, on the bottom, eight ConnectX-9 SuperNICs; you have eight CPXs; you have BlueField-4, this new data processor; two Vera CPUs; and four Rubin packages, or eight Rubin GPUs. All of that in this one node, completely cableless.
100% liquid-cooled. And then there's this new processor. I won't talk too much about it today, I don't have enough time, but it is completely revolutionary. And the reason is that your AIs need to have more and more memory. You're interacting with them more; you want it to remember your last conversation, everything it has learned on your behalf: please don't forget it when I come back next time. All of that memory is going to create this thing called KV caching, and that KV cache has to be retrieved. You might have noticed that every time you go into your AIs these days, it takes longer and longer to refresh and retrieve all of the previous conversations. The reason is that we need a revolutionary new processor for this, and it's called BlueField-4.
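A toy sketch of KV caching, which is what makes that retrieval problem grow; this is illustrative Python, not BlueField-4 software. Each conversation accumulates per-layer key/value entries with every token, so returning users mean ever more state to store and fetch quickly.

```python
# Toy KV cache: per conversation, each layer's attention keys/values
# are appended once per token instead of being recomputed, so the
# cache grows with every token of every chat.
from collections import defaultdict

LAYERS, HEAD_DIM = 4, 8
kv_cache = defaultdict(lambda: [[] for _ in range(LAYERS)])

def step(conversation_id: str, token_id: int):
    # Stand-in for one transformer decode step: cache this token's
    # K and V vectors at every layer.
    for layer in kv_cache[conversation_id]:
        k = v = (token_id,) * HEAD_DIM     # fake vectors for illustration
        layer.append((k, v))

for tok in [101, 7592, 2088]:
    step("user-42", tok)
entries = sum(len(layer) for layer in kv_cache["user-42"])
print(f"cached {entries} K/V pairs for one short conversation")
```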
Next is the ConnectX switch, excuse me, the NVLink switch, which is right here. Okay, this is the NVLink switch. This is what makes it possible for us to connect all of the computers together. And this switch has several times the bandwidth of the entire world's peak internet traffic. That spine is going to communicate and carry all of that data simultaneously to all of the GPUs. On top of that,
this is the Spectrum-X switch. This Ethernet switch was designed so that all of the processors can talk to each other at the same time without gumming up the network. "Gum up the network": that's very technical. Okay. So, those are the three, combined. And then this is the Quantum switch; this is for InfiniBand, that one is Ethernet. We don't care what language you would like to use, whatever standard you like: we have great scale-out fabrics for you, whether it's Quantum InfiniBand or Spectrum-X Ethernet. This one uses silicon photonics, with completely co-packaged optics: basically, the laser comes right up to the silicon and connects to our chips. Okay, so this is the Spectrum-X Ethernet. And so
now, let's talk about... thank you. Oh, this is what it looks like. This is a rack. This is two tons, 1.5 million parts. And the spine, this spine right here, carries the entire internet's traffic in one second; that same speed moves across all of these different processors. 100% liquid-cooled. All for, you know, the fastest token-generation rate in the world. Okay, so that's what a rack looks
like. Now, that's one rack. A gigawatt data center would have, you know, call it, let's see: 16 racks, and then 500 of those, so whatever 500 times 16 is; call it 9,000, call it 8,000 of these for a one-gigawatt data center. Okay. And so that's a future AI factory. Now, as you've noticed, Nvidia started out by designing chips, and then we started to design systems, and we designed AI supercomputers. Now we're designing entire AI factories. Every single time we move out and integrate more of the problem to solve, we come up with better solutions. We now build entire AI factories.
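A sanity check on that rack count, assuming roughly 120 kW per NVLink 72 rack (an assumed figure, not quoted on stage): a gigawatt budget lands right in the 8,000-to-9,000-rack range, consistent with the 16-times-500 arithmetic.

```python
# Assumed per-rack power; the result matches the on-stage estimate.
site_watts = 1e9                         # a 1-gigawatt AI factory
rack_watts = 120e3                       # assumed per-rack draw
print(round(site_watts / rack_watts))    # ~8333 racks
print(16 * 500)                          # 8000: the 16 x 500 arithmetic
```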
This AI factory is what we're building for Vera Rubin, and we created a technology that makes it possible for all of our partners to integrate into this factory digitally. Let me show it to you.
The next industrial revolution is here
and with it a new kind of factory.
AI infrastructure is an ecosystem scale
challenge
requiring hundreds of companies to
collaborate.
NVIDIA Omniverse DSX is a blueprint for
building and operating gigascale AI
factories.
For the first time, the building, power,
and cooling are co-designed with
NVIDIA's AI infrastructure stack.
It starts in the Omniverse digital twin.
Jacobs Engineering optimizes compute density and layout to maximize token generation subject to power constraints. They aggregate SimReady OpenUSD assets from Siemens, Schneider Electric, Trane, and Vertiv into PTC's product lifecycle management, then simulate thermals and electricals with CUDA-accelerated tools from ETAP and Cadence.
Once designed, NVIDIA partners like Bechtel and Vertiv deliver prefabricated modules, factory-built, tested, and ready to plug in. This shrinks build time significantly, achieving faster time to revenue.
When the physical AI factory comes
online,
the digital twin acts as an operating
system.
Engineers prompt AI agents from Phaidra and Emerald AI, previously trained in the digital twin, to optimize power consumption and reduce strain on both the AI factory and the grid. In total, for a one-gigawatt AI factory, DSX optimizations can deliver billions of dollars in additional revenue per year across Texas, Georgia, and Nevada.
NVIDIA's partners are bringing DSX to
life. In Virginia, NVIDIA is building an
AI factory research center using DSX to
test and productize Vera Rubin from
infrastructure to software.
With DSX, NVIDIA partners around the
world can build and bring up AI
infrastructure faster than ever.
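The aggregation step described above is, at bottom, OpenUSD composition. For a flavor of what referencing SimReady assets into one factory digital twin looks like, here is a minimal sketch using the open-source pxr USD Python API; the asset paths and prim names are hypothetical placeholders, not real DSX content.

```python
from pxr import Usd, UsdGeom

# Minimal OpenUSD composition sketch: reference vendor-supplied SimReady
# assets into one factory stage. All paths below are hypothetical placeholders.
stage = Usd.Stage.CreateNew("ai_factory.usda")
UsdGeom.Xform.Define(stage, "/Factory")

assets = {
    "/Factory/PowerTrain": "schneider/switchgear.usd",   # placeholder path
    "/Factory/Cooling":    "vertiv/cdu_rack.usd",        # placeholder path
    "/Factory/ComputeRow": "nvidia/nvl72_rack.usd",      # placeholder path
}
for prim_path, asset_path in assets.items():
    prim = stage.DefinePrim(prim_path, "Xform")
    prim.GetReferences().AddReference(asset_path)        # compose, don't copy

stage.GetRootLayer().Save()
```

Composition by reference, rather than copying geometry, is what lets hundreds of vendors contribute assets to one shared twin without stepping on each other.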
Completely in digital. Long before Vera Rubin exists as a real computer, we've been using it as a digital twin computer. Now, long before these AI factories exist, we will use it, we will design it, we'll plan it, we'll optimize it, and we'll operate it as a digital twin. And so, to all of our partners that are working with us, I'm incredibly happy for all of you supporting us. And GE Vernova is here. Schneider, I think Olivier Blum is here. Siemens, incredible partners. Okay, Roland Busch, I think he's watching. Hi, Roland. And so anyways, really, really great partners working with us.
In the beginning we had CUDA, and we have all these different ecosystems of software partners. Now we have Omniverse DSX, and we're building AI factories, and again we have this incredible ecosystem of partners working with us. Let's talk about models.
Open source models: particularly in the last couple of years, several things have happened. One, open source models have become quite capable because of reasoning capabilities. They've become quite capable because of multimodality, and they're incredibly efficient because of distillation. All of these different capabilities have made open source models, for the very first time, incredibly useful for developers. They are now the lifeblood of startups, the lifeblood of startups in different industries, because obviously, as I mentioned before, each one of the industries has its own use cases, its own data, its own flywheels. All of that capability, that domain expertise, needs the ability to be embedded into a model. Open source makes that possible. Researchers need open source. Developers need open source. Companies around the world need open source. Open source models are really, really important.
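To make that concrete, here is a minimal sketch of what "useful for developers" means in practice: pulling an open checkpoint with the Hugging Face transformers library and running it locally, with the option to fine-tune on proprietary domain data later. The model id is a placeholder, not a specific NVIDIA release.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal sketch: any open checkpoint can be pulled, run, and later
# fine-tuned on proprietary domain data. The model id is a placeholder.
model_id = "your-org/your-open-model"          # placeholder checkpoint name
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize our factory telemetry policy:"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```

Because the weights are open, the same few lines work whether the checkpoint runs in a cloud, on-prem, or on a multi-GPU workstation, which is the portability argument being made here.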
The United States has to lead in open source as well. We have amazing proprietary models. We have amazing proprietary models; we also need amazing open source models. Our country depends on it. Our startups depend on it. And so Nvidia is dedicating ourselves to doing that. We lead in open-source contribution: we have 23 models on leaderboards, across all these different domains, from language models to physical AI models (I'm going to talk about robotics models) to biology models. Each one of these models has an enormous team behind it, and that's one of the reasons we built supercomputers for ourselves: to enable all these models to be created. We have the number one speech model, the number one reasoning model, the number one physical AI model. The number of downloads is really, really terrific. We are dedicated to this, and the reason is that science needs it, researchers need it, startups need it, and companies need it.
I'm delighted that AI startups build on Nvidia. They do so for several reasons. One, of course, our ecosystem is rich. Our tools work great, and all of our tools work on all of our GPUs. Our GPUs are everywhere: literally in every single cloud, available on-prem. You could build it yourself; you could build up an enthusiast gaming PC with multiple GPUs in it, download our software stack, and it just works. We have the benefit of rich developers who are making this ecosystem richer and richer. So I'm really pleased with all of the startups we're working with; I'm thankful for that. It is also the case that many of these startups are now starting to create even more ways to enjoy our GPUs: CoreWeave, Nscale, Nebius, Lambda, Crusoe, all of these companies are building new GPU clouds to serve the startups, and I really appreciate that. This is all possible because Nvidia is everywhere.
We integrate our libraries, all of the CUDA-X libraries I talked to you about, all the open-source AI models I talked about, into the clouds. We integrated into AWS, for example; really love working with Matt. We integrated into Google Cloud, for example; really love working with Thomas. Each one of these clouds integrates NVIDIA GPUs and our computing, our libraries, as well as our models. Love working with Satya over at Microsoft Azure, love working with Clay at Oracle. Each one of these clouds integrates the NVIDIA stack. As a result, wherever you go, whichever cloud you use, it works incredibly well. We also integrate Nvidia libraries into the world's SaaS, so that each one of these SaaS platforms will eventually become agentic SaaS. I love Bill McDermott's vision for ServiceNow. There. Yeah, there you go.
I think that might have been Bill.
Hi, Bill. And so, ServiceNow: what is it, 85% of the world's enterprise workflows? SAP: 80% of the world's commerce. Christian Klein and I are working together to integrate NVIDIA libraries, CUDA-X, Nemo, and Nemotron, all of our AI systems, into SAP. Working with Sassine over at Synopsys, accelerating the world's CAE, CAD, and EDA tools so that they can be faster and can scale, and helping them create AI agents. One of these days I would love to hire AI agents as designers to work with our ASIC designers, essentially the Cursor of Synopsys, if you will. We're working with Anirudh. Is Anirudh here? I saw him earlier today; he was part of the pregame show. Cadence is doing incredible work accelerating their stack and creating AI agents, so that we can have Cadence AI designers and system designers working with us. Today we're announcing a new one.
AI will supercharge productivity. AI will transform just about every industry. But AI will also supercharge cybersecurity challenges, the bad AIs. And so we need an incredible defender, and I can't imagine a better defender than CrowdStrike. George, George is here. He was here, yep, I saw him earlier. We are partnering with CrowdStrike to make cybersecurity move at the speed of light, to create a system that has cybersecurity AI agents in the cloud, but also incredibly good AI agents on-prem or at the edge. This way, whenever there's a threat, you are moments away from detecting it. We need speed, and we need fast agentic AI, super smart AIs.
I have a second announcement. This is the single fastest-growing enterprise company in the world, probably the single most important enterprise stack in the world today: Palantir Ontology. Anybody from Palantir here? I was just talking to Alex earlier. This is Palantir Ontology: they take information, they take data, they take human judgment, and they turn it into business insight. We work with Palantir to accelerate everything Palantir does, so that we can do data processing at a much, much larger scale and with more speed, whether it's structured data of the past, human-recorded data, or unstructured data, and process that data for our government, for national security, and for enterprises around the world, process it at the speed of light and find insight from it. This is what it's going to look like in the future: Palantir is going to integrate Nvidia so that we can process at the speed of light and at extraordinary scale. Okay, Nvidia and Palantir.
Let's talk about physical AI. Physical AI requires three computers. Just as it takes two computers for a language model, one to train and evaluate it and one to inference it (that's the large GB200 that you see), for physical AI you need three. You need the computer to train it: this is GB, the Grace Blackwell, NVLink 72. We need a computer that does all of the simulations I showed you earlier with Omniverse DSX: it is basically a digital twin for the robot to learn how to be a good robot, and for the factory to essentially be a digital twin. That second computer is the Omniverse computer. It has to be incredibly good at generative AI, and it has to be good at computer graphics, sensor simulation, ray tracing, and signal processing. Once we train the model, we simulate that AI inside a digital twin, and that digital twin could be a digital twin of a factory as well as a whole bunch of digital twins of robots.
Then you need to operate that robot, and this is the robotic computer. This one goes into a self-driving car; half of it could go into a robot. Or you could have robots that are quite agile and quite fast in operation, and it might take two of these computers. And so this is the Jetson Thor robotics computer. These three computers all run CUDA, and that makes it possible for us to advance physical AI: AI that understands the physical world, the laws of physics, causality, permanence.
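To make the three-computer split concrete, here is a deliberately generic sketch of the train, simulate, deploy loop. Every class and method name is a hypothetical stand-in (a real pipeline would use frameworks like Isaac Sim); only the structure, one computer per stage, reflects what's described above.

```python
import random

# Generic sketch of the three-computer physical-AI pipeline. All names are
# hypothetical stand-ins, not NVIDIA APIs. Stage 1 (training) maps to the
# Grace Blackwell system, stage 2 (simulation) to the Omniverse computer,
# stage 3 (deployment) to the Jetson Thor robotics computer.
class DigitalTwinSim:                           # stand-in for an Isaac-style sim
    def rollout(self, policy, episodes):
        # Collect (observation, action) pairs from simulated episodes.
        return [(obs, policy.act(obs))
                for obs in (random.random() for _ in range(episodes))]

class Policy:
    def act(self, obs): return 1.0 if obs > 0.5 else -1.0  # toy controller
    def update(self, trajectories): pass        # placeholder training step
    def export(self, path): print("compiled for robot computer ->", path)

policy, sim = Policy(), DigitalTwinSim()
for epoch in range(3):                          # alternate training and simulation
    policy.update(sim.rollout(policy, episodes=100))
policy.export("robot_policy.plan")              # ship to the robot computer
```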
We have incredible partners working with us to create the physical AI of factories; we're using it ourselves to create our factory in Texas. Now, once we create the robotic factory, we have a bunch of robots inside it, and those robots also need physical AI: each applies physical AI and works inside the digital twin. Let's take a look at it.
America is re-industrializing,
reshoring manufacturing across every
industry. In Houston, Texas, Foxconn is building a state-of-the-art robotic facility for manufacturing NVIDIA AI infrastructure systems. With labor shortages and skills gaps, digitalization, robotics, and physical AI are more important than ever. The factory is born digital in Omniverse. Foxconn engineers assemble their virtual factory in a Siemens digital twin solution developed on Omniverse technologies. Every system, mechanical, electrical, plumbing, is validated before construction.
Siemens Plant Simulation runs design-space exploration optimizations to identify the ideal layout. When a bottleneck appears, engineers update the layout, with changes managed in Siemens Teamcenter. In Isaac Sim, the same digital twin is used to train and simulate robot AIs. In the assembly area, FANUC manipulators build GB300 tray modules, bimanual manipulators from FII and Skild AI install bus bars into the trays, and AMRs shuttle the trays to the test pods.
Then Foxconn uses Omniverse for large-scale sensor simulation, where robot AIs learn to work as a fleet. In Omniverse, vision AI agents built on NVIDIA Metropolis and Cosmos watch the fleets of robots and workers from above to monitor operations and alert Foxconn engineers to anomalies, safety violations, or even quality issues. And to train new employees, agents power interactive AI coaches for easy worker onboarding. The age of US re-industrialization is here, with people and robots working together.
That's the future of manufacturing, the future of factories. I want to thank our partner Foxconn; Young Liu, the CEO, is here. All of these ecosystem partners make it possible for us to create the future of robotic factories. The factory is essentially a robot that's orchestrating robots to build things that are robotic. You know, the amount of software necessary to do this is so intense that unless you can plan it, design it, and operate it inside a digital twin, the hope of getting it to work is nearly impossible. I'm so happy to see also that Caterpillar, my friend Joe Creed and his hundred-year-old company, is also incorporating digital twins into the way they manufacture. These factories will have future robotic systems, and one of the most advanced is Figure. Brett Adcock is here today. He founded the company three and a half years ago.
They're worth almost $40 billion. Today we're working together on training the AI, training the robot, simulating the robot, and of course the robotic computer that goes into Figure. Really amazing. I had the benefit of seeing it; it's really quite extraordinary. It is very likely that humanoid robots, and my friend Elon is also working on this, are going to be one of the largest new consumer electronics markets, and surely one of the largest industrial equipment markets. Peggy Johnson and the folks at Agility are working with us on robots for warehouse automation. The folks at Johnson & Johnson are working with us, again training the robot, simulating it in digital twins, and also operating the robot. These Johnson & Johnson surgical robots are even going to perform surgery, completely noninvasive surgery, at a precision the world has never seen before.
And of course, the cutest robot ever, the cutest robot ever: the Disney robot. And this is something really close to our heart. We're working with Disney Research on an entirely new framework and simulation platform based on a revolutionary technology called Newton. And that Newton simulator makes it possible for the robot to learn how to be a good robot inside a physically aware, physically based environment. Let's take a look at it.
Excuse me. Fluffy. Blue, ladies and gentlemen: Disney Blue. Tell me that's not adorable. Is he not adorable?
We all want one. We all want one. Now, remember: everything you were just seeing, that is not animation. It's not a movie. It's a simulation, and that simulation is in Omniverse, the digital twin. So these are digital twins of factories, digital twins of warehouses, digital twins of surgical rooms, digital twins where Blue could learn how to manipulate and navigate and interact with the world, all completely done in real time. This is going to be the largest consumer electronics product line in the world. Some of them are already working incredibly well now. This is the future of humanoid robotics, and of course, Blue. Okay.
Now, humanoid robots are still in development. But meanwhile, there's one robot that is clearly at an inflection point, and it is basically here: the robot on wheels. This is the robo-taxi. A robo-taxi is essentially an AI chauffeur. Now, one of the things we're doing today: we're announcing NVIDIA DRIVE Hyperion. This is a big deal. We created this architecture so that every car company in the world could create vehicles, commercial, passenger, or dedicated robo-taxi, that are robo-taxi ready. The sensor suite, with surround cameras, radars, and lidar, makes it possible to achieve the highest level of surround-cocoon sensor perception and the redundancy necessary for the highest level of safety.
DRIVE Hyperion is now designed into Lucid, Mercedes-Benz (my friend Ola Källenius), and the folks at Stellantis, and there are many other cars coming. And once you have a basic standard platform, then developers of AV systems, and there are so many talented ones: Wayve, Waabi, Aurora, Momenta, Nuro, WeRide, so many of them, can take their AV system and run it on the standard chassis. Basically, the standard chassis has now become a computing platform on wheels. And because it's standard and the sensor suite is comprehensive, all of them could deploy their AI to it.
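The "computing platform on wheels" point is essentially an interface contract: one standardized sensor and compute suite that any AV stack can target. Here is a hedged sketch of that idea; every name is hypothetical, and this is not the DRIVE Hyperion API.

```python
from dataclasses import dataclass
from typing import Protocol

# Hedged sketch of a standard-chassis contract. All names are hypothetical,
# not DRIVE Hyperion's API; the point is that one sensor/compute interface
# lets many vendors' AV stacks run on the same vehicle unchanged.
@dataclass
class SensorFrame:              # one synchronized tick of the sensor suite
    cameras: list               # surround camera images
    radars: list                # radar returns
    lidar: list                 # lidar point cloud

class AVStack(Protocol):        # the contract every AV developer implements
    def drive(self, frame: SensorFrame) -> dict: ...

def vehicle_loop(stack: AVStack, frames):
    for frame in frames:        # same chassis, any certified vendor's stack
        controls = stack.drive(frame)   # e.g. {"steer": ..., "throttle": ...}
        print(controls)         # stand-in for the actuation layer
```

Because the frame format and compute target are fixed, a Wayve-style or Aurora-style stack could, in principle, be swapped in without changing the vehicle.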
Let's take a quick look.
Okay, that's beautiful San Francisco. And as you can see, the robo-taxi inflection point is about to get here. And in the future, with a trillion miles a year driven, a hundred million cars made each year, and some 50 million taxis around the world, all of it is going to be augmented by a whole bunch of robo-taxis. So it's going to be a very large market to connect and deploy around the world.
Today, we're announcing a partnership with Uber. Dara Khosrowshahi and I are working together to connect these NVIDIA DRIVE Hyperion cars into a global network. In the future you'll be able to hail one of these cars, and the ecosystem is going to be incredibly rich; we'll have Hyperion robo-taxi cars all over the world. This is going to be a new computing platform for us, and I'm expecting it to be quite successful.
Okay.
So this is what we talked about today. We talked about a large number of things. Remember, at the core of this are two platform transitions. The first: from general-purpose computing to accelerated computing. NVIDIA CUDA and the suite of libraries called CUDA-X have enabled us to address practically every industry, and we're at the inflection point; it is now growing as a virtuous cycle would suggest. The second inflection point is now upon us, the second platform transition: from classical, handwritten software to artificial intelligence. Two platform transitions happening at the same time, which is the reason we're feeling such incredible growth. Quantum computing, we spoke about that. We spoke about open models. We spoke about enterprise, with CrowdStrike and Palantir accelerating their platforms. We spoke about robotics, potentially one of the largest new consumer electronics and industrial manufacturing sectors.
And of course, we spoke about 6G. Nvidia has a new platform for 6G; we call it ARC. We have a new platform for robotic cars; we call that Hyperion. We have new platforms even for factories, two types of factories: the AI factory, we call that DSX, and factories with AI, we call that Mega. And so now we're also manufacturing in America. Ladies and gentlemen, thank you for joining us today, and thank you for allowing me to bring... Thank you for allowing us to bring GTC to Washington, DC. We're going to do it hopefully every year. And thank you all for your service and for making America great again. Thank you.
We start with a handshake, solid and true. One step at a time, we're breaking through. Brick by brick, we're stacking dreams high. Side by side, we'll touch the sky. Handshakes and high hopes, we're making our way. Shoulder to shoulder, come what may. Shared vision brighter than the sun. Friendship and business rolling as one. Plans on paper but hearts in sync. Building together faster than you think. Laughter's the glue in the grind we share. We've got the spark, we're going somewhere. Handshakes and high hopes, we're making our way. Shoulder to shoulder, come what may. Shared vision brighter than the sun. Friendship and business.