黃仁勳GTC DC演講!NVIDIA 宣布美國製 AI 超級工廠計畫|Jensen Huang: NVIDIA Announces U.S.-Made AI Superfactory
By New SciTech 新科技
Summary
## Key takeaways

- **Accelerated Computing is the New Standard**: Moore's Law has ended, and Dennard scaling stopped a decade ago. Nvidia's accelerated computing model, powered by GPUs and CUDA, is now essential for solving problems beyond the capabilities of traditional CPUs, marking a fundamental shift in computing. [01:34], [03:05]
- **AI Factories: The Future of Production**: AI is not just a chatbot; it's a new industry creating 'AI factories' that produce valuable 'tokens' cost-effectively. These factories are unlike traditional data centers, designed specifically for the immense computational demands of AI. [25:50], [32:53]
- **Quantum-GPU Hybrid Computing Arrives**: Nvidia is connecting quantum processors directly to GPU supercomputers via NVQ Link and CUDA-Q. This hybrid approach is crucial for quantum error correction, AI calibration, and enabling complex simulations that were previously impossible. [16:00], [18:03]
- **U.S. Manufacturing Reborn with AI Superfactories**: Nvidia is establishing AI supercomputing and manufacturing facilities in Arizona and Texas, signifying a major comeback for U.S.-based manufacturing. This initiative redefines the tech industry's future by integrating AI into the production process. [29:00], [44:45]
- **6G and AI: Revolutionizing Telecommunications**: Nvidia's partnership with Nokia to create the Nvidia Ark platform leverages AI for 6G, aiming to restore U.S. leadership in global telecommunications. This technology will enable more spectral efficiency and create a new edge cloud computing layer on wireless networks. [08:49], [10:00]
- **Physical AI: Robots Learn to Interact with the World**: Physical AI requires three specialized computers for training, simulation (Omniverse digital twin), and operation. This advancement is driving the development of humanoid robots and autonomous vehicles, with Nvidia's platforms enabling AI to understand and interact with the physical world. [19:05], [21:24]
Topics Covered
- Moore's Law is dead; accelerated computing is the future.
- AI transforms the economy by doing work, not just being tools.
- AI factories: a new computational infrastructure for token generation.
- Extreme co-design compounds exponentials to drive AI's virtuous cycle.
- Physical AI and digital twins are revolutionizing factories and robotics.
Full Transcript
welcome to the stage Nvidia founder and CEO Jensen Huang
Washington DC
Washington DC welcome to GTC
it's hard not to be sentimental and proud of America
I gotta tell you that was that video amazing
thank you
Nvidia's creative team does an amazing job
welcome to GTC we have a lot to cover with you today
um GTC is where we talk about industry science
computing
the present and the future
so I've got a lot to cover with you today
but before I start I want to thank all of our partners
who helped sponsor this great event
you'll see all of them
so they're here to meet with you and uh
it's really great we couldn't do what we do
without all of our ecosystem partners
this is the Super Bowl of AI
people say and therefore
every Super Bowl should have an amazing pre game show
what do you guys think about the pre game show
and our all all star
all star athletes and all star cast look at these guys
somehow I turned out the buffest
what do you guys think
I don't know if I had something to do with that
Nvidia invented a new computing model
for the first time in 60 years
as you saw in the video
a new computing model rarely comes about
it takes an enormous amount of time
and a set of conditions we observed
we invented this computing model
because we wanted to solve problems that general
purpose computers normal computers could not
we also observed that someday
the number of transistors will continue to grow
but the performance
and the power of transistors will slow down
that Moore's Law will not continue
that it will be limited by the laws of physics
and that moment has now arrived
it's called Dennard scaling
Dennard scaling stopped nearly a decade ago
and in fact
the transistor performance and its power associated
has slowed tremendously
and yet the number of transistors continued to grow
we made this observation a long time ago
and for 30 years
we've been advancing this form of computing
we call accelerated computing
we invented the GPU we invented
the programming model called CUDA
and we observed that if we could add a processor
that takes advantage of more and more
and more transistors apply parallel computing
add that to a sequential processing CPU
that we could extend the capabilities of computing
well beyond well beyond
and that moment has really come
we have now seen that inflection point
accelerated computing its moment has now arrived
however accelerated computing
is a fundamentally different programming model
you can't just take CPU software
software written by hand
executing sequentially
and put it onto a GPU and have it run properly
in fact if you just did that
it actually runs slower
and so you have to reinvent new algorithms
you have to create new libraries
you have to in fact
rewrite the application
which is the reason why it's taken so long
it's taken us nearly 30 years to get here
but we did it one domain at a time
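The point about rewriting algorithms can be sketched with a toy example (plain Python, not CUDA; the pixel "kernel" is made up for illustration): a computation with no dependencies between elements can be restructured from a sequential loop into a map over independent work items, which is the kind of redesign porting to a GPU actually requires.

```python
# Toy sketch: the same computation expressed two ways.
# Sequential CPU-style code cannot simply be "moved" onto a parallel
# processor; it has to be restructured so independent pieces of work
# can run side by side.
from concurrent.futures import ThreadPoolExecutor

def brighten_sequential(pixels, gain):
    # CPU-style: one pixel after another, in order
    out = []
    for p in pixels:
        out.append(min(255, int(p * gain)))
    return out

def brighten_parallel(pixels, gain, workers=4):
    # GPU-style restructuring: every pixel is independent,
    # so one "kernel" is mapped over all of them at once
    def kernel(p):
        return min(255, int(p * gain))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(kernel, pixels))

pixels = [10, 100, 200, 250]
assert brighten_sequential(pixels, 1.5) == brighten_parallel(pixels, 1.5) == [15, 150, 255, 255]
```

The libraries the keynote lists next are the real version of this restructuring, done domain by domain.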
this is the treasure of our company
most people talk about the GPU
the GPU is important
but without a programming model that sits on top of it
and without dedication to that programming model
keeping it compatible over generations
we're now CUDA 13
coming up CUDA 14
hundreds of millions of GPUs
running in every single computer
perfectly compatible if we didn't do that
then developers wouldn't target this computing platform
if we didn't create these libraries
then developers wouldn't know how to use the algorithm
and use the architecture to its fullest
one application after another
I mean
this is really the treasure of our company
cuLitho computational lithography
it took us nearly seven years to get here with cuLitho
and now TSMC uses it Samsung uses it
ASML uses it this is an incredible library
for computational lithography
the first step of making a chip
sparse solvers for CAE applications
cuOpt for numerical optimization
has broken just about every single record
the traveling salesman problem
how to connect millions of products
with millions of customers in the supply chain
Warp, a Python framework for CUDA simulation
cuDF, a dataframe approach
basically accelerating SQL and dataframe
databases
this library is the one that started AI altogether
cuDNN
and the library on top of it called Megatron Core
made it possible for us to simulate and train
extremely large language models
the list goes on uh
MONAI really
really important the number one medical imaging
AI framework in the world
by the way
we're not gonna talk a lot about healthcare today
but be sure to see Kimberly's keynote
she's gonna talk a lot about the work that we do
in healthcare and the list goes on
genomics processing
Aerial, pay attention
we're gonna do something really important here today
cuQuantum
for quantum computing this is just representative
of 350 different libraries in our company
and each one of these libraries
redesigned the algorithms
necessary for accelerated computing
each one of these libraries
made it possible for all of the ecosystem partners
to take advantage of accelerated computing
and each one of these libraries
opened new markets for us
let's take a look at what CUDA X can do
is that amazing
every
everything you saw was a simulation
there was no art no animation
this is the beauty of mathematics
this is deep computer science
deep math and just incredible how beautiful it is
every industry was covered
from healthcare and life sciences to manufacturing
robotics autonomous vehicles
computer graphics even video games
that first shot that you saw
was the first application Nvidia ever ran
and that's where we started in 1993
and we kept believing in what we were trying to do
and it took it's hard to imagine that
you could see
that first Virtua Fighter scene come alive
and that same company believed
that we would be here today
it's just a really really incredible journey
I want to thank all the Nvidia employees
for everything that you've done it's really incredible
we have a lot of industries to cover today
I'm gonna cover AI
6G
quantum computing models
enterprise computing robotics
and factories let's get started
we have a lot to cover
a lot of big announcements to make
a lot of new partners that would very much surprise you
telecommunications is the backbone
the lifeblood of our economy
our industries our national security and yet
ever since the beginning of wireless
where we defined the technology
we defined the global standards
we exported American technology all around the world
so that the world can build
on top of American technology and standards
it has been a long time since this happened
wireless technology around the world
largely today deployed on foreign technologies
our fundamental communication fabric
built on foreign technologies
that has to stop and we have an opportunity to do that
especially during this fundamental platform shift
as you know
computer technology is at the foundation of literally
every single industry
it is the single most important instrument of science
it's the single most important instrument of industry
and I just said we're going through a platform shift
that platform shift should be the once in a
lifetime opportunity for us to get back into the game
for us to start innovating with American technology
today today we're announcing that we're gonna do that
we have a big partnership with Nokia
Nokia is the second largest
telecommunications maker in the world
it's a three trillion dollar industry
infrastructure is hundreds of billions of dollars
there are millions of base stations around the world
if we could partner
we could build on top of this incredible new technology
fundamentally based on accelerated computing and AI
and for United States for America
to be at the center of the next revolution in 6G
so today
we're announcing that Nvidia has a new product line
it's called the Nvidia Ark
the Aerial Radio Access Network computer
the Aerial RAN computer
Ark is built from three fundamental
new technologies the Grace CPU
the Blackwell GPU
and our ConnectX Mellanox ConnectX networking
designed for this application
and all of that
makes it possible for us to run this library
this CUDA X library that I mentioned earlier
called Aerial
Aerial is essentially a wireless communication system
running on top of CUDA X we're gonna
we're gonna create for the first time
a software defined programmable computer
that's able to communicate wirelessly
and do AI processing at the same time
this is completely revolutionary
we call it Nvidia Ark and Nokia is gonna work with us
to integrate our technology
rewrite their stack
this is a company with 7,000 fundamental
essential 5G patents
hard to imagine
any greater leader in telecommunications
so we're gonna partner with Nokia
they're gonna make Nvidia Ark their future base station
Nvidia Ark is also compatible with AirScale
the current Nokia base stations
so what that means is
we're going to take this new technology
and we'll be able to upgrade
millions of base stations around the world
with 6G and AI now
6G and AI is really quite fundamental
in the sense that for the first time
we'll be able to use AI technology
AI for RAN
to make radio communications more spectrally efficient
using artificial intelligence
reinforcement learning
adjusting the beamforming in real time
in context
depending on the surroundings and the traffic
and the mobility the weather
all of that could be taken into account
so that we could improve spectral efficiency
wireless networks consume about one and a half
to 2% of the world's power
so improving spectral efficiency
increases the amount of data
we can put through wireless networks
without increasing the amount of energy necessary
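As a conceptual sketch only (not Nvidia's Aerial stack; the beam names and throughput numbers are hypothetical), the kind of decision an AI-for-RAN agent makes can be reduced to choosing the beamforming configuration with the best observed throughput:

```python
# Toy sketch of AI-driven beam selection. A real system would use
# reinforcement learning and react to traffic, mobility, surroundings,
# and weather in real time; here we just pick greedily from history.

def pick_beam(history):
    # greedy choice over average observed throughput per beam
    avg = {beam: sum(v) / len(v) for beam, v in history.items()}
    return max(avg, key=avg.get)

# hypothetical throughput observations (Mbps) per beam setting
history = {
    "beam_north": [10.0, 12.0],
    "beam_south": [30.0, 28.0],
    "beam_east": [15.0, 14.0],
}
assert pick_beam(history) == "beam_south"
```

The spectral-efficiency gain comes from making this choice continuously, in context, rather than with a fixed configuration.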
the other thing that we could do
beyond AI for RAN is AI on RAN
this is a brand new opportunity
remember the internet enabled communications
but amazingly smart companies
AWS
built a cloud computing system on top of the internet
we are now going to do the same thing
on top of the wireless telecommunications network
this new cloud will be an edge
industrial robotics cloud
this is where AI on RAN comes in
the first is AI for RAN
to improve radio spectrum efficiency
the second is AI on RAN essentially
cloud computing for wireless telecommunications
cloud computing
will be able to go right out to the edge
where data centers are not
because we have base stations all over the world
because we have base stations all over the world
this announcement is really exciting
Justin Hotard the CEO
I think he's somewhere in the room
thank you for your partnership
thank you for helping United States
bring telecommunication technology back to America
this is really a fantastic
fantastic partnership thank you very much
that's the best way to celebrate Nokia
let's talk about quantum computing
1981
the quantum physicist Richard Feynman
imagined a new type of computer
that could simulate nature directly
because nature is quantum
he called it a quantum computer
forty years later
the industry has made a fundamental breakthrough
forty years later just last year
a fundamental breakthrough
it is now possible to make one logical qubit
one logical qubit that's coherent
stable and error corrected
now that one logical qubit consists of
sometimes tens
sometimes hundreds of physical qubits
all working together as you know qubits
these particles are incredibly fragile
they could be unstable very easily
any observation any sampling of it
any environmental condition
causes it to become decoherent
and so
it takes extraordinarily well controlled environments
and now also
a lot of different physical qubits for them to work
together
and for us to do error correction on these
what are called auxiliary or syndrome qubits
for us to error correct them
and infer what that logical qubit state is
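The syndrome idea can be illustrated with a classical analogue, the 3-bit repetition code. This is a toy sketch only: real quantum error correction must measure stabilizers without collapsing superpositions, but the inference step, locating an error from parity checks without reading the data itself, is the same in spirit.

```python
# Classical analogue of syndrome-based error correction:
# a 3-bit repetition code protecting one logical bit.

def encode(bit):
    # one logical bit stored redundantly in three physical bits
    return [bit, bit, bit]

def syndrome(code):
    # parity checks between neighbouring bits, the classical
    # analogue of measuring auxiliary/syndrome qubits: they tell
    # us where an error is without reading the data bits directly
    return (code[0] ^ code[1], code[1] ^ code[2])

def correct(code):
    # each nonzero syndrome pattern points at exactly one flipped bit
    flip = {(1, 0): 0, (1, 1): 1, (0, 1): 2}.get(syndrome(code))
    if flip is not None:
        code[flip] ^= 1
    return code

word = encode(1)
word[1] ^= 1                       # inject a single bit-flip error
assert correct(word) == [1, 1, 1]  # error located and repaired
```

Scaling this up to tens or hundreds of physical qubits per logical qubit, thousands of times per second, is the computational load NVQ Link is built to carry.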
there are all kinds of different types
of quantum computers superconducting photonic
trapped ion stable atom
all kinds of different ways
to create a quantum computer
well we now realize that it's essential
for us to connect the quantum computer
directly to a GPU supercomputer
so that we could do the error correction
so that we could do the artificial intelligence
calibration and control of the quantum computer
and so that we could do simulations collectively
working together
the right algorithms running on the GPUs
the right algorithms running on the QPUs
and the two processors
the two computers working side by side
this is the future of quantum computing
let's take a look
there are many ways to build a quantum computer
each uses qubits quantum bits
as its core building block
but no matter the method all qubits
whether superconducting qubits
trapped ions neutral atoms
or photons share the same challenge
they're fragile and extremely sensitive to noise
today's qubits remain stable
for only a few hundred operations
but solving meaningful problems
requires trillions of operations
the answer is quantum error correction
measuring disturbs a qubit
which destroys the information inside it
the trick is to add extra qubits entangled
so that measuring them gives us enough information
to calculate where errors occurred
without damaging the qubits we care about
it's brilliant
but needs beyond state of the art conventional compute
that's why we built NVQ Link
a new interconnect architecture
that directly connects quantum processors
with Nvidia GPUs
quantum error correction requires
reading out information from qubits
calculating where errors occur
and sending data back to correct them
NVQ Link is capable of moving terabytes of data
to and from quantum hardware
the thousands of times every second
needed for quantum error correction
at its heart is CUDA-Q
our open platform for quantum GPU computing
using NVQ Link and CUDA-Q
researchers will be able to do more than just
error correction
they will also be able to orchestrate quantum devices
and AI supercomputers to run quantum GPU applications
quantum computing won't replace classical systems
they will work together fused into one accelerated
quantum supercomputing platform
wow this is a really long stage
you know CEOs
we don't just sit at our desk typing
this is a physical job so
so today we're announcing the NVQ Link
NVQ Link and it's made possible by two things
of course this interconnect
that does quantum computer control and calibration
quantum error correction
as well as connects two computers
the QPU and our GPU supercomputers
to do hybrid simulations
it is also completely scalable
it doesn't just do
error correction for today's few qubits
it does error correction for tomorrow
where we're gonna essentially
scale up these quantum computers
from the hundreds of qubits we have today
to tens of thousands of qubits
hundreds of thousands of qubits in the future
so we now have an architecture that can do control
cosimulation quantum error correction
and scale into that future
the industry support has been incredible
between the invention of CUDA-Q
remember
CUDA was designed for GPU CPU accelerated computing
basically using both processors
using the right tool to do the right job
now CUDA-Q has been extended beyond CUDA
so that we could support QPUs
and have the two processors
the QPU and the GPU work
and have computation move back and forth
within just a few microseconds
the essential latency
to be able to cooperate with the quantum computer
so now CUDA-Q is such an incredible breakthrough
adopted by so many different developers
we are announcing today
17 different quantum computer industry companies
supporting the NVQ Link and
and I'm so excited about this
eight different DOE labs Berkeley Brookhaven
Fermilab in Chicago Lincoln Laboratory
Los Alamos Oak Ridge
Pacific Northwest Sandia National Lab
just about every single DOE lab has engaged us
working with our ecosystem
of quantum computer companies
and these quantum controllers
so that we could integrate quantum computing
into the future of science
well
I have one more additional announcement to make today
we're announcing that the department of energy
is partnering with Nvidia
to build seven new AI supercomputers
to advance our nation's science
I have to have a shout out for Secretary Chris Wright
he has brought so much energy to the DOE
a surge of energy a surge of passion
to make sure that America leads science again
as I mentioned
computing is the fundamental instrument of science
and we are going through several platform shifts
on the one hand we're going to accelerate computing
that's why every future supercomputer will be a GPU-based
supercomputer
we're going to AI so that AI and principled solvers
principled simulation
principled physics simulation is not gonna go away
but it could be augmented and scaled
using surrogate models AI models working together
we also know that principled solvers
classical computing
can be enhanced to understand the state of nature
using quantum computing we also know that in the future
we have so much signal
so much data we have to sample from the world
remote sensing is more important than ever
and these laboratories
are impossible to experiment at the scale and speed
we need to unless they're robotic factories
robotic laboratories
so all of these different technologies
are coming into science at exactly the same time
Secretary Wright understands this
and he wants the DOE to take this opportunity
to supercharge themselves
and make sure that the United States
stays at the forefront of science
I want to thank all of you for that thank you
let's talk about AI
what is AI most people would say that AI is a chatbot
and rightfully so
there's no question that
ChatGPT
is at the forefront of what people would consider AI
however just as you see right now
these scientific supercomputers
are not gonna run chatbots
they're gonna do basic science
scientific AI the world of AI is much
much more than a chatbot of course
the chatbot is extremely important
and AI is fundamentally critical
deep computer science incredible computing
great breakthroughs are still essential for AI
but beyond that AI is a lot more
AI is in fact
I'm gonna describe AI in a couple different ways
the first way to think about AI
is that
it has completely reinvented the computing stack
the way we used to do software was hand coding
hand coding software running on CPUs
today AI is machine learning
data intensive programming
if you will
trained and learned by AI that runs on a GPU
in order to make that happen
the entire computing stack has changed
notice you don't see Windows up here
you don't see CPU up here
you see a whole different
a whole fundamentally different stack
everything from the need for energy
and this is another area where our administration
President Trump
deserves enormous credit his pro energy initiative
his recognition that this industry needs energy to grow
it needs energy to advance
and we need energy to win
his recognition of that
and putting the weight of the nation behind
pro energy growth completely changed the game
if this didn't happen
we could have been in a bad situation
and I want to thank President Trump for that
on top of energy are these GPUs
and these GPUs are connected into
built into infrastructure that I'll show you later
on top of this infrastructure
which consists of giant data centers
easily many times the size of this room
an enormous amount of energy
which then transforms the energy through this new machine
called GPU supercomputers to generate numbers
these numbers are called tokens
the language if you will
the computational unit
the vocabulary of artificial intelligence
you can tokenize almost anything
you can tokenize of course
English words you can tokenize images
that's the reason why you're able to recognize images
or generate images
tokenize video tokenize 3D structures
you can tokenize chemicals and proteins and genes
you can tokenize cells
tokenize almost anything with structure
anything with information content
once you could tokenize it
AI can learn that language and the meaning of it
once it learns the meaning of that language
it can translate it can respond just like you respond
just like you interact with ChatGPT
and it could generate just as ChatGPT can generate
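The tokenization idea is simple to sketch. This is a toy vocabulary mapping, far cruder than the byte-pair-encoding tokenizers production models actually use, and the protein and robot-action examples are made up, but it shows why the same machinery applies to any structured modality:

```python
# Toy tokenizer: map any symbol stream to integer tokens a model
# can learn from, whether the symbols are words, amino acids,
# or robot actions.

def build_vocab(symbols):
    # assign every distinct symbol an integer id
    return {s: i for i, s in enumerate(sorted(set(symbols)))}

def tokenize(symbols, vocab):
    return [vocab[s] for s in symbols]

protein = list("MKTAYIAK")              # amino-acid letters
actions = ["grasp", "lift", "place"]    # robot motions

assert tokenize(protein, build_vocab(protein)) == [3, 2, 4, 0, 5, 1, 0, 2]
assert tokenize(actions, build_vocab(actions)) == [0, 1, 2]
```

Once everything is the same kind of token stream, one model architecture can learn the "language" of proteins, motion, or 3D structure the same way it learns English.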
so
all of the fundamental things that you see ChatGPT do
all you have to do is imagine
what if it was a protein what if it was a chemical
what if it was a 3D structure
like a factory what if it was a robot
and the token was understanding behavior
and tokenizing motion and action
all of those concepts are basically the same
which is the reason why
AI is making such extraordinary progress
and on top of these models are applications
Transformers is not a universal model
it's an incredibly effective model
but there's no one universal model
it's just that AI has universal impact
there are so many different types of models
in the last several years
we enjoyed the invention
and went through the innovation breakthroughs
of multimodality
there's so so many different types of models
there's CNN models
convolutional neural network models
there's state space models
there's graph neural network models
multimodal models of course
all the different tokenizations
and token methods that I just described
you could have models that are spatial
whose understanding is optimized for spatial awareness
you could have models that are optimized for
long sequence recognizing subtle information
over a long period of time
there are so many different types of models
on top of these models architectures
on top of these model architectures are applications
the software of the past and this is a
a profound understanding
a profound observation of artificial intelligence
that the software industry of the past
was about creating tools Excel is a tool
Word is a tool a web browser is a tool
the reason why I know these are tools
is because you use them
the tools industry just as with screwdrivers and hammers
the tools industry
is only so large in the case of IT tools
they could be database tools
these IT tools are about a trillion dollars or so
but AI is not a tool
AI is work
that is the profound difference
AI is in fact
workers that can actually use tools
one of the things I'm really excited about
is the work that Aravind's doing at Perplexity
Perplexity
using web browsers to book vacations or do shopping
basically an AI using tools
Cursor is an AI
an agentic AI system that we use at Nvidia
every single software engineer at Nvidia uses Cursor
has improved our productivity tremendously
it's basically a partner
for every one of our software engineers
to generate code and it uses a tool
and the tool it uses is called VS Code
so Cursor is an AI agentic AI system that uses VS Code
well all of these different industries
these different industries
whether it's chatbots or digital biology
where we have AI assistant researchers
or what is a robo taxi
inside a robo taxi of course
it's invisible but obviously there's an AI chauffeur
that chauffeur is doing work
and the tool that it uses to do that work is the car
and so everything that we've made up until now
the whole world everything that we've made up until now
are tools tools for us to use
for the very first time
technology is now able to do work
and help us be more productive
the list of opportunities go on and on
which is the reason
why AI addresses the segment of the economy
that it has never addressed
it is a few trillion dollars
that sits underneath the tools
of a hundred trillion dollar global economy
now for the first time
AI is going to engage
that hundred trillion dollar economy
and make it more productive
make it grow faster make it larger
we have a severe shortage of labor
having AI that augments labor is going to help us grow
now what's interesting about this
from a technology industry perspective also
is that
in addition to the fact that AI is new technology
that addresses new segments of the economy
AI in itself is also a new industry
this token as I was explaining earlier
these numbers
after you tokenize
all these different modalities of information
there's a factory that needs to produce these numbers
unlike the computer industry
and the chip industry of the past
notice if you look at the chip industry of the past
the chip industry represents about 5 to 10%
maybe less 5% or so
of a multi trillion dollar
few trillion dollar IT industry
and the reason for that is
because
it doesn't take that much computation to use Excel
it doesn't take that much computation to use browsers
it doesn't take that much computation to use word
we do the computation but in this new world
there needs to be a computer that understands context
all the time it can't pre compute that
because every time you use the computer for AI
every time you ask the AI to do something
the context is different
so it has to process all of that information
environmental for example
in the case of a self driving car
it has to process the context of the car
context processing
what is the instruction you're asking the AI to do
then it's got to go and break down the problem
step by step reason about it
and come up with a plan and execute it
every single one of those steps
requires an enormous number of tokens to be generated
which is the reason why we need a new type of system
and I call it an AI factory
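The fan-out just described, context processing, decomposition, planning, execution, can be sketched with a stub in place of a real model. The step names and token counts here are made up; the point is only that one request multiplies into many token-generating steps, which is what drives factory-scale demand:

```python
# Toy sketch of agentic token fan-out: every user request triggers
# several steps, and every step generates tokens.

def run_agent(request, steps=("context", "decompose", "plan", "execute")):
    total_tokens = 0
    for step in steps:
        # stand-in for a model call; a real LLM would stream
        # tokens here, often thousands per step
        generated = f"[{step}] {request}".split()
        total_tokens += len(generated)
    return total_tokens

direct_answer = len("book a flight".split())   # one-shot reply: 3 tokens
agentic = run_agent("book a flight")           # every step generates more
assert agentic > 4 * direct_answer
```

With real models the multiplier is far larger than in this stub, because each reasoning step can emit thousands of tokens.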
it's an AI factory for sure
it's unlike a data center of the past
it's an AI factory
because this factory produces one thing
unlike the data centers of the past
that does everything stores files for all of us
runs all kinds of different applications
you could use that data center like
you can use your computer for all kinds of applications
you could use it to play a game one day
you could use it to browse the web
you could use it you know
to do your accounting
and so that is a computer of the past
a universal general purpose computer
the computer I'm talking about here is a factory
it runs basically one thing
it runs AI and its purpose
its purpose is designed to produce tokens
that are as valuable as possible
meaning they have to be smart
and you want to produce these tokens
at incredible rates
because when you ask an AI for something
you would like it to respond
and notice during peak hours
these AIs are now responding slower and slower
because it's got a lot of work to do
for a lot of people and so
you wanted to produce valuable tokens
at incredible rates
and you wanted to produce it cost effectively
every single word that I used
is consistent with an AI factory
with a car factory or any factory
it is absolutely a factory
and these factories
these factories never existed before
and inside these factories
are mountains and mountains of chips
which brings us
to today what happened in the last couple years
and in fact what happened this last year
something fairly profound happened this year actually
if you look in the beginning of the year
everybody has some attitude about AI
that attitude is generally
this is gonna be big it's gonna be the future
and somehow a few months ago
it kicked into Turbo Charge
and the reason for that is several things
the first is that we in the last couple years
have figured out how to make AI much
much smarter
rather than just pre training
pre training basically says
let's take all of the
all of the information that humans have ever created
let's give it to the AI to learn from
it's essentially memorization and generalization
it's not unlike going to school back when we were kids
the first stage of learning
pre training was never meant
just as preschool was never meant
to be the end of education
pre training preschool
was simply teaching you the basic skills
of intelligence so that you can understand
how to learn everything else
without vocabulary without understanding of language
and how to communicate how to think
it's impossible to learn everything else
the next is post training
post training after pre training
is teaching you skills skills to solve problems
break down problems reason about it
how to solve math problems
how to code
how to think about these problems step by step
use first principle reasoning
and then after that
is where computation really kicks in
as you know for many of us
you know we went to school
and that's in my case
decades ago but ever since
I've learned more and thought about more
and the reason for that is because
we're constantly grounding ourselves in new knowledge
we're constantly doing research
and we're constantly thinking
thinking is really what intelligence is all about
and so now we have three fundamental technology skills
we have these three technology
pre training which still requires enormous
enormous amount of computation
we now have post training
which uses even more computation
and now
thinking puts incredible amounts of computation
load on the infrastructure
because it's thinking on our behalf
for every single human so
the amount of computation necessary for AI to think
inference is really quite extraordinary
now I used to hear people say that inference is easy
Nvidia should do training
Nvidia's gonna do you know
they are really good at this
so they're gonna do training that inference was easy
how could thinking be easy
regurgitating memorized content is easy
regurgitating the multiplication tables easy
thinking is hard
which is the reason why these three scales
these three new scaling laws
all of them at full steam
have put so much pressure on the amount of computation
now another thing has happened
from these three scaling laws
we get smarter models
and these smarter models need more compute
but when you get smarter models
you get more intelligence
people use it
guys if anything happens
I wanna be the first one out
just kidding I'm sure it's fine
probably just lunch my stomach
was that me
and so so where was I
the smarter your models are
the smarter your models are
the more people use it
it's now more grounded it's able to reason
it's able to solve problems
it never learned how to solve before
because it could do research
go learn about it come back
break it down, reason about
how to answer your question
how to solve your problem
and go off and solve it the amount of
thinking is making the models more intelligent
the more intelligent it is
the more people use it the more intelligent it is
the more computation is necessary
but here's what happened this last year
the AI industry turned a corner
meaning that the AI models are now smart enough
they're worthy
they're worthy to pay for
Nvidia pays for every license of Cursor
and we gladly do it
we gladly do it because
Cursor is helping
a several-hundred-thousand-dollar employee
a software engineer or AI researcher be many
many times more productive
so of course
we'd be more than happy to do that
these AI models have become good enough
that they are worthy to be paid for
Cursor, ElevenLabs
Synthesia Abridge
OpenEvidence, the list goes on
of course OpenAI
of course Claude
these models are now so good
that people are paying for it
and because people are paying for it
and using more of it and every time they use more of it
you need more compute we now have two exponentials
these two exponentials one
is the exponential compute requirement
of the three scaling laws and the second exponential
the smarter it is
the more people use it
and the more people use it
the more computing it needs
two exponentials now
putting pressure on the world's computational resource
at exactly the time when I told you earlier
that Moore's Law has largely ended
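The two exponentials he describes can be sketched with toy numbers (the growth rates here are my own assumption for illustration, not figures from the talk): if per-model compute demand and usage each compound yearly, total demand grows as their product.

```python
# Toy sketch of the "two exponentials" (growth rates invented for
# illustration): per-model compute demand and usage each compound yearly,
# so total compute demand is their product.

def total_compute_demand(years, model_growth=2.0, usage_growth=2.0):
    """Relative compute demand after `years`, normalized to 1.0 at year 0."""
    return (model_growth * usage_growth) ** years

# With both exponentials at 2x/year, demand is 64x after three years,
# far outpacing a ~50%-more-transistors-every-couple-of-years cadence.
print(total_compute_demand(3))  # -> 64.0
```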
and so the question is
what do we do
if we have these two exponential demands growing
and if we don't
if we don't find a way to drive the cost down
then this positive feedback system
this circular feedback system
essentially called a virtuous cycle
essential for almost any industry
essential for any platform industry
it was essential for Nvidia
we have now reached a virtuous cycle of CUDA
the more applications people create
the more valuable CUDA is
the more valuable CUDA is
the more CUDA computers are purchased the
more developers wanna create applications for it
that virtuous cycle for Nvidia has now been achieved
after 30 years we have achieved that
also 15 years later we've achieved that for AI
AI has now reached the virtuous cycle
and so the more you use it
because the AI is smart and we pay for it
the more profit is generated
the more profit generated
the more compute is put on the grid
the more compute is put into AI factories
the more compute the AI becomes smarter
the smarter it is
the more people use it, the more applications use it
the more problems we can solve
this virtuous cycle is now spinning
what we need to do is drive the cost down tremendously
so that one
the user experience is better
when you prompt the AI it responds to you much faster
and two, so that we keep this virtuous cycle going
by driving its cost down so that it could get smarter
so that more people use it
and so on
and so forth, that virtuous cycle is now spinning
but how do we do that
when Moore's Law has really reached this limit
well the answer is called CO-DESIGN
you can't just design chips
and hope that the things
on top of it are gonna go faster
the best you could do in designing chips is add
I don't know 50% more transistors every couple of years
and if you added more transistors
you know we can add more transistors
and TSMC's got a lot of transistors
incredible company
we'll just keep adding more transistors
however that's all in percentages
not exponentials
we need to compound exponentials
to keep this virtuous cycle going
we call it extreme CO-DESIGN
and Nvidia is the only company in the world today
that literally starts from a blank sheet of paper
and can think about new fundamental architecture
computer architecture new chips
new systems new software
new model architecture
and new applications all at the same time
so many of the people in this room are here
because you're different parts of that layer that
different parts of that stack
and working with Nvidia
we fundamentally rearchitect everything
from the ground up
and then because AI is such a large problem
we scale it up we created a whole computer
a computer for the first time
that has scaled up into an entire rack
that's one computer one GPU
and then we scale it out
by inventing a new AI Ethernet technology
we call Spectrum-X Ethernet
everybody will say Ethernet is Ethernet
Ethernet is hardly Ethernet
Spectrum-X Ethernet is designed for AI performance
and it's the reason why it's so successful
and even that's not big enough
we'll fill this entire room of AI
supercomputers and GPUs
that's still not big enough
because the number of applications
and the number of users for AI
is continuing to grow exponentially
and we connect multiple of these data centers together
and we call that SCALE-ACROSS
Spectrum-XGS
Spectrum-X gigascale
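The three networking tiers named here compose multiplicatively, which a toy sketch makes concrete (rack and site counts below are invented for illustration; only the 72-GPU NVLink domain comes from the talk):

```python
# Toy sketch of the three tiers (counts invented for illustration):
# scale UP within one NVLink rack, scale OUT across racks with Spectrum-X
# Ethernet, and scale ACROSS data centers with Spectrum-XGS.

def total_gpus(gpus_per_rack=72, racks_per_site=100, sites=3):
    scale_up = gpus_per_rack               # one NVLink 72 domain, "one GPU"
    scale_out = scale_up * racks_per_site  # one data center fabric
    scale_across = scale_out * sites       # multiple linked data centers
    return scale_across

print(total_gpus())  # -> 21600
```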
by doing so we do CO-DESIGN at such a
such an enormous level such an extreme level
that the performance benefits are shocking
not 50% better each generation
not 25% better each generation
but much much more
this is the most extreme
CO-DESIGNed computer we've ever made
and quite frankly made in modern times
since the IBM System 360
I don't think a computer has been ground up
reinvented like this ever
this system was incredibly hard to create
I'll show you the benefits in just a second
but essentially what we've done
essentially what we've done
we've created otherwise
hey Janine you can come out
it's
you have to you have to meet me like halfway
alright so this is kind of like Captain America's shield
so
NVLink 72 NVLink 72
if we were to create one giant chip
one giant GPU this is what it would look like
this is the level of wafer scale processing
we would have to do
it's incredible all of this
all of these chips are now put into one giant rack
did I do that or did somebody else do that
into that one giant rack
you know
sometimes I don't feel like I'm up here by myself
just
this one giant rack
makes all of these chips work together as one
it's actually completely incredible
and I'll show you the benefits of that
the way it looks is this so thanks Janine
I I like this
yeah alright
ladies and gentlemen Janine Paul
I got it in the future
next I'm just gonna go like Thor
it's like when you're at home and
and you can't reach the remote
and you just go like this
and somebody brings it to you
that's yeah
same idea
it never happens to me I'm just dreaming about it
I'm just saying
okay so
so anyhow
anyhow um
we basically
this is what we created in the past
this is NVLink, NVLink 8
now these models are so gigantic
the way we solve it is we turn this model
this gigantic model into a whole bunch of experts
it's a little bit like a team
and so
these experts are good at certain types of problems
and we collect a whole bunch of experts together
and so this giant multi-trillion-parameter AI model
has all these different experts
and we put all these different experts on a GPU
now this is
NVLink 72
we could put all of the chips into one giant fabric
and every single expert can talk to each other
so the master
the primary expert
could talk to all of the other experts
and all of the necessary context and prompts
and the bunch of data
the bunch of tokens
that we have to send to all of the experts
whichever ones of the experts are selected
would then go off and try to respond
and then it would go off and do that
layer after layer after layer
sometimes eight, sometimes 16
sometimes 64, sometimes 256
but the point is there are more and more experts
well here
NVLink 72
we have 72 GPUs and because of that
we could put four experts in one GPU
the most important thing you need to do for each GPU
is to generate tokens
which is limited by the amount of bandwidth that you have
in HBM memory
here we have one GPU thinking for four experts
versus here because each one of the computers
can only put eight GPUs
we have to put 32 experts into one GPU
so this one GPU has to think for 32 experts
versus this system each GPU only has to think for four
and because of that
the speed difference is incredible
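The expert-placement argument can be put in numbers (the expert count and HBM bandwidth below are assumed values for illustration, not published specs): in a bigger NVLink domain each GPU hosts fewer experts, so every expert gets a larger share of its GPU's HBM bandwidth for generating tokens.

```python
# Numeric sketch of expert placement in a mixture-of-experts model
# (expert count and HBM bandwidth are assumptions, not real specs).

def experts_per_gpu(num_experts, gpus_in_fabric):
    # Even split of the model's experts across GPUs in one NVLink domain.
    return num_experts / gpus_in_fabric

def bandwidth_per_expert(hbm_tb_per_s, num_experts, gpus_in_fabric):
    # One GPU's HBM bandwidth is shared by all experts placed on it.
    return hbm_tb_per_s / experts_per_gpu(num_experts, gpus_in_fabric)

EXPERTS = 256                                 # a large MoE model (assumed)
print(experts_per_gpu(EXPERTS, 8))            # NVLink 8  -> 32.0 experts/GPU
print(round(experts_per_gpu(EXPERTS, 72)))    # NVLink 72 -> ~4 experts/GPU
print(bandwidth_per_expert(8.0, EXPERTS, 8))  # -> 0.25 TB/s per expert
```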
and this just came out
this is the benchmark done by semi analysis
they do a really really thorough job
and they benchmarked all of the GPUs
that are benchmarkable
and it turns out it's not that many
if you look at the list
the list of GPUs you can actually benchmark
it's like 90% Nvidia okay
so we're comparing ourselves to ourselves
but the second best GPU in the world is the H200
and it runs all the workloads
Grace Blackwell per GPU is 10 times the performance
now how do you get 10 times the performance
when it's only twice the number of transistors
well the answer is extreme CO-DESIGN
and
by understanding the nature of the future of AI models
and we're thinking across that entire stack
we can create architectures for the future
this is a big deal
it says we can now respond a lot faster
but this is an even bigger deal
this next one look at this
this says that the lowest cost tokens in the world
are generated by Grace Blackwell
and NVLink 72
on the one hand
GB200 is the most expensive computer
on the other hand
its token generation capability is so great
that it produces it at the lowest cost
because the tokens per second
divided by the total cost of ownership of Grace Blackwell
is so good it is the lowest cost way to generate tokens
by doing so delivering incredible performance
10 times the performance
delivering 10 times lower cost
that virtuous cycle can continue
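The TCO argument can be sketched with placeholder numbers (none of these are Nvidia figures): a system can cost several times more and still generate the cheapest tokens, as long as its throughput advantage exceeds its cost premium.

```python
# Placeholder-number sketch of the cost-per-token claim (invented figures):
# higher total cost of ownership can still yield the lowest cost per token
# when throughput grows faster than price.

def cost_per_million_tokens(tco_usd, tokens_per_second, lifetime_seconds):
    lifetime_tokens = tokens_per_second * lifetime_seconds
    return tco_usd / lifetime_tokens * 1e6

LIFETIME = 4 * 365 * 24 * 3600          # assume a 4-year service life
a = cost_per_million_tokens(1_000_000, 1_000, LIFETIME)   # cheaper system
b = cost_per_million_tokens(3_000_000, 10_000, LIFETIME)  # 3x cost, 10x speed
print(b < a)  # -> True: the pricier system makes cheaper tokens
```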
which then brings me to this one
I just saw this literally yesterday
this is uh
the CSP CAPEX
people are asking me about CAPEX these days
and this is a good way to look at it
in fact the CAPEX of the top six CSPs
and this one is Amazon
CoreWeave, Google, Meta
Microsoft, and Oracle okay
these CSPs together
are going to invest this much in CAPEX
and I would
I would tell you the timing couldn't be better
and the reason for that is
now we have the Grace Blackwell NVLink 72
in all volume production supply chain
everywhere in the world is manufacturing it
so we can now deliver to all of them
this new architecture so that the CAPEX invests
in computers that deliver the best TCO
now underneath this
there are two things that are going on
so when you look at this
it's actually fairly extraordinary
and it's fairly extraordinary anyhow
but what's happening under
underneath is this there are two platform shifts
happening at the same time
one platform shift
is going from general purpose computing
to accelerated computing
remember Accelerated Computing
as I mentioned to you before
it does data processing it does image processing
computer graphics, it does computation
of all kinds
it runs SQL, it runs
Spark, it runs
you know
you ask it, you tell us what you need to have run
and I'm fairly certain
we have an amazing library for you
you could be you know
a data center trying to make masks
to manufacture semiconductors
we have a great library for you
and so underneath irrespective of AI
the world is moving from general purpose computing
to accelerated computing irrespective of AI
and in fact many of the CSPs already have services
that have been here long ago
before AI remember
they were invented in the era of machine learning
classical machine learning algorithms like XGBoost
algorithms like
data frames that are used for recommender systems
collaborative filtering content filtering
all of those technologies
were created in the old days
of general purpose computing
even those algorithms even those architectures
are now better with accelerated computing
and so even without AI
the world's CSPs are going to invest into acceleration
and
Nvidia's GPU is the only GPU that can do all of that
plus AI, an ASIC might be able to do AI
but it can't do any of the others
Nvidia could do all of that
which explains why
it is so safe to just lean into Nvidia's architecture
we have now reached our virtuous cycle
our inflection point and this is quite extraordinary
I have many partners in the room
and all of you are part of our supply chain
and I know how hard you guys are working
I wanna thank all of you for how hard you are working
thank you very much
now I'm gonna show you why
this is what's going on in our company's business
we're seeing extraordinary growth for Grace Blackwell
for all the reasons that I just mentioned
it's driven by two exponentials we now have visibility
I think we're probably the first technology company
in history
to have visibility into half a trillion dollars
of cumulative Blackwell
and early ramps of Rubin through 2026
and as you know 2025 is not over yet
and 2026 hasn't started
this is how much business is on the books
half a trillion dollars worth so far
now this is out of that
we've already shipped 6 million of the Blackwells
in the first several quarters
I guess the first four quarters of production
three and a half quarters of production
we still have one more quarter to go for 2025
and then we have four quarters
so the next five quarters
there's $500 billion
half a trillion dollars
that's five times the growth rate of Hopper
that kind of tells you something
this is Hopper's entire life
this doesn't include China
and Asia so this is just
the West okay
we're excluding China so Hopper
in its entire life 4 million GPUs Blackwell
each one of the Blackwells has two GPUs in it
in one large package 20 million GPUs of Blackwells
in the early parts of Rubin
incredible growth
so I want to thank all of our supply chain partners
everybody I know how hard you guys are working
I made a video to celebrate your work let's play it
the age of AI has begun
Blackwell is its engine an engineering marvel
in Arizona it starts as a blank silicon wafer
hundreds of chip processing
and ultraviolet lithography steps
build up each of the 200 billion transistors
layer by layer on a 12 inch wafer
in Indiana HBM stacks will be assembled in parallel
HBM memory dies with 1,024 I/Os
are fabricated using advanced EUV technology
through-silicon vias are used in the back end
to connect 12 stacks of HBM memory dies
and base die to produce HBM
meanwhile
the wafer is scribed into individual Blackwell die
tested and sorted
separating the good dies to move forward
the chip on wafer on substrate process attaches
32 Blackwell dies and 128 HBM stacks
on a custom silicon interposer wafer
metal interconnect traces are etched directly into it
connecting Blackwell GPUs and HBM stacks
into each system and package unit
locking everything into place
then the assembly is baked
molded and cured
creating the GB300 Blackwell Ultra Superchip
in Texas robots will work around the clock
to pick and place over 10,000 components
onto the Grace Blackwell PCB
in California
ConnectX-8 SuperNICs for scale out communications
and BlueField-3 DPUs
for offloading and accelerating networking
storage and security are carefully assembled
into GB300 compute trays
NVLink is the breakthrough
high speed link that Nvidia invented
to connect multiple GPUs and scale up into a massive
virtual GPU
the NVLink switch tray
is constructed with NVLink switch chips
providing 14.4 terabytes per second
of all to all bandwidth
NVLink spines form a custom blind mated backplane
with 5,000 copper cables connecting all 72 Blackwells
or 144 GPU dies into one giant GPU
delivering 130 terabytes per second
of all to all bandwidth
nearly the global internet's peak traffic
skilled technicians assemble each of these parts
into a rack scale AI supercomputer
in total 1.2 million components
two miles of copper cable
130 trillion transistors weighing nearly two tons
from silicon in Arizona and Indiana to systems in Texas
Blackwell and future Nvidia AI factory generations
will be built in America
writing a new chapter in American history and industry
America's return to making and re industrialization
reignited by the age of AI
the age of AI has begun
made in America
made for the world
in America again it is incredible
the first thing that President Trump asked me for
is bring manufacturing back
bring manufacturing back because it's
it's necessary for national security
bring manufacturing back because we want the jobs
we want that part of the economy
and nine months later nine months later
we are now manufacturing in full production
Blackwell in Arizona
GB200, Grace
Blackwell NVLink 72, extreme
CO-DESIGN gives us 10x generationally
it's utterly incredible now
the part that's really incredible is this
this is the first AI supercomputer we made
this is in 2016
when I delivered it to a startup in San Francisco
which turned out to have been OpenAI
this was the computer and in order to do that
create that computer we designed one chip
we designed one new chip
in order for us to do CO-DESIGN
now look at all of the chips we have to do
this is what it takes you're not going to take one chip
and make a computer 10 times faster
that's not gonna happen
the way to make computers 10 times faster
so that we can keep increasing the performance
exponentially
we can keep driving cost down exponentially
is extreme CO-DESIGN
and working on all these different chips
at the same time we now have Rubin back home
this is Rubin, this is the Vera
Rubin
ladies and gentlemen, Rubin
this is this is our third generation
NVLink 72 rack scale computer
third generation, GB200 was the first one
all of our partners around the world
I know how hard you guys worked
it was insanely hard it was insanely hard to do
second generation so much smoother
and this generation look at this
completely cableless
completely cableless and this is
this is all back in the lab now
this is the next generation Rubin
while we're shipping GB300s
we're preparing Rubin to be in production
you know this time next year
maybe slightly earlier and so every single year
we are gonna come up with the most extreme
CO-DESIGN system
so that we can keep driving up performance
and keep driving down the token generation cost
look at this
this is just an incredibly beautiful computer now this
so this is amazing this is 100 petaflops
I know this doesn't mean anything
100 petaflops
but compared to the DGX-1 I delivered to OpenAI
nine years ago
it's 100 times the performance right here
a hundred times that supercomputer
let's see
a hundred of those would be like 25 of these racks
all replaced by this one thing
one Vera Rubin okay
so this is the compute tray
and this is
the Vera Rubin Superchip
okay and this is the compute tray
right up here
it's incredibly easy to install
just flip these things open
shove it in even I could do it
OK and this is the Vera
Vera Rubin Compute Tray
if you decide you wanted to add a special processor
we've added another processor
it's called a context processor
because the amount of context that we give
AI's are larger and larger
we want it to read a whole bunch of PDFs
before answering a question
want it to read a whole bunch of arXiv papers
watch a whole bunch of videos
go learn all this before you answer a question for me
all of that context processing could be added
and so you see on the bottom
eight ConnectX-9s
the new SuperNICs
you have CPXs, eight of them
you have a BlueField-4, this new data processor
two Vera CPUs and four Rubin packages
or eight Rubin GPUs all of that in this one node
completely cableless 100% liquid cooled
and then this new processor
I won't talk too much about it today
I don't have enough time
but this is completely revolutionary
and the reason for that is
because your AI's need to have more and more memory
you're interacting with it more
you want it to remember our last conversation
everything that it's learned on my behalf
please don't forget it when I come back next time
and so all of that memory
is going to create this thing called KV caching
and that KV caching retrieving it
you might have noticed every time you go into
your AIs these days
it takes longer and longer
to refresh and retrieve
all of the previous conversations and
and the reason for that is
we need a revolutionary new processor
and that's called BlueField-4
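The KV-cache mechanism he is alluding to can be sketched minimally (this is a generic illustration of KV caching, not BlueField-4's actual interface): each generated token appends one key/value pair per attention layer, so a long-running conversation accumulates a cache that must be stored and retrieved every time you return to it.

```python
# Minimal sketch of KV caching (generic illustration, not an Nvidia API):
# each past token's attention keys/values are stored once, so the model
# never recomputes them; resuming a conversation means reloading a cache
# that grows linearly with context length.

class KVCache:
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k, v):
        # One new token -> one new (key, value) pair per attention layer.
        self.keys.append(k)
        self.values.append(v)

    def size(self):
        return len(self.keys)

cache = KVCache()
for token_id in range(5):          # pretend we generated 5 tokens
    cache.append(f"k{token_id}", f"v{token_id}")

# The next token attends over all cached pairs instead of recomputing them.
print(cache.size())  # -> 5
```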
next is the ConnectX switch
excuse me, the NVLink switch
which is right here
okay this is the NVLink switch
this is what makes it possible for us
to connect all of the computers together
and this switch is now several times the bandwidth
of the entire world's peak internet traffic
and so that spine is going to communicate
and carry all of that data simultaneously
to all of the GPUs on top of that
on top of that
this is the Spectrum-X switch
and this Ethernet switch
was designed so that all of the processors
could talk to each other at the same time
and not gum up the network
gum up the network that's very technical
okay so um
so these are the these three combined
and then this is the quantum switch
this is for InfiniBand this is Ethernet
we don't care what language you would like to use
whatever standard you like to use
we have great scale out fabrics for you
whether it's Quantum InfiniBand or Spectrum-X Ethernet
this one uses silicon photonics
and is completely co-packaged optics
basically the laser comes right up to the silicon
and connects it to our chips
okay so this is the Spectrum-X Ethernet
and so now let's talk about
thank you oh
this is this is what it looks like
this is a rack
this is two tons
1.5 million parts
and this spine right here
carries the entire internet traffic in one second
same speed
moves it across all of these different processors
hundred percent liquid cooled
all for the you know
fastest token generation rate in the world
okay so that's what a rack looks like
now that's one rack
a gigawatt data center would have
you know call it
let's see
16 racks, and then 500 of those
so 500 times 16
call it 8,000 of these
and that would be a 1 gigawatt data center
okay and so that's a future AI factory
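The rack-count arithmetic above can be cross-checked from the power side (the per-rack power figure is my assumption, chosen to match the ~8,000-rack ballpark from the talk):

```python
# Rough cross-check of the rack-count ballpark (per-rack power is an
# assumption, not a published spec).

def racks_per_datacenter(datacenter_watts, watts_per_rack):
    return datacenter_watts / watts_per_rack

# Assuming roughly 125 kW per rack, a 1 GW facility holds about 8,000
# racks, matching the figure quoted on stage.
print(racks_per_datacenter(1e9, 125e3))  # -> 8000.0
```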
now we used
as you notice Nvidia started out by designing chips
and then we started to design systems
and we designed AI supercomputers
now we're designing entire AI factories
every single time we move out
and we integrate more of the problem to solve
we come up with better solutions
we now build entire AI factories
this AI factory is what we're building for Vera Rubin
and we created a technology
that makes it possible for all of our partners
to integrate into this factory digitally
let me show it to you
the next industrial revolution is here
and with it a new kind of factory
AI infrastructure is an ecosystem scale challenge
requiring hundreds of companies to collaborate
Nvidia Omniverse DSX
is a blueprint for building and operating
gigascale AI factories
for the first time the building power
and cooling
are CO-DESIGNed with Nvidia's AI infrastructure stack
it starts in the Omniverse Digital Twin
Jacobs Engineering
optimizes compute density and layout
to maximize token generation
according to power constraints
they aggregate SimReady OpenUSD assets from Siemens
Schneider Electric, Trane
and Vertiv into PTC's product lifecycle management
then simulate thermals and electrics with CUDA
accelerated tools from ETAP and Cadence
once designed, Nvidia
partners like Bechtel and Vertiv
deliver prefabricated modules
factory built tested
and ready to plug in
this shrinks build time significantly
achieving faster time to revenues
when the physical AI factory comes online
the digital twin acts as an operating system
engineers prompt AI agents from Phaidra and Emerald AI
previously trained in the digital twin
to optimize power consumption
and reduce strain on both the AI factory and the grid
in total for a 1 gigawatt AI factory
DSX optimizations
can deliver billions of dollars in additional revenue
per year
across Texas
Georgia and Nevada
Nvidia's partners are bringing DSX to life
in Virginia
Nvidia is building an AI factory research center
using DSX to test and productize Vera Rubin
from infrastructure to software
with DSX Nvidia
partners around the world
can build and bring up AI infrastructure
faster than ever
completely in digital
long before Vera Rubin exists as a real computer
we've been using it as a digital twin computer
long before these AI factories exist
we will use it, we will design it
we'll plan it, we'll optimize it
and we'll operate it as a digital twin
and so all of our partners that are working with us
I'm incredibly happy for all of you supporting us
and GE is here and GE Vernova is here
Schneider, I think
Olivier Blum is here
Siemens, incredible partners OK
Roland Busch I think he's watching
hi Roland
and so anyways really
really great partners working with us
in the beginning we had CUDA
and we have all these different ecosystems
of software partners now we have Omniverse
DSX and we're building AI factories
and again
we have these incredible ecosystem of partners
working with us let's talk about models
open source models
particularly in the last couple years
several things have happened
one open source models have become quite capable
because of reasoning capabilities
they've become quite capable because of their multimodality
and they're incredibly efficient
because of distillation
so all of these different capabilities
have made open source models for the very first time
incredibly useful for developers
they are now the lifeblood of startups
lifeblood of startups in different industries
because obviously as I mentioned before
each one of the industries has its own use case
its own data, its own flywheels
all of that capability
that domain expertise needs to have
the ability to embed into a model
open source makes that possible
researchers need open source
developers need open source
companies around the world
we need open source, open source models are really
really important
The United States has to lead in open source as well
we have amazing proprietary models
we have amazing proprietary models
we need also amazing open source models
our country depends on it
our startups depend on it
and so Nvidia is dedicating ourselves to go do that
we are now the largest
we lead in open source contribution
we have 23 models in leaderboards
we have all these different domains
from language models to physical AI models
I'm gonna talk about robotics models to biology models
each one of these models has enormous teams
and that's one of the reasons
why we build supercomputers for ourselves
to enable all these models to be created
we have number one speech model
number one reasoning model
number one physical AI model
the number of downloads is really
really terrific we are dedicated to this
and the reason for that is because science needs it
researchers need it startups need it
and companies need it
I'm delighted that AI startups build on Nvidia
they do so for several reasons
one of course
our ecosystem is rich our tools work great
all of our tools work on all of our GPUs
our GPUs are everywhere
it's literally in every single cloud
it's available on prem you could build it yourself
you could
you know, build up
an enthusiast gaming PC with multiple GPUs in it
and you could download our software stack
and it just works
and we have the benefit of rich developers
who are making this ecosystem richer and richer
and richer so
I'm really pleased
with all of the startups that we're working with
I'm I'm thankful for that
it is also the case that many of these startups
are now starting to create
even more ways to enjoy our GPUs
CoreWeave, Nscale
Nebius, Lambda
Crusoe, all of these companies
are building these new GPU clouds to serve the startups
and I really appreciate that this is all possible
because Nvidia is everywhere
we integrate our libraries
all of the CUDA X libraries I talk to you about
all the open source AI models that I talked about
all of the models that I talked about
we integrated into AWS for example
really love working with Matt
we integrated into Google Cloud
for example really love working with Thomas
each one of these clouds integrate Nvidia GPUs
and our computing our libraries as well as our models
love working with Satya over at Microsoft Azure
love working with
Clay at Oracle
each one of these clouds integrate the Nvidia stack
as a result wherever you go
whichever cloud you use, it works incredibly well
we also integrate Nvidia libraries into the world SaaS
so that each one of these SaaS
will eventually become agentic SaaS
I love Bill McDermott's vision for ServiceNow
there you go
I think that might have been Bill
hi Bill
and so ServiceNow
what is it, 85% of the world's enterprise
workflows, SAP
80% of the world's commerce
Christian Klein and I are working together
integrate Nvidia libraries
CUDA-X, NeMo, and Nemotron
all of our AI systems into SAP
working with Sassine over at Synopsys
accelerating the World of CAE
Cadence EDA tools
so that they could be faster and get scale
helping them create AI agents
one of these days I would love to hire AI agent
ASIC designers to work with our ASIC designers
essentially the Cursor of Synopsys
if you will, we're working with
Anirudh, Anirudh is here
I saw him earlier today he was part of the pregame show
Cadence doing incredible work
accelerating their stack
creating AI agents so that we can have Cadence AI
ASIC designers and system designers working with us
today we're announcing a new one
AI will supercharge productivity
AI will transform just about every industry
but AI will also supercharge
cyber security challenges the bad AI's
and so we need an incredible defender
I can't imagine a better defender than CrowdStrike
George, George is here
he was here
yep I saw him earlier
we are partnering with CrowdStrike
to make cybersecurity speed of light
to create a system that has cybersecurity
AI agents in the cloud
but also incredibly good AI agents
on prem or at the edge this way you
whenever there's a threat
you are moments away from detecting it
we need speed and we need a fast agentic AI
super agent super smart AIs
I have a second announcement
this is the single fastest
enterprise company in the world
probably the single most important enterprise stack
in the world today
Palantir Ontology
anybody from Palantir here
I was just talking to Alex earlier
this is Palantir Ontology
they take information
they take data they take human judgement
and they turn it into business insight
we work with Palantir
to accelerate everything Palantir does
so that we could do data processing
data processing at a much
much larger scale and more speed
whether it's structured data of the past
and of course we'll have structured data
human recorded data unstructured data
and process that data for our government
for national security
and for enterprises around the world
process that data at speed of light
and to find insight from it
this is what it's gonna look like in the future
Palantir is going to integrate Nvidia
so that we could process at the speed of light
in an extraordinary scale
okay Nvidia and Palantir
let's talk about physical AI
physical AI requires three computers
just as it takes two computers for a language model
one to train it
evaluate it, and then one to inference it
okay so that's the large GB200 that you see
in order to do it for physical AI
you need three computers
you need a computer to train it
this is GB200 the Grace Blackwell NVLink 72
we need a computer that does all of the simulations
that I showed you earlier
with Omniverse, DDS
it basically is a digital twin
for the robot to learn how to be a good robot
and for the factory to essentially be a digital twin
that computer is the second computer
the Omniverse computer this computer
has to be incredibly good at generative AI
and it has to be good at computer graphics
sensor simulation ray tracing
signal processing
this computer is called the Omniverse computer
and once we train the model
simulate that AI inside a digital twin
and that digital twin
could be a digital twin of a factory
as well as a whole bunch of digital twins of robots
then you need to operate that robot
and this is the robotic computer
this one goes into a self-driving car
half of it could go into a robot okay
or you could actually have
you know robots that are quite agile and
quite fast in operations
and it might take two of these computers
and so this is the Jetson Thor
robotics computer
these three computers all run CUDA
and it makes it possible for us to advance physical AI
AI that understands the physical world
understands laws of physics causality
object permanence you know
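The three-computer workflow described above has a simple shape: train on the big system, validate inside a digital twin, and only then deploy to the onboard robotics computer. Here is a minimal sketch of that gated pipeline; all function and class names are illustrative stand-ins, not real NVIDIA APIs.

```python
# Hypothetical sketch of the three-computer physical-AI workflow:
# train, simulate in a digital twin, then deploy. Names are
# illustrative only, not real NVIDIA interfaces.

from dataclasses import dataclass

@dataclass
class Policy:
    """A trained robot control policy (stand-in for a real model)."""
    version: int

def train(dataset: list[str]) -> Policy:
    # Stage 1: the training computer (e.g. a Grace Blackwell rack).
    return Policy(version=1)

def simulate(policy: Policy, episodes: int) -> float:
    # Stage 2: an Omniverse-style digital twin scores the policy
    # against simulated sensors and physics before real hardware.
    return 0.97  # placeholder success rate

def deploy(policy: Policy, success_rate: float, threshold: float = 0.95) -> bool:
    # Stage 3: the robotic computer (e.g. Jetson Thor) runs the
    # policy only after it clears the simulation gate.
    return success_rate >= threshold

policy = train(["demo_episode_1", "demo_episode_2"])
rate = simulate(policy, episodes=1000)
print("deploy to robot:", deploy(policy, rate))
```

The point of the gate is that the same CUDA software stack spans all three stages, so a policy that passes in simulation can move to the robot unchanged.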
physical AI we have incredible partners working with us
to create the physical AI of factories
we're using it ourselves to create our factory in Texas
now once we create the robotic factory
we have a bunch of robots that are inside it
and these robots also need physical AI
they apply physical AI and work inside the digital twin
let's take a look at it
America is reindustrializing
reshoring manufacturing across every industry
in Houston Texas
Foxconn is building a state of the art
robotic facility for manufacturing
Nvidia AI infrastructure systems
with labor shortages and skills gaps digitalization
robotics and physical AI are more important than ever
the factory is born digital
in Omniverse
Foxconn engineers assemble their virtual factory
in a Siemens digital twin solution
developed on Omniverse Technologies
every system mechanical
electrical plumbing
is validated before construction
Siemens Plant Simulation runs design
space exploration optimizations
to identify ideal layout
when a bottleneck appears
engineers update the layout with changes
managed by Siemens Teamcenter
in Isaac Sim the same digital twin is used
to train and simulate robot AI's
in the assembly area
FANUC manipulators build GB300 tray modules
bimanual manipulators from FII
and Skild AI install bus bars into the trays
and AMRs shuttle the trays to the test pods
then
Foxconn uses Omniverse for large scale sensor simulation
where robot AIs learn to work as a fleet
in Omniverse Vision AI agents built on Nvidia
Metropolis and Cosmos
watch the fleets of robots and workers from above
to monitor operations and alert Foxconn
engineers of anomalies
and safety violations
or even quality issues
and to train new employees
agents power
interactive AI coaches for easy worker onboarding
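The vision agents described above watch the robot fleet and alert engineers to anomalies and safety violations. A toy sketch of that monitoring logic, with entirely hypothetical names (this is not a real Metropolis or Cosmos API):

```python
# Illustrative sketch of a vision agent that watches a robot fleet
# in a digital twin and flags safety violations. All names and
# thresholds here are hypothetical.

from dataclasses import dataclass

@dataclass
class Detection:
    robot_id: str
    zone: str          # where the robot was seen in this frame
    speed_mps: float   # estimated speed in meters per second

RESTRICTED_ZONES = {"human_walkway"}  # hypothetical keep-out zone
MAX_SAFE_SPEED = 1.5                  # hypothetical speed limit

def alerts(frame: list[Detection]) -> list[str]:
    """Return human-readable alerts for one frame of detections."""
    out = []
    for d in frame:
        if d.zone in RESTRICTED_ZONES:
            out.append(f"{d.robot_id}: entered restricted zone {d.zone}")
        if d.speed_mps > MAX_SAFE_SPEED:
            out.append(f"{d.robot_id}: overspeed {d.speed_mps:.1f} m/s")
    return out

frame = [Detection("amr-7", "assembly", 1.2),
         Detection("amr-9", "human_walkway", 0.8)]
print(alerts(frame))
```

In the real system the detections would come from simulated or live camera feeds rather than hand-built records, but the agent's job is the same: turn fleet observations into actionable alerts.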
the age of US reindustrialization is here
with people and robots working together
that's the future of manufacturing
the future of factories I want to thank our partner
Foxconn Young Liu
the CEO is here
but all of these ecosystem partners
make it possible for us
to create the future of robotic factories
the factory is essentially a robot
that's orchestrating robots
to build things that are robotic
you know this is
the amount of software necessary to do this
is so intense
that unless you could plan it design it
and operate it inside a digital twin
the hopes of getting this to work are nearly impossible
I'm so happy to see also that Caterpillar
my friend Joe Creed and his hundred year old company
is also incorporating digital twins
in the way they manufacture
um these factories will have future robotic systems
and one of the most advanced is Figure
Brett Adcock is here today
he founded the company three and a half years ago
they're worth almost $40 billion today
we're working together training the AI
training the robot
simulating the robot and of course
the robotic computer that goes into figure
really amazing I had the benefit of seeing it
it's really quite extraordinary
it is very likely that humanoid robots
and my friend Elon is also working on this
are going to be
one of the largest new consumer electronics markets
and surely
one of the largest industrial equipment markets
Peggy
Johnson and the folks at Agility are working with us
on robots for warehouse automation
the folks at Johnson & Johnson working with us again
training the robot simulating it in digital twins
and also operating the robot
these Johnson & Johnson surgical robots
are even going to perform surgery
that is completely non-invasive
surgery at a precision the world's never seen before
and of course the cutest robot ever
the cutest robot ever the Disney robot
and this is
something really close to our heart
we're working with Disney Research
on an entirely new framework and simulation platform
based on revolutionary technology called Newton
and that Newton
simulator makes it possible for
the robot to learn how to be a good robot
inside a physically aware
physically based environment let's take a look at it
Blue ladies and gentlemen
Disney Blue tell me that's not adorable
isn't he adorable
we all want one we all want one
now remember everything you were just seeing
that is not animation it's not a movie
it's a simulation that simulation is in Omniverse
Omniverse the digital twin
so these digital twins of factories
digital twins of warehouses
digital twins of surgical rooms
digital twins
where Blue could learn how to manipulate and navigate
and you know
interact with the world
all completely done in real time
this is going to be the largest consumer electronics
product line in the world
some of them are just really working incredibly well
now this is the future of humanoid robotics
and of course
Blue okay
now humanoid robots are still in development
but meanwhile
there's one robot
that is clearly at an inflection point
and it is basically here and that is a robot on wheels
this is a robotaxi
a robotaxi is essentially an AI chauffeur
now one of the things that we're doing today
we're announcing the Nvidia Drive Hyperion
this is a big deal
we created this architecture
so that every car company in the world
could create vehicles could be commercial
could be passenger could be dedicated robotaxis
create vehicles that are robotaxi ready
the sensor suite
with surround cameras and radars and lidar
make it possible for us to achieve
the highest level of "surround cocoon"
sensor perception and redundancy
necessary for the highest level of safety
Drive Hyperion
Drive Hyperion is now designed into Lucid
Mercedes-Benz my friend Ola Källenius
the folks at Stellantis
and there are many other cars coming
and once you have a basic standard platform
then developers of AV systems
and there's so many talented ones
Wayve Waabi
Aurora Momenta and Nuro
WeRide there's so many of them
that can then take their AV system
and run it on the standard chassis
basically the standard chassis
has now become a computing platform on wheels
and because it's standard
and the sensor suite is comprehensive
all of them could deploy their AI to it
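The "standard chassis as computing platform" idea above is essentially an interface contract: one fixed sensor and actuator suite that any AV developer's stack can target. A minimal sketch of that pattern, with illustrative class and method names (not the actual Drive Hyperion API):

```python
# Hedged sketch of a standard chassis exposing one interface that
# any AV stack can plug into. Names are illustrative only.

from typing import Protocol

class AVStack(Protocol):
    def plan(self, sensors: dict) -> dict:
        """Map a sensor snapshot to a driving command."""
        ...

class KeepLaneStack:
    # Trivial stand-in for a real AV system (Wayve, Aurora, etc.).
    def plan(self, sensors: dict) -> dict:
        return {"steer": 0.0,
                "throttle": 0.2 if sensors["clear_ahead"] else 0.0}

class StandardChassis:
    """The common platform: same sensor suite for every developer."""
    def __init__(self, stack: AVStack):
        self.stack = stack

    def step(self, sensors: dict) -> dict:
        # The chassis presents cameras/radar/lidar uniformly and
        # executes whatever command the plugged-in stack returns.
        return self.stack.plan(sensors)

car = StandardChassis(KeepLaneStack())
print(car.step({"clear_ahead": True}))
```

Because every vehicle exposes the same interface and the same comprehensive sensor suite, swapping one AV stack for another is just a matter of plugging a different `plan` implementation into the chassis.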
let's take a quick look
okay that's beautiful San Francisco
and as you can see
the robotaxi inflection point is about to get here
and in the future
a trillion miles a year are driven
100 million cars made each year
there's some 50 million taxis around the world
it's gonna be augmented by a whole bunch of robotaxis
so it's gonna be a very large market
to connect it and deploy it around the world
today we're announcing a partnership with Uber
Uber Dara
Dara Khosrowshahi
Dara is going to
we're working together to connect these Nvidia drive
Hyperion cars into a global network
and now in the future you'll
you know be able to hail one of these cars
and the ecosystem is going to be incredibly rich
and we'll have Hyperion
or robotaxi cars all over the world
this is going to be a new computing platform for us
and I'm expecting it to be quite successful
okay
so this is what we talked about today
we talked about a large number of things
remember at the core of this
are two platform transitions from general purpose
computing to accelerated computing
Nvidia CUDA and those suite of libraries called CUDA X
has enabled us to address practically every industry
and we're at the inflection point
it is now growing as a virtuous cycle would suggest
the second inflection point is now upon us
the second platform transition
AI from classical
handwritten software to artificial intelligence
two platform transitions happening at the same time
which is the reason
why we're feeling such incredible growth
we spoke about quantum computing
we spoke about open models
we spoke about enterprise
with CrowdStrike and Palantir
accelerating their platforms
uh we spoke about robotics
a new potentially
one of the largest consumer electronics
and industrial manufacturing sectors
and of course we spoke about 6G
Nvidia has new platforms for 6G
we call it Ark
we have a new platform for robotaxi cars
we call that Hyperion
we have new platforms even for factories
two types of factories the AI factory
we call that DSX and then factories with AI
we call that Mega
and so now we're also manufacturing in America
ladies and gentlemen thank you for joining us today
and thank you
for allowing us to bring GTC to Washington DC
we're gonna do it hopefully every year
and thank you all for your service
and making America great again thank you