Groq Founder, Jonathan Ross: OpenAI & Anthropic Will Build Their Own Chips & Will NVIDIA Hit $10TRN
By 20VC with Harry Stebbings
Summary
Key takeaways
- **AI demand is insatiable; compute is the bottleneck**: The demand for AI compute is insatiable, with companies like OpenAI and Anthropic being compute-limited. If they were given twice the inference compute, their revenue would nearly double within a month. [00:35], [09:44]
- **Hyperscalers must spend to maintain market leadership**: Hyperscalers are spending 'like drunken sailors' on AI not purely for economic reasons, but to maintain their leadership positions. The alternative to not spending is being completely locked out of their business. [04:46], [04:56]
- **Building AI chips is incredibly difficult**: Building AI chips is extremely hard, with a low probability of success. It's not just about designing the chip, but also the software and keeping up with the rapid pace of AI development. [07:33], [12:53]
- **The US has a compute advantage in the 'away game'**: The US has a clear advantage in the 'away game' of AI, which involves supporting allies. This advantage stems from having more energy-efficient chips, which is crucial for countries with limited power capacity. [32:19], [32:38]
- **Speed is crucial for AI engagement and brand affinity**: Speed of AI response is paramount, directly correlating with customer engagement and brand affinity. Companies that can deliver faster responses will win deals and build stronger customer relationships. [11:17], [11:53]
- **Nvidia's dominance rests on supply chain control, not just tech**: Nvidia's market dominance is partly due to its control over the HBM supply chain, creating a 'monopsony' situation. This lets Nvidia secure supply years in advance, a significant advantage. [14:28], [17:00]
Topics Covered
- Compute scarcity will define the AI race.
- AI investment isn't a bubble; it's a survival imperative.
- AI's real value is speed, not just cost savings.
- Building your own AI chip offers control, not necessarily cost savings.
- The US has a compute advantage in the global 'away game'.
Full Transcript
The countries that control compute will
control AI. And you cannot have compute
without energy.
>> So, I'm thrilled to welcome Jonathan Ross, founder and CEO of Groq, back to the
hot seat. And now we're going to be able
to add more labor to the economy by
producing more compute and better AI.
That has never happened in the history
of the economy before. What is that
going to do? I personally would be
surprised if in 5 years Nvidia wasn't
worth 10 trillion. But I can't predict
the outcome. The demand for compute is
insatiable. If OpenAI were given twice
the inference compute that they have
today, if Anthropic was given twice the
inference compute that they have today,
within 1 month from now, their revenue
would almost double. I'm sorry. Can you
unpack that for me?
>> Ready to go.
[Music]
Jonathan, you've just been told by our
team that our last show was the most
successful of uh the year when it came
out. So, there's no pressure at all that
this is going to be the most successful
of this year. But, welcome to the
studio, man. Thank you. It's great to
have you here, dude. Now, I wanted to start with an understanding of where we
are. It seems the world moves faster
than ever before, and honestly, I think
a lot of us are trying to understand
where everyone lies in a new market. If
we'd look at the current state of the
market today, how do you analyze it?
>> Are you asking is there a bubble?
>> Relatively.
>> Okay. So, in terms of whether or not there's a bubble, my answer is: if you ask a question and you keep not getting an answer, maybe you should ask a different question. And so, instead of asking is
there a bubble, you should ask what is
the smart money doing? So, what is
Google doing? What is Microsoft doing?
Amazon, what are some nations doing? And
they're all doubling down on AI. They're
spending more. Um, like every time they
make an announcement on how much they're
spending, it goes up the next time. And
one of the best examples of the value coming from this spend: Microsoft, in one quarter, deployed a bunch of GPUs
and then announced that they weren't
going to make them available in Azure
because they made more money using them
themselves than renting them out. So,
there's real money in the market. And
the best way, I think, to explain
this market is like the early days of
oil drilling, a lot of dry holes and a
couple of gushers. I think the stat I heard was that 35 or 36 companies are responsible for 99% of the revenue, or at least the token spend, in AI right now. Yeah, it's very
lumpy. And so
>> I'm surprised it's not less when you look at... No, but I mean seriously, Nvidia
really, you know, having concentration
of revenue with two clients so heavily.
>> Yeah. And maybe Nvidia represents 98% of
that.
But um when it's that lumpy, what that's
an indication of is it's like the early
days of the oil drilling where people
didn't know how to find oil. They were
going off of instinct. You know, almost vibe investing. And people
who had a good instinct would make a
fortune and everyone else would lose
their shirts over time. It becomes a
science. It becomes very predictable and
there's less lumpiness, there's more predictability, but investors make less money at that point. The good investors make less money. So right
now is the best time for investors.
Right now people are making more money
than they're spending. It's just very
lumpy.
>> I'm sorry. They're making more money
than they're spending. But as an
aggregate, plenty of people are going to
lose their shirts, but overall less
money is going to go in than is going to
come out.
>> But when we look at the capex spend
today by the big providers, everyone is
going, "Okay, okay, okay." Because
there's something coming at the end of
it.
>> Yeah.
>> And the trouble is the capex spend is
going up and up and up.
>> Okay. So you're thinking of it purely
financially and I think that the
financial returns will be positive, but
that's not why people are motivated. So
I was in Abu Dhabi at the inaugural Goldman Sachs Abu Dhabi event. And, as you now know, we're sponsoring McLaren, so Zak Brown was talking, I was talking, and it was a fun event. But I was asked a similar question, like, is AI a bubble? And I asked everyone the following question. This is a room of 50-plus people who each manage $10 billion plus in AUM. I'm like, who
here is 100% convinced that in 10 years
AI won't be able to do your job? No
hands went up. I'm like, great. That's
how the hyperscalers feel. So, of
course, they're going to be spending
like drunken sailors because the
alternative is that they're completely
locked out of their business. So, it's not a purely economic framework that they're using. It's: do we get to maintain our leadership? Now, when you look at the next step, there are these, you know, scale-law sort of outcomes. You want to remain in the top 10. We keep talking about the Mag 7; if you're not a member of the Mag 7, you're not going to be able to get anywhere near the valuation. So what do you do to stay there? You spend. And it's worth it because the stock value stays up, because you're in the top seven or 10.
>> At some point the returns have to be delivered, though. The spend has to materialize into actual tangible revenue. And if it doesn't, whether you're in the Mag 7 or not doesn't matter.
>> That's correct. But right now AI is returning massive value already. It's very lumpy in the applications, but it's returning massive amounts of value. Let me talk about an
example that actually happened for us.
So I've tried a little bit of vibe coding. I'm not the best in the world at it; we've got some interns who are amazing at it. We had this customer visit us, and I had a meeting with them, and they asked for a feature. I specced it out at a very high level, vibey, so I was prompt-engineering the engineers. Four hours later it was in production. Not a single line of code was written by a human being. There was no debugging done by a human being. It was all prompting. I think we even have Slack integration now, where you commit things through Slack. So all that was done, and four hours later, it's in production.
Think about the value there. But now
imagine, fast forward 6 months from now
when that could happen before the
customer meeting's over.
It's a qualitative difference. It's not
even just a dollar amount difference.
Yes. You know, when you're able to do it that fast, you spend less to get the feature into production. That's real ROI. However, qualitatively, when you
can do that before the customer meeting
is over, you're going to be able to win
deals that your competitors won't.
>> Can I ask you just going back to the Mag
7 to stay in the Mag 7? Do you think
everyone realizes that they will need to
move into the chip layer and own the
full vertical end to end?
>> I don't think you're going to see too many successfully moving into the chip layer. People look at the TPU as a big success, and what they don't realize is that there were about three chip efforts at Google at the same time, and only one of them ended up outperforming GPUs.
And when you look around the industry,
you've got a bunch of people building
chips. Some of them are getting
cancelled like Dojo recently got
cancelled. Building chips is hard. Going off and saying, "I'm going to build my own AI chip to compete with Nvidia" is a little bit like saying, you know, Google search is pretty nice, let's go replicate it. It's insane. The level of optimization, the level of design and engineering that goes into that, you're not going to be able to replicate it with a high probability of success. However, if
there's a bunch of players out there
trying to do it and you have optionality
and one of them succeeds, then you have
another chip. We mentioned earlier that you have to spend if you want to stay in the Mag 7. Mhm.
>> Nvidia investing $100 billion into
OpenAI for OpenAI just to go and buy
back Nvidia chips.
>> Is this not just an infinite money loop?
>> That would be the case if they weren't spending it with suppliers to build those chips. It's not round-tripping if actual productive outcomes are occurring. So think of it this way. What percentage of the spend is going to building that infrastructure? 40%. So at least 40% of those dollars
are actually going out into the
ecosystem. So that is not an infinite
loop.
>> Okay. So it's a partial loop. 60% is going back to Nvidia.
>> Sure.
>> And then they get a bump in their stock
price of a couple hundred billion
dollars.
>> Yes.
>> How did you analyze that?
>> So let's analyze it in a couple of different ways. From an economic point of view, it makes perfect sense. Why not do that all day long? The value occurs if there is lock-in. When revenue increases result in stock price increases that are greater than the amount of the revenue, it's because you believe that revenue is going to continue. That's the belief, and I would actually say with Nvidia that's probably true. However, it's not just because Nvidia is good, and Nvidia is very good. It's also because there isn't enough compute in the world. There isn't. The demand for compute is insatiable.
I would wager that if OpenAI were given twice the inference compute they have today, if Anthropic were given twice the inference compute they have today, then within one month from now their revenue would almost double.
>> I'm sorry. Can you unpack that for me?
They are compute limited and it and it
comes How would their revenue double if
they had double the compute?
>> Right now, one of the biggest complaints about Anthropic is the rate limits. People can't get enough tokens from
them. And if they had more compute, they
could produce more tokens and they could
charge more money.
And with OpenAI, it's a chat service.
So, how do you regulate your chat
service? You run it slower. You get less
engagement.
>> How important is speed, do
you think? There's a lot of people who
think actually it's fine. I'm very happy
to have latency and I'm very happy to
have a prompt and then I go away do
something else and something happens
when I'm away.
>> Those are interesting opinions. Let's look at CPG, consumer packaged goods. I want you to rank CPG goods by margin. At the very top is smoking tobacco. Below that is chewing tobacco. Below that is soft drinks. Below that, you keep going down, you get to water and other things like that. What is the number one thing that a high margin correlates to in CPG? It's the speed at which the ingredient acts on you. That dopamine cycle, how quickly something occurs, determines your brand affinity.
And so when something has a very quick response, you associate it with that brand, and then you accrue brand value. This was the entire basis of Google focusing on speed, Facebook focusing on speed. Every 100 milliseconds of speedup results in roughly an 8% improvement in conversion rate. So that assessment of the future is wrong, where people think, "Oh, it's fine, we'll actually just have lots of prompts going on in the background and we'll be happy to let them run for long periods of time."
>> 100% wrong. In fact, when we first
started um working on getting speed on
our chips, we knew what speed we could
get. We even made a video example of how
fast we could be. And people would look
at that video example and they would
say, "Why does it need to be faster than
you can read?" And I would respond to
that by saying, "Well, why does a web
page need to load faster than you can
read?" And there's just this mental disconnect where people couldn't grok the sort of visceral importance of speed. People are
very bad at determining what's actually
going to matter in terms of engagement,
in terms of outcome, but we know this
from building the early internet
companies.
>> Do you think OpenAI will be able to move
into the chip layer? At some point,
Nvidia must be concerned that OpenAI will want to verticalize and own the chip layer as well. Do you think they will be able to make that transition successfully?
>> I think one of the
problems in building your own chip is that, first of all, everyone thinks building the chip is the hard part. Then as you do it, you start to realize building the software is the hard part. And then as you do it, you realize keeping up with where everything is going starts to become the hard part. I have no doubt that OpenAI will be able to build its own chips. I have no doubt that eventually Anthropic will be building their own chips, that every hyperscaler will build their own chip.
I had this experience when I was at Google, where I got a lab tour. And this was before
AMD was doing a great job, right? AMD
was struggling for a little while and
now they're doing great. But um they had
built 10,000 servers and those 10,000
servers of AMD chips, I was walking
through the lab and they were pulling
the servers out of the racks, taking the
AMD chip, popping it off, and throwing
it in a trash can. And the funny thing
was it was almost pre-ordained because
everyone knew that in that generation
Intel was going to win. So why did
Google build 10,000 servers? Because they wanted to get a discount on the Intel chips they bought.
And when you're at that scale, the cost to design your own server (they had to design their own motherboard to fit the AMD chip) and to build that out and test it, versus the discount that you get: totally worth it.
So, you have to think of what all the
motivations are when people are building
their own chips. It's not just because
they're going to deploy that chip in
mass production.
The thing is, Nvidia effectively has a monopsony on HBM. A monopsony is the opposite of a monopoly: a single buyer rather than a single seller. So when you're a single buyer and there's a finite amount of HBM capacity, which is the high-bandwidth memory that goes into the GPUs... the GPU itself is made using the same process that's used to build the chip that's in your mobile phone. If
Nvidia wanted to, they could build 50
million of those GPU die per year, but
they're going to build about 5.5 million
GPUs this year. And the reason is
because of that HBM, because of the
interposer that it goes on. And uh
there's just a finite capacity. So what
happens is a hyperscaler comes in and
says, I want a million GPUs. And Nvidia
is like, sorry, I've got other
customers. And the hyperscaler says, no
problem. I'm going to build them myself.
And then all of a sudden those GPUs are found by Nvidia to give to the hyperscaler. There is just a finite amount of capacity. By building your own chip, what you really get isn't your own chip. It's that you get control over your own destiny. That's the unique selling point of building your own chip.
>> What does that mean, control over your own destiny?
>> Nvidia can't tell you what your GPU allocation is. It may cost you more to deploy your own chip because it's not going to be quite as good as Nvidia's. Let's think about why Nvidia's GPUs, with a slight edge over AMD's GPUs, dominate. If your total cost to deploy is a huge multiple of the cost of the chips and the systems, then a small percentage increase in the cost of the chip is negligible. So, think about it this way. If I'm going to deploy a CPU, and that CPU is 20% of the BOM, the bill of materials, and I get a 20% increase in the speed of the chip, that's a 20% value increase in the entire system, versus a 20% increase in the chip cost, which is only a few percent of the total system cost. It's negligible. So, you get
these huge multiples when you improve the chip performance. So small differences in performance make a huge difference in the value of the product, and a small edge gives you a massive edge in selling that product.
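His arithmetic here can be made concrete in a few lines. The 20% figures come from his example; treating the chip premium as flowing straight through to system cost is an illustrative simplification:

```python
# Worked version of the "small chip edge, big system value" argument.
# Assumption (from the example in the conversation): the chip is 20% of
# the system's bill of materials (BOM), and a faster chip speeds up the
# whole system proportionally.

chip_share_of_bom = 0.20   # chip cost as a fraction of total system cost
chip_speedup = 0.20        # the chip gets 20% faster
chip_price_premium = 0.20  # and costs 20% more

# Value: the performance gain applies to the ENTIRE system.
system_value_gain = chip_speedup                               # 20%

# Cost: the price premium applies only to the chip's share of the BOM.
system_cost_increase = chip_price_premium * chip_share_of_bom  # 4%

print(f"system value gain:    {system_value_gain:.0%}")
print(f"system cost increase: {system_cost_increase:.0%}")
```

A 20% performance edge costs only about 4% more at the system level, which is why a small chip advantage translates into a large advantage in selling the product.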
>> Can I ask you, you mentioned the monopsony?
>> Yes.
>> Is it possible for OpenAI, Anthropic, any of the Mag 7, any of the other providers to move into the chip layer if there is a monopsony on the HBM market?
>> It's very hard. However, there is an incentive for those building HBM to spread that around, because Nvidia gets to negotiate very good rates as such a large buyer. However, if you're building an HBM fab and packaging house and all the other parts of the ecosystem, and Nvidia comes in and writes a big check, then you're going to build the fab for them. So, Nvidia is always going to get the amount of supply that they want in advance. The problem is you have to write that check more than two years in advance. And with where AI's gone, just absolutely hockey-sticking, even when you have the cash flow of Nvidia, it's hard to actually write the checks in advance for the amount of demand there's going to be. So there is going to be a supply constraint, and it's not purely based on being a monopsony. Part of it is based on just the sheer capital costs, and the memory suppliers are very conservative. There's also this situation where the margin on HBM is so high that no one wants to actually increase the supply, because then the margin goes down.
>> I totally understand that. Can I ask you, when you look at OpenAI, when you look at Anthropic having their own chips, is that why they're raising the money they are? Sam said they're going to need hundreds of billions of dollars. Is that factoring that in?
>> No. Most of the spend... so, buying a system is expensive. Buying a data center is more expensive. The reason is you're amortizing that data center over a longer period of time. So even if a data center was going to be one-third of your cost per year, if you're amortizing that data center over 10 years and the chips over 3 to 5 years, the data center is going to end up costing you more per year. So when you hear the hyperscalers talking about that 75-billion to 100-billion-dollar-a-year investment, because they're building out the capacity for data centers, they're putting a lot of money up for returns that they're expecting over the next 10-plus years. So it's actually not that much money when you think about it.
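The amortization arithmetic can be sketched with made-up numbers. The 10-year and 3-to-5-year horizons come from the conversation; the capex figures below are purely illustrative:

```python
# Per-year cost = upfront capex spread over the asset's useful life.
# Horizons are from the conversation; dollar figures are hypothetical.

datacenter_capex = 15e9     # building, power, cooling (illustrative)
chip_capex = 8e9            # accelerators for that site (illustrative)

datacenter_life_years = 10  # data centers amortize over ~10 years
chip_life_years = 4         # chips over 3 to 5 years; midpoint used here

dc_per_year = datacenter_capex / datacenter_life_years  # $1.5B/year
chips_per_year = chip_capex / chip_life_years           # $2.0B/year

print(f"data center: ${dc_per_year / 1e9:.1f}B/yr")
print(f"chips:       ${chips_per_year / 1e9:.1f}B/yr")
```

In this sketch the data center is nearly twice as expensive upfront yet cheaper per year, because the check is spread over a much longer return horizon, which is the sense in which a $75-100B-a-year build-out is smaller than it sounds.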
>> Are we thinking about amortization in the right way, in a 3-to-5-year cycle, if chip cycles are actually faster than that?
>> I think that, with amortization, people are definitely thinking about it over a longer period than I would. We use a more conservative number internally. I think five to six years
>> which would be like three years?
>> A little bit less.
>> Yeah.
>> We're looking at upgrading chips about once a year.
>> Yeah. Now here's the way to think about it. There are two phases to the value of a chip: there's "am I willing to buy it and deploy it," and there's "am I willing to keep it running." They're two very different calculations. When you deploy it, you have to be able to cover the capex. When you keep it running, you just have to beat the opex. So if I deploy a chip today, I have to beat the capex: I have to earn all my capex back, make a profit, and produce a return. Once I've deployed it, as long as I'm beating my operational costs, I'm going to keep that thing in production. So you're okay with the value of that chip going down over time. Now, the bet that everyone is making is that the new chips that come out aren't going to reduce the value of the old chips below the opex.
>> That's it.
>> That's right. And in our case, we
actually don't think that 5 years makes
any sense
>> because they will be so much less
performant that actually the value will
be lower than the operating cost
>> for the electricity and for paying for
the data center.
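The two-phase calculation described above can be sketched as follows. The structure is the point; the function names and numbers are illustrative, not anyone's actual model:

```python
# Phase 1: deploy only if lifetime revenue covers capex plus opex and
# produces a return. Phase 2: once deployed, capex is sunk; keep the
# chip running as long as it beats its operating cost.

def worth_deploying(lifetime_revenue, capex, lifetime_opex):
    """Pre-purchase test: must earn back the capex and make a profit."""
    return lifetime_revenue > capex + lifetime_opex

def worth_keeping(revenue_per_year, opex_per_year):
    """Post-purchase test: only power and hosting costs matter now."""
    return revenue_per_year > opex_per_year

# An aging chip (illustrative numbers): you would never buy it today,
# but it still earns more than its electricity and data center cost,
# so it stays in production until newer chips push its revenue below opex.
print(worth_deploying(lifetime_revenue=4.0, capex=5.0, lifetime_opex=2.0))  # False
print(worth_keeping(revenue_per_year=1.5, opex_per_year=1.0))               # True
```

This is the sense in which old chips can be unbuyable yet still profitable to run, and why long contracts add a third test: whether breaking the contract is cheaper than running at a loss.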
>> So what happens then? We just have this
excess supply of wasted chips which are
going
>> Because a lot of these people have entered into really long contracts, there's a third point they have to consider in their calculation: is breaking this contract cheaper than running the chip at a loss?
>> Yeah.
>> Can you see this? So what happens then?
>> I can't tell you what happens, because we're trying to avoid that situation by having a much faster payback period in all of our calculations. I would not want to make a bet that long out. The shorter the time frame of the bet, the clearer your outcome is.
>> So, essentially, you want to minimize
payback period as much as possible and
then minimize operating cost so that you can shed less-performant chips faster.
>> Yes. But here's another crazy part. When you look at the math this way, if I'm approaching it as an accountant, I'm going to say this is a terrible idea. But if I look at it empirically, people are still renting H100s. How old are those chips? They're getting close to 5 years old, and they're still earning more than their operating cost by quite a bit. You would never deploy an H100 today, but they're still profitable to run, right? They're in that second phase. And the reason is people can't get enough compute.
If that wasn't the case, H100s would be
renting for a fraction of what they're
renting for today. And as long as you
can't get enough compute, that's going
to be true. The question is, is there an
alternative out there that isn't a
supply constraint? And so, this is where
we're hoping to come in. So let's talk about our value proposition. You started off asking me about speed.
Do you know how many customers come to
us asking for speed?
>> No.
>> 100%. Do you know how many customers
keep asking about that once they realize the supply constraint out there? None. So they start with speed, because they know the value of that to their end customer, and then they're like, "Oh,
wait a second. I can't even get enough
compute."
The real value prop is: can you provide more compute capacity? So two weeks ago
we had a customer come to us and ask for
5x our total capacity. They couldn't get
that capacity from any hyperscaler. They
couldn't get it from anyone else. We
couldn't give it to them. No one can.
And so we couldn't get that customer.
The hyperscalers couldn't get that
customer. There isn't enough compute.
So when you're in a market like that, your choice is: I buy this compute and I get the customer. This is where I was going when I said that if OpenAI or Anthropic were to double their compute, they would double their revenue. Right? So if
you're someone who can't get enough
compute to serve your customer then
you're going to be willing to pay
whatever it takes to get those customers, because you feel there's lock-in value in getting that customer now.
And so the number one value prop that we
have is that our supply chain is not
like a GPU supply chain. You have to write a check two years in advance to get GPUs. For us, you write us a check for a million LPUs, and the first of those LPUs starts showing up 6 months later.
>> Wow. So you've got an 18-month chasm difference.
>> That's right.
>> Wow.
>> So I had a meeting with the head of
infrastructure of one of the
hyperscalers and I talked about speed. I
talked about cost and all this stuff,
but when I talked about the supply chain
and and how we could do something in 6
months, he just stopped the conversation
for a moment and wanted to dig into
that. That was the only thing he cared
about. Think about it this way. If you
have a
>> given the speed of progression of the
landscape of models,
does two years make sense?
>> Well, do you know Sarah Hooker?
>> No.
>> She wrote this paper, "The Hardware Lottery." My TL;DR on it is that people are designing the models for the hardware. So there are architectures
that could be better than attention.
However, attention works really well on
GPUs. So if you are the incumbent, you
have an advantage because people are
designing their models for your
hardware.
It doesn't even matter if there's a
better architecture out there. It's not
going to run well, so effectively it's not a better architecture. There's a little bit of a loop there. So, if you are building
two years out and you're the incumbent,
that's okay. But if you're trying to
enter the market, no one's going to
design for for your chips two years out.
So, you have to have a faster loop.
>> When you see
everyone moving into the chip layer, as
you said, OpenAI will have their own,
anthropic will have their own. What does
Nvidia do in that world?
>> Nvidia still keeps selling chips, because...
>> To who, though, given the concentration of their buyers?
>> So no one is successfully predicting how fast AI... Okay, we started off talking about: is AI a bubble? For the last 10 years of data center infrastructure, you're planning that out two, three, four, five years in advance, right? And what happens is
everyone's predictions are wrong. They
end up building too little. This is just what's happened for the last 10 years. So if you don't build
enough for 10 years, what do you do? You
try and overbuild. You try and build
more than your most optimistic
projections. And then once again, you
haven't built enough. So you increase
your projections and you just keep doing
this. Um that's what's been happening.
And yet people still aren't building
enough compute. And here's where people's instincts are off, and I don't think this has been recognized yet: AI doesn't work the way SaaS does. In SaaS, you have a bunch of engineers who go out and build a product, and the quality of that product is determined by what those engineers did. That's not the case in AI. In AI, I can improve the quality of my product by running two instances of the prompt and then picking the better answer. I can actually spend more to make my product better on each query. I can even decide this customer is more valuable and I'm going to give them a better result. That's kind of what OpenAI announced, and they did this this week, when they said: we're now going to release some products where we can't really afford the compute, so we're going to give them to a limited set of users and charge more, because we want to see what happens when we give more compute to the AI. We want to see what that product looks like and how much better it is. And that is going to be our future. Every time you give more compute to an application, the quality increases. And this is why it's not coincidental that you see people's tokens-as-a-service bill almost matching their revenue: they're competing for customers, and if they just spend more, their product gets better.
>> Totally
understand that. But bluntly, the assumption, when you look at GPT-5 and the focus on efficiency, is that Sam transitioned from performance to efficiency because compute does not yield a parallel level of performance improvement. Do you think that is fair and true, and does that not go against what you just said?
>> No. And you have to think of the different outcomes they're looking for. If you are OpenAI, you have moved into markets that are incredibly cost-sensitive. Let's talk about India for a second. If you want to go win India, what's the one thing you need? 99 rupees a month. That's about $1.13 at current conversion rates. You need to charge your customer $1.13 for your product. So they're going after a market whose alternative is: I have no AI.
>> You've got open models. I mean, they can use DeepSeek.
>> This is another misconception in the
market. Let's just start busting every
misconception.
>> Sure. Great.
So when the Chinese models came out, everyone reacted by saying, "Oh my god, they've trained models that are almost as good as the US models." And we had a podcast on this, right? Even I was snookered a little bit at first: oh my gosh, aren't these models so much cheaper to run? Now that I know more about the foundation models people are using versus the Chinese models, no, they're not cheaper to run. They're about 10x as expensive. Actually, let's just take the
GPT-OSS model that was released. It's optimized for something different than the Chinese models, but the quality is very high, and I would argue it's clearly a better model for what it focuses on. Now, the Chinese models focus on different things. However, the cost to run the OSS model is about one-tenth that of the Chinese models. So why was everyone charging less? Well, when you have a sort of captive market for a model, because people say, "I want this model and there's only one provider of it," you can charge 10 times as much. The price was higher, and people were confusing the cost with the price. So the Chinese models were optimized to be cheaper to train, as opposed to cheaper to run.
And when you see how much intelligence has been squeezed into the OSS model versus the equivalent Chinese models, it's clear that the US still has a training advantage. And the economics work out such that you have to amortize that training cost over every inference, which means you want to charge more. So there's still a balance there. But as you scale out to larger and larger numbers of people, being able to afford to train a model starts to pay off. As you deploy more inference capacity, you want to spend a bit more on the training to get your inference cost down. In the US, we have a massive compute advantage, so people train the models harder, bringing the cost down.
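The train-versus-inference economics he describes can be sketched as a toy cost model. All of the dollar figures, volumes, and the 10x ratios below are illustrative assumptions for the sketch, not numbers from the conversation:

```python
# Toy model of amortizing training cost over inference volume.
# Illustrative assumption: spending 10x more on training buys a model
# that is 10x cheaper to serve per request.

def cost_per_request(training_cost, serving_cost, num_requests):
    """Amortized training cost plus marginal serving cost, per request."""
    return training_cost / num_requests + serving_cost

def cheap_to_train(n):    # cheaper training, pricier inference
    return cost_per_request(1e6, 0.010, n)

def trained_harder(n):    # 10x the training spend, 1/10 the serving cost
    return cost_per_request(1e7, 0.001, n)

for n in (1e7, 1e8, 1e10):
    winner = "train harder" if trained_harder(n) < cheap_to_train(n) else "train cheaply"
    print(f"{n:.0e} requests -> {winner}")
```

At low request volume the cheap-to-train model wins; at high volume (past roughly a billion requests in this toy setup) the harder-trained model is cheaper overall, which is the "spend more on training to get your inference cost down" tradeoff he describes.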
>> Why do we have a compute advantage in the US? Just in terms of access to chips?
>> That's correct.
>> Yeah.
>> But will China not just subsidize the inference and the running, though?
>> Yes.
>> So does it matter if their cost of running is higher, when the CCP will just subsidize it? Does it matter?
>> There's a home game and there's an away game. The home game is: we want to build enough compute for the United States. The away game is: we want to build it for our allies, right? Europe, South Korea, Japan, India and so on. China can win their own home game. They're going to build 150 nuclear reactors, so they're going to have enough energy even though their chips aren't as energy efficient. And they can subsidize, as you mentioned. But the away game is different.
If a country only has 100 megawatts of
power,
what are they going to do? Build another
nuclear power plant? Like that's just
not a realistic thing. You can do that
in China. You can't do that elsewhere.
So having a better chip gives you an
advantage in the away game. So my
expectation is that right now for the
next 2 to 3 years, the United States has
a clear advantage in that away game over
China. And if we move very quickly, then
we're going to be able to bring a bunch
of allies into the AI race.
>> Do you think we should have open models
to allow for China to distill in the
effective ways that they have done
already?
>> I think the model itself is not a clear advantage. The first time you had me on your podcast, I predicted that OpenAI was about to open-source their model.
>> You remember that?
>> Yeah. And my prediction was based on their branding strength. Frankly, OpenAI could probably use Llama 2, the old model from two years ago, and people would probably still use it. So there's a brand advantage there. Now, they do have very good models, but they don't necessarily need them because of that brand advantage.
I think that Anthropic should be open-sourcing their previous generation in order to get people using their models instead of the Chinese models, because if someone is willing to use a Chinese model, then they would at least be using the Anthropic model, and their prompts would be recyclable. Just like you have software compatibility, you have prompt compatibility.
For example, when the OpenAI OSS model was released, one of the main reasons people started adopting it over the Chinese models was that they could reuse their prompts.
>> Now, of course, when someone has a low-cost application and they can't afford the premium for, you know, OpenAI, they want to use one of these open-source models. Eventually, they start doing really well. They make more money. They start wanting access to the premium model. Their prompts are reusable. So there's a win in open-sourcing these models, and you're also getting all of these infrastructure providers to drive the cost down on that model as well. There's a lot of innovation that goes into that.
>> Totally get that. There are so many different areas I want to take this, but you said to just build as much compute as possible. The energy requirements are intense. Is the only way to provide the energy required for this compute wave, tsunami, whatever you want to call it, nuclear?
>> No, no. Nuclear is efficient and cost-effective, but renewables are efficient and cost-effective too. I'll give you my simple hack. All the allies of the United States have to do in order to have more energy than China is be willing to locate their compute where energy is cheap.
So right now,
okay, let's compare Europe to the United
States. The United States is incredibly
risk averse compared to Europe.
>> Wow.
>> Yeah.
>> In energy?
>> No. No. In everything. But you have to ask what kind of risk. There are two kinds of risk. There are mistakes of commission, where you do something and it's a mistake. And there are mistakes of omission, where you don't do something and it's a mistake. The United States is terrified of making mistakes of omission. When you are in a massive growth economy, missing out is more expensive than fumbling something. And Europe is incredibly willing to embrace the risk of omission.
So the way that Europe is trying to
compete is through legislation by saying
things like I want to keep this data in
Europe or I want to keep this data in
this country.
If Europe wanted to compete in AI, all
you'd need to do is say Norway,
please deploy an enormous number of wind
turbines.
Why? Norway has about an 80% utilization rate for wind: roughly 80% of the time, you can be generating energy. And they have enough hydro that if you deployed 5x the wind power of the hydro, Norway itself could provide as much energy as the United States, and could do it consistently.
The entire United States, that's one
country in Europe.
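His back-of-envelope can be restated as a toy firming model: wind supplies bulk energy when it blows, and dispatchable hydro covers the lulls. The 80% availability and the 5x wind-to-hydro ratio come from his remarks; treating wind as all-or-nothing at nameplate capacity is a simplifying assumption of this sketch:

```python
# Toy model: intermittent wind firmed by dispatchable hydro.
# Power is normalized so that hydro capacity = 1 unit.
HYDRO_CAPACITY = 1.0     # dispatchable, fills in when the wind is down
WIND_CAPACITY = 5.0      # "5x the wind power of the hydro" (from his remarks)
WIND_AVAILABILITY = 0.8  # "about an 80% utilization rate" (from his remarks)

# Average delivered power: wind at nameplate 80% of the time,
# hydro at full capacity for the remaining 20%.
avg_power = (WIND_AVAILABILITY * WIND_CAPACITY
             + (1 - WIND_AVAILABILITY) * HYDRO_CAPACITY)
print(avg_power)  # 4.2 units: about 4x the energy of hydro alone
```

The consistency claim comes from hydro acting as the backstop during wind lulls; the bulk-energy claim comes from the 5x wind build-out.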
How much other energy is there out there
that could be unlocked that isn't
nuclear? And by the way, let's also
deploy nuclear. Nuclear is incredibly
safe these days.
Why do we not then?
>> Fear.
>> Is that really it?
>> Yeah.
>> When you speak to European governments, what do they say to you?
>> I don't bring up nuclear, because I'm not going to push an energy source that everyone's going to push back on. But when I was in Japan recently, they were talking about bringing their nuclear reactors back online. Japan has a reputation for being very slow.
There's a lack of subtlety and nuance in that perception. The reality is Japan is slow to make a decision, but when they decide something, they move really fast. Let's take an example: Japan decided to build a 2-nanometer fab. When I was there last, they were showing off these 2-nanometer wafers they had produced. Now, the yield's not where it needs to be; this is not production grade. But they built a 2-nanometer fab, they are producing wafers out of it, and they're going to start getting that defect density down. They're going to move quickly. They've allocated $65 billion for AI, and they're going to spend it, and spend it quickly. They're going to turn their nuclear reactors back on. When Japan turns their nuclear reactors back on, Europe needs to listen to that and go: gosh, we need to catch up in energy.
>> Catch up is exactly what I was thinking, because what I'm thinking about is the speed it takes to build out. You said Norway has this latent capacity of wind that we could utilize. Dude, it takes years to build a huge supply of turbines.
>> Does it?
>> Yeah. Why? You think the Norwegian government is going to shell out and put 10,000 wind turbines in the ground?
>> Why does the Norwegian government need to pay for it?
>> Who should?
>> How about the hyperscalers? How about other governments that want to locate there? In Saudi Arabia, there are gigawatts of power, and they're building out data centers for it. Saudi Arabia wants to do a program of data embassies, where you have sovereign oversight over your data but you get to use their energy. Why not use that? Problem solved.
They're going to build out 3 to 4 gigawatts in the very near future.
>> So the hyperscalers would pay Norway to use their renewable energy sources and then leverage that?
>> The complaint the hyperscalers have is all of the paperwork and the slowness. I was talking to someone on the board of a major energy company that builds nuclear power plants. He said they spend three times as much on permitting in the United States as on the nuclear power plant itself. And I don't know about Europe, but typically the United States is better than Europe on this. How much does it cost to build a nuclear power plant in Europe: the actual infrastructure versus the permitting? Here's what everyone needs to walk away from this with.
The countries that control compute will
control AI
and you cannot have compute without
energy.
>> How far behind is Europe? Is there a way for us to get back? Is it too late? I don't want to be negative, and I'm not overly pessimistic, but is there a chasm, or can we catch up?
>> I don't think there's a problem right now, if Europe acts now. I mean, China is ahead in action, but there are 500 million people in Europe and over 300 million in the US. And you could start bringing all the allies together. South Korea, who by the way knows how to build nuclear power plants (the power plant in the UAE was built by South Korea), could build power plants here. France knows how to build power plants. How about a little bit of a Manhattan Project for building enough energy? When I'm walking around in Europe in the summer, it's incredibly hot. And when I'm walking around in the winter, it's incredibly cold. That is not an experience you have anywhere else in the world. Build more energy.
>> I'm with you, Jonathan, but I'm also realistic. I know how slow we are as governments, both individually and in collaboration. It's not going to happen at the speed at which this needs to be done. What happens if it doesn't?
>> Then Europe's economy is going to be a tourist economy. People are going to come here to see the quaint old buildings, and that's going to be it. You cannot compete in a new economy if you don't have the resources that the new economy is built on. And the new economy is going to be AI, and it's going to be built on compute.
>> Is model sovereignty enough to win? If you look at a provider...
>> Because if you don't have compute, you can't run the AI. It doesn't matter how good your model is. You could have a model that is 10 times smarter than OpenAI's, but if OpenAI has 10 times the compute, their model is going to be better.
>> So for a Mistral, who says, "Hey, we're going to have sovereignty within Europe, and the German healthcare system and the Croatian transport ministry are going to use Mistral because we're a European alternative," that's not a reason to win?
>> What's the USP? What's the unique selling point?
>> It's a European model, and it doesn't have ownership in the US under a Trump administration.
>> What does that have to do with giving you enough compute? What you're solving for there is removing someone else's ability to control you.
>> Yeah.
>> But what you're not solving for is having enough of it. And by the way, I'm not saying don't use Mistral. We have a partnership with Mistral. We love Mistral. The thing I'm saying is: build enough compute so that Mistral can compete.
>> If you listen to this, are you not just like, "Shit, I should just buy the out of CoreWeave"? Seriously, when you look at what they provide on demand...
>> Yeah, CoreWeave is a great company, but they have a finite allocation of GPUs. Everyone has a finite allocation.
>> When we chatted before, you said to me
that GPUs are not the best
infrastructure for inference.
>> Correct.
and that we are moving more and more
into a world of inference as we move
further along the maturation cycle of
training models.
>> Yes.
>> Does that not mean that Nvidia's power
hold weakens further?
>> No. Nvidia is going to sell every single GPU that they build. And even if we end up supplying 10 times as many LPUs as GPUs, all that's going to do is increase the demand for GPUs and allow them to charge an even higher margin.
>> Why is that? Sorry.
>> Because, as mentioned before, the more inference you have, the more you need to train the model to optimize for the inference; and the more training you have, the more inference you want to deploy to amortize the cost of that training. There's a virtuous cycle between the two.
>> Is the inference market playing out as you expected it to, in terms of maturation and deployment speed?
>> What I never expected was that AI was going to be based on language. What that's done is make it trivial to interact with AI. I thought it was going to be more like AlphaGo: intelligent in some weird, esoteric way. The fact that it's language means anyone can use it. So I expected AI to come sooner and grow slower. It came later, and it's growing faster than I ever imagined. It is so easy to interact with AI that anyone can do it. 10% of the world's population is a ChatGPT weekly active user.
>> Isn't that astonishing?
>> Yes. But you know what's holding it back?
>> Compute.
>> Compute is holding back the quality. But more people would use it if more languages were supported; they just wouldn't get as much out of it.
>> Well, this is the number one complaint we hear around the world.
>> You know what would solve that? More compute, more data. If you have more data, then you can train more, but you need more compute. And by the way, if you have more compute, you can generate more synthetic data, so you can train more.
Every one of these: you have data, you have algorithms, and you have compute. If you improve any one of them, it's not a bottleneck. It's not like if the compute doesn't get better I can't use more data, or if the data doesn't get better I can't use more compute. Any one of these that gets better improves AI, and that makes it really easy to improve AI, because you can improve one dimension of it. It just turns out the easiest knob to turn is not the algorithms; algorithms rarely improve. It's not the data, because it's really hard to get more data, and we haven't fully figured out synthetic data generation. We're good at it, but we're not at the point yet where we can just directly turn compute into more data. We're getting there. Compute is the easiest knob, because it just keeps getting better and better every year. And if I write a check for enough money and I'm willing to wait a little while, I'm going to get more compute. It's the most predictable part of the pipeline.
>> Given it's the most predictable part of the pipeline...
>> And yet we still underestimate how much we need.
>> Do you think we are dramatically
underestimating how much we need today?
>> Yes. Yes.
>> By what scale?
>> Going back to what I said about how every time you add more compute, a product gets better: there is no limit to the amount of compute that we can use. It's different from the Industrial Revolution. In the Industrial Revolution, you couldn't use energy unless you had the machinery to use it, and you had to build machinery, and that took time. If I wanted more cars on the road, I had to build the cars; it wasn't enough to just pull more oil out of the ground. AI is not like that. Yes, if I make my model better, I can do more with the same amount of compute. But if I double my compute, I double the number of users and I improve the quality of the model. This is different. I can literally just add more compute to the economy, and the economy gets stronger. We've never had that before, where it wasn't a bottleneck. It was more of a rubberneck, where you could just force more of one component through and everything improves.
>> You said the
economy gets stronger. When we think about what that's predicated on, it's the $10 trillion labor spend in GDP shifting to AI and us taking a portion of that. Do you think we will see significant shifts in GDP, or in the spend on labor moving toward AI, in the next five years?
>> I believe that AI is going to cause massive labor shortages.
Yeah. I don't think we're going to have enough people to fill all the jobs that are going to be created. There are three things that are going to happen because of AI. The first is massive deflationary pressure. This cup of coffee is going to cost less. Your housing is going to cost less. Everything is going to cost less, which means people are going to need less money.
>> So how is it going to cost less to have a cup of coffee because of AI?
>> Because you're going to have robots farming the coffee more efficiently. You're going to have better supply chain management. You're going to be able to genetically engineer the coffee so that you get more of it per watt of sunlight. It's going to be across the entire supply chain, the entire spectrum. So you're going to have massive deflationary pressure. That's number one. And what that means is people will need to work less.
And that's going to lead you to number
two, which is people are going to opt
out of the economy more. They're going
to work fewer hours. They're going to
work fewer days a week, and they're
going to work fewer years. They're going
to retire earlier because they're going
to be able to support their lifestyle
working less.
And then number three is we're going to create new jobs and new industries that don't exist today. Think about 100 years ago: 98% of the workforce in the United States was in agriculture, and 2% did other things. When we were able to reduce that to 2% of the population working in agriculture, we found things for those other 98% of the population to do. The jobs that are going to exist 100 years from now, we can't even contemplate. 100 years ago, the idea of a software developer made no sense. 100 years from now, it's going to make no sense in a different way, because everyone's going to be vibe coding, right? And influencers: that wouldn't have made sense 100 years ago, but now that's a real job. People make millions of dollars off of it. So what jobs are going to exist in 100 years?
So, number one, deflationary pressure. Number two, opting out of the workforce because of that deflationary pressure. And number three, jobs and companies that couldn't exist today but are going to exist and are going to need labor. We're not going to have enough people.
>> It's fascinating the counternarrative,
isn't it? Everyone being like, "Ah,
millions and millions of people will be
unemployed." And you're like, "No, we're
actually not going to have enough people
for the jobs."
>> Well, what was the famous prognostication 100 years ago? That there was going to be massive famine because we weren't going to be able to feed ourselves. People always underestimate what's going to change in the economy when you improve technology.
>> When you think about the requirements
from an energy perspective and then also
what you just said there about kind of
labor, do you think Trump and a Trump
administration is doing more to help or
to hurt the advancement of AI in the US?
>> Definitely help. All of the moves that have been made are things that are going to help with AI. For example, the permitting issues, right? Overall, it's been a very positive experience on AI.
>> You mentioned vibe coding, and I do just have to ask about it. Do you think this is an enduring and sustainable market? When you look at a lot of the use cases today, they're quite transient. How do you analyze the future of the vibe coding market, having played with it a little bit, and having seen interns, as you said, who are very good at it use it internally?
>> Well, let's take reading and writing. Reading and writing used to be a career. If you were a scribe, you were one of the small percentage of people who knew how to read and write, and people would hire you just to record things. You did much better than the average person in the economy because it was a specialized skill. Coding has been the same thing: a very small percentage of the population did it, it took a couple of years to learn to do well, and some people were really good at it.
Now everyone reads, everyone writes. It's not a special skill; it's expected in every job. And coding is going to become the same thing. For you to be in marketing, you're going to have to be able to code. For you to be in customer service, you're going to have to be able to code. I was having dinner with someone who runs a chain of 25 coffee shops, has never coded in their life, and they vibe-coded a supply chain tool that allowed them to check inventory. They didn't write a single line of code, and they got it to work. And it was funny, because they discovered all the problems that we software engineers discover over time. They started getting feedback from their employees: this feature doesn't work, this thing doesn't work when I do this, all the little edge cases. And then he just started fixing them, all through vibe coding.
>> Do margins matter in a world of exponential growth? When we look at the demand for your products, when we look at the demand for a Lovable or a Replit, both bluntly have bad margins. Does it matter having bad margins when growth demands are so high?
>> I would say that, first of all, you do have to have profitability in the end, or at least break even, to be a going concern. At some point, you can't just keep raising money. Even Amazon had to start making some money.
But the real reason you need higher margins is volatility. If you have a razor-thin margin and the market moves, you may not be able to raise more money. You may not be able to get a loan. So what a margin does is give you stability and staying power in the market. On the other hand, it also gives competition the ability to enter, right? Your margin is my opportunity. So what you're trading is stability against a competitive moat. That's the decision you have to make.
>> How do you think about margin internally today?
>> I think you want the ability to have margin, and you want to give it to your customers and give them an advantage. And if you have the ability to take that margin when it's needed, then you're in a great position.
So, we hired this amazing CFO recently, but I remember talking to a previous candidate, and when we were talking about margin, they said we should price so that our supply met our demand. In other words, they wanted to increase the price in order for the demand to come down.
>> Makes sense.
>> Does it?
>> Economic sense, yeah. Logically and rationally, yes.
>> But then, logically, why not use up your brand equity? Why not use the trust that your customers have to sell them things that aren't good?
Brand value, brand equity, has value. You want to keep your brand equity as high as possible, because trust pays interest. And similarly, you want to keep your margins low enough that you're building up this sort of equity with your customers, where they know that you are giving them a good deal. When you charge a high margin, you are at odds with your customer, and you want to do everything you possibly can to align with your customer. I want my margin to be as low as I can possibly make it while keeping my business stable, and I'm going to make my cash flow by increasing the volume. One of the things that I love about the compute business is that the need for compute is insatiable. It's the Jevons paradox: if we produce 10x the compute, we will have 10x the sales. That's just the way it works. As long as we keep bringing the cost down, people are going to buy more. And so I want to keep bringing that cost down, keep increasing the volume, and keep selling more for less, so that people get more value out of their business and they buy more, and that cycle continues.
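The Jevons-paradox dynamic he invokes (cut the cost, and total spend goes up rather than down) holds whenever demand is price-elastic, i.e., elasticity greater than 1. A minimal sketch with an assumed constant-elasticity demand curve; the scale constant and the elasticity of 2 are purely illustrative:

```python
# Constant-elasticity demand: quantity = k * price ** (-elasticity).
# With elasticity > 1, lowering the price increases total revenue --
# the regime where cheaper compute means more total compute is bought.

def revenue(price, k=100.0, elasticity=2.0):
    quantity = k * price ** (-elasticity)  # demand rises as price falls
    return price * quantity

print(revenue(1.0))  # 100.0
print(revenue(0.5))  # 200.0 -- halve the price, double the total spend
```

If demand were inelastic (elasticity below 1), cutting cost would shrink total revenue; the "insatiable demand" claim is the assertion that AI compute sits firmly in the elastic regime.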
>> How far are we on the journey to bringing the cost down? You know, I look back at some of the shows, dude, and I cringe at myself, because I'm talking about, like, oh, Canva implementing AI and it's hurting their margins because it's going to cost them more. It's just such a naive question to even ask, because now the cost of implementation has gone down by 98%. How far are we in that cost reduction cycle?
>> Well, let's step back and use your Canva example.
>> Yeah.
>> Successful businesses don't watch the bottom line. They watch their customers. They solve problems that their customers have. If you are competing, you are doing it wrong. You want to differentiate. You want to solve a problem that your customer hasn't solved yet and can't solve any other way, and then they're happy to pay you money. That's how it works: you solve their problem, and then your cash flow is solved.
So if someone's spending on AI and you just look at the balance sheet, that doesn't make sense. But when the customer is very happy and they're solving a problem they couldn't solve otherwise, first of all, you're usually increasing the TAM with AI, because it makes the product so much easier to use. Could you use Photoshop two years ago? Impossible. Now, if you want to generate an image, you just explain what you want. That increases the TAM. You may be able to charge less per photo, but your total revenue increases. Your total market increases.
>> Forgive me for this financial question, but we see the S&P about to hit 7,000. We see this ripping of the Mag 7 like we haven't seen a concentration of value in many, many years. And people suddenly start to feel like, wow, it's getting toppy. Then I listen to you, I hear all of this, and I think it's just the start. How should I think about the duality of those two thoughts?
>> There are two components to value. One is the weighing machine, and one is the popularity contest. And there are some products that are pure popularity contest, like crypto. I have never bought a Bitcoin; you know, I missed out. Why? Because I can't play the popularity contest. I'm not good at it. I don't know what's going to be popular and what isn't. All I can do is see value.
When I look at AI, I see real value being delivered. Best example: PE firms are all over us. They want access to cheap AI compute, because every time they get more cheap AI compute, they can change the bottom line of their businesses. It has real value. When PE firms go after something and see value in it, it's not a popularity contest; it's pure value. So the reason companies get a large multiple is either that people see actual value is going to accrue, or they get hype-cycled on it and it's pure popularity contest. There are different participants in the market. Some of them are just playing the popularity contest. Others are looking at the value, and they may come to the same conclusion for different reasons. I'm coming at it from the value point of view, the weighing-machine point of view.
The most valuable thing in the economy is labor. And now we're going to be able to add more labor to the economy by producing more compute and better AI. That has never happened in the history of the economy before. What is that going to do?
>> Do you worry that if we have a speed bump in the short term, it will derail significant parts of the economy, given the concentration of value? Everyone rips today. But if Nvidia, Meta, Google, Microsoft suddenly hit speed bumps and the AI speed train slows down, the consequent multiplier effect is mega. Do you worry about that?
>> Yeah. And this is independent of the value of AI. This is the control-system theory of what's going on, right? A stock market can inherently be on an upward trajectory, but it can overheat, and that overheating causes it to run away. People bid things up. They realize they've made a mistake, and then it has to come back down, and then it dips below where it should be. Spending retreats, and then people don't have the funds they need to build their businesses. A lot of good businesses can die during one of these downward trends. But this is also where the best businesses are made. How many times do you see a downturn, and a ton of amazing businesses come out of it?
>> Do you think we will have a downturn in the next year?
>> I can't predict whether or not there'll be a downturn. The ability to predict something largely depends on whether or not the prediction affects the outcome. If a prediction affects the outcome, you cannot predict it, because whatever your prediction is changes the outcome. The only things that are predictable are things where the predictions don't change the outcome. If an asteroid is headed toward the Earth and we see it, and we don't have the technology to stop it, then it's going to happen. But if we can see it and predict it, then we might develop the technology to stop it. Do you see the problem?
>> I do.
>> And in the economy, you don't have to do anything other than move dollars around. So you have these very fast twitches in the economy based on people's ability to predict, which makes it unpredictable. I can't tell you what's going to happen in the economy. All I can tell you is that right now, the biggest problem I see in AI is that if you see a good engineer, one you would have hired before, they can go out and raise 10, 20, 100 million, a billion dollars. And then, rather than contributing to one of the other AI startups, they go create their own, which means you have difficulty getting a critical mass of talent in any one of these AI startups. On the other hand, AI is making everyone at one of these startups more productive.
So in terms of whether or not the economy is overheated, I think one of the best predictors is: is the economy getting in the way of the success of the companies? If it's not getting in the way, then I don't think it's overheated.
>> Do you not think it is getting in the way? Because fundamentally, the capital supply side is so large that we are actually preventing you from being able to get great engineering teams together, because we're funding talent to the extreme, where they can raise huge amounts of money rather than join Groq.
>> Yes. Please stop doing that. No, but AI is making people more productive. So it might be possible for the economy to keep ripping and for all of the companies to continue being very successful. We don't know. We've never been through this before.
>> Is the war for talent insane today?
>> It's definitely much more aggressive than it's ever been in history, but only in tech. When you look at sports, sports have always been insane, or at least have recently been insane. Look back 20 or 30 years in sports, and the salaries looked a lot like tech salaries.
>> Sure.
>> People are just realizing the value. The problem is, in sports, you have a limited number of teams, and you might even institute a salary cap and things like this. In technology, we're not doing that, and you have an unlimited number of teams, an unlimited number of startups, right? Just imagine if anyone could go create their own football team. What would that do to salaries? And what would that do to the value of the franchise?
>> Which incumbent are you most impressed
by and which are you most worried or
concerned for?
>> I would say Google has probably done the biggest turnaround, and they had a structural advantage in that. Google historically has depended more on their engineers to come up with good ideas, and as long as management gets out of the way, great things happen at Google. So I just think, from a cultural perspective, that's a systemic advantage for them.
>> you think Gemini has been a success for
them ultimately.
>> I do. I mean, you just look at the adoption numbers; it's been great.
>> How do you feel about the implementation into consumer products?
>> Less so. I mean, you see random Gemini introductions into each product. It's in Gmail, but it's practically unusable. It's in pretty much every product, and it seems thrown in, kind of half thought through. But you shouldn't judge that yet, because at least they're getting exposure to how people are using it, and they can use that to figure out what they should actually do. I mean, look at what happened with Chromecast, right? It was originally Google TV. It was a total flop, and then they iterated and turned it into Chromecast. This is the classic problem where someone puts something out there, everyone throws darts at it, and you don't realize that they're just willing to take those darts in order to build a better product.
>> And it's fine to take those darts as long as the window of distribution advantage remains. But what's challenging is it doesn't: OpenAI has closed that chasm so significantly.
>> That's true. Google may be too late.
>> Do you see what I mean? It's the classic question: can the incumbent attain innovation before the startup acquires distribution? And the startup's acquired distribution to 10% of the world. It's pretty impressive.
>> Yeah. At this point it would be hard to imagine a scenario where OpenAI goes away. I just don't see how that happens. And so at the very least you have two competitors from this point on going at it. But
>> Which is it, OpenAI and Anthropic, or OpenAI and Google?
>> OpenAI and Google. Anthropic does something different. Anthropic is doing coding, right? OpenAI is doing a chatbot. Google's doing a chatbot. Google's also doing coding. Google's doing everything.
>> Well, I mean, OpenAI is doing coding,
too.
>> Well, yes. And actually, our engineers recently started using Codex more than the Anthropic tools.
>> Wow.
>> Yeah. And it's funny because it's almost on a monthly basis. So, we have a philosophy: we don't tell our engineers what tools to use. We do tell them you must use AI, because otherwise you're just not going to be competitive. But we saw them using Sourcegraph. We saw them then using Anthropic. We saw them then using Codex. Next month it'll probably be Sourcegraph again. It just keeps going around and around in a circle.
>> Do any of these have enduring value, then, if the switching cost is so low and if they're just, bluntly, being used so promiscuously?
>> Our engineers are cutting-edge engineers who will switch to the best tool the moment it's the best tool. Not everyone is like that. A lot are like that, though.
>> A lot of the people you interact with are like that. Enterprises make these long-term deals and they stick with whatever deal they made a year ago.
>> Would you rather invest in OpenAI at 500
billion or Anthropic at 180?
>> I'd want to invest in both.
>> Would you?
>> Yeah. They're both undervalued.
Highly undervalued. You're still... okay, you're still looking at them as if they're competing in a finite market for a finite outcome, when they're actually increasing the value of the market with the more R&D that they do.
>> Play this out for me then. If we do the bull case for them, what does that look like?
>> I think the current tech companies can increase their value significantly, but I don't know why they couldn't increase their value significantly while the AI labs catch up to where those current technology leaders are. The Mag 7 is going to increase in value, and what's going to happen is the AI labs are going to achieve the same amount of value as the current Mag 7, but the Mag 7 is going to be more valuable. The question is, will the AI labs overtake the Mag 7?
>> What will determine that?
>> I don't know. Frankly, I think they're just going to become the Mag 9, the Mag 11, the Mag 20.
>> Do you think the AI labs move very
significantly into the application layer
and subsume the majority of it?
>> That is the natural tendency of a very successful tech company. They start to do what their customers do, they move up the stack, they subsume what their customers did, and then there are new people who build on top of them, right? And I think on your show Sam Altman said something about how, if you're just doing a small refinement on top of OpenAI, you're going to get overrun, or whatever. He was just being very honest; that's what they do. In our case, we found an area where we will not compete with our customers, which is we will not create our own models. So we just won't do it. And by putting that line in the sand, we're saying it's safe to build on our infrastructure, right? Because we're not going to go after what you do. And that may be the wrong call. We may find that we're subsumed by one of our customers. But it also means that you can trust that you can build on us. I could be making a huge mistake on that call.
>> You could be. You would also need a lot of cash to do that, to build your own models.
>> Yeah. And speaking of cash, how much did
you just raise?
>> So, we raised $750 million.
>> $750 million at, what was it, 6 billion?
>> Yeah, almost 7 billion.
>> Okay. Got you.
>> This sounds really unfair. And that's
amazing. Is that enough money?
>> It is. In fact, we were only going to raise 300 million. You brought up the question of profitability and all that. The hardware companies are in a good position because, unlike these other companies, we actually make money off of what we sell. So when we sell hardware, those hardware units actually have positive margin.
>> I thought you had negative margin.
>> When we sell hardware, no.
>> versus when you sell software.
>> When we sell software, it depends on the model. Our most popular models, on the chip that we're ramping up now, are positive margin. But we do have some models that we run that beat the opex, where we're not happy with the capex. Others would be happy with the capex, but we're more conservative. And so it's just easier to say that when we sell hardware we have positive margin, because you know it at that moment. We might have positive margin on even our least profitable models, because we just don't know how long the hardware is going to last.
>> Like what are the margins and where do
they go over time?
>> Well, one of the benefits of being
private is I don't have to tell you.
>> You don't. But it'd be lovely if you
did.
>> It's the only advantage of being
private.
>> No, no, no. There's many, many advantages. You don't have a lockup period. You can sell much more easily.
>> Yeah, but I don't sell shares. So,
>> you've never sold a share, have you?
Never.
>> No.
>> Yeah. You clearly don't understand how
this game works. Uh, don't worry. I will
teach you.
>> But margins over time, do they get significantly higher? How do you think about that? I'm not asking for specifics necessarily.
>> No, no. I'm going to say what I said earlier, which is I want our margins to be as low as possible while our business remains non-volatile. Like I said, the only reason for a high margin is that you want to have the ability to bring in cash when you need it. And all you need is the ability to price higher if you need to, which is what lets you keep your margin low. The demand for compute is so high that if someone came to us and said, "I need this compute," and we have it, they will pay a higher margin, which allows us to charge a lower margin.
>> Can you help me understand what the chip market looks like on a five-year timeline? You said we'll have OpenAI, we'll have Anthropic, we'll have all the providers having their own chip infrastructure. You'll also have Nvidia. What does that look like?
>> My prediction is that in five years, Nvidia will still have over 50% of the revenue. However, they will have a minority of the chips sold. They might have 51% of the revenue and 10% of the chips sold.
>> Can you help me understand that?
>> Yeah, there is huge value in being a brand. You get to charge more. However, it makes you less hungry, and you're going to start charging high margins, and some people are going to pay it, because no one's going to get fired for buying from Nvidia. It's a great place to be in. That business is going to remain incredibly valuable. If you're invested in Nvidia, you're probably going to do okay. However, if you're looking at it from the customer point of view, when you have customer concentration like we're seeing, where 35, 36 customers are 90%, 99% of the total spend in the market, they're going to make decisions less on brand and more on what makes their business successful, because they're going to have more power to make those decisions.
So you're going to see other chips being used, because those companies are going to have enough power to make decisions themselves.
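As a back-of-envelope check on that prediction, here is a minimal sketch of the implied pricing. The 51% and 10% figures are the speaker's own illustrative numbers, not market data:

```python
# Implied price premium if Nvidia keeps ~51% of revenue on ~10% of units.
# Both percentages are the speaker's illustrative numbers, not market data.

nvidia_rev_share = 0.51
nvidia_unit_share = 0.10

# Revenue per unit, relative to the rest of the market
nvidia_asp = nvidia_rev_share / nvidia_unit_share
others_asp = (1 - nvidia_rev_share) / (1 - nvidia_unit_share)

premium = nvidia_asp / others_asp
print(f"implied price premium over other vendors: {premium:.1f}x")
# prints "implied price premium over other vendors: 9.4x"
```

In other words, a majority of revenue on a small minority of units only works if each Nvidia chip sells for roughly an order of magnitude more than the average competitor's.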
>> You said you won't do badly if you're an Nvidia investor. One of my friends says the thing I love about Harry is that he's wonderfully charming, but at the end of the day he goes, "That's great, that's great, but what about me?" Which is very true. Over/under on Nvidia on a five-year timeline: 10 trillion?
>> I personally would be surprised if in five years Nvidia wasn't worth 10 trillion. The question you should ask is, will Groq be worth 10 trillion in five years? Possible.
We don't have the same supply chain constraints. We can build more compute than anyone else in the world. The most finite resource right now, compute, the thing that people are bidding up and paying these high margins for, we can produce in nearly unlimited quantities.
>> What do you think the market does not
understand about Grock that you think
they should understand?
>> Oh, it changes every month. It used to be that we couldn't have multiple users. And then we demoed multiple users to people on the same hardware, right? They used to think that we
>> This is because of the SRAM structure.
>> Because of the SRAM, actually. Here's another one.
>> I'm still impressed with my learning from last time. Thank you so much. Dude, I genuinely learned so much from you. But okay,
>> The question I get asked the most is, isn't SRAM more expensive than DRAM?
>> The answer is yes. A good way to think of it is that SRAM is inherently three to four times as expensive per bit.
>> And just for anyone who doesn't know again: SRAM versus DRAM, super simple.
>> So, SRAM. I'll keep it super simple, but this isn't technically accurate. SRAM is the memory inside of a chip; DRAM is the external memory. It really has more to do with how you design it. But anyway, SRAM has three to four times as many transistors per bit as DRAM, and just transistors for SRAM. A DRAM cell is a capacitor and a transistor; an SRAM cell is six to eight transistors. And so SRAM is inherently larger per bit, which means it uses more silicon, therefore it's more expensive. You're also deploying it on a more expensive chip, like a 3-nanometer chip, so it costs you more per unit of area than DRAM. So there's a multiple. Maybe it's 10 times as expensive per bit.
The thing is, when we're running a model like Kimi on 4,000 of our chips, and you're running that Kimi model on eight GPUs, we're using 500 times as many chips, which means the GPUs have 500 copies of that model, which means they're using 500 times as much memory, which means that their cost is higher: even if the SRAM is 10 times more expensive per bit, they're using 500 times as much memory in the DRAM. So this is one of those classic problems of looking at it from a chip point of view rather than a system point of view.
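The chip-versus-system arithmetic above can be sketched in a few lines. All numbers (the 10x per-bit cost, the 500 model copies) are the illustrative figures from the conversation, not measured hardware data:

```python
# System-level memory cost: one SRAM-resident copy of the model versus
# 500 DRAM-resident copies. Figures are illustrative, from the conversation.

SRAM_COST_PER_BIT = 10.0   # relative to DRAM = 1.0 ("maybe 10 times")
COPIES_ON_GPUS = 500       # one copy per 8-GPU group, 500 groups' worth
COPIES_ON_SRAM = 1         # one copy sharded across ~4,000 chips

sram_system_cost = SRAM_COST_PER_BIT * COPIES_ON_SRAM   # 10.0
dram_system_cost = 1.0 * COPIES_ON_GPUS                 # 500.0

ratio = dram_system_cost / sram_system_cost
print(f"DRAM system pays {ratio:.0f}x more for model memory")  # 50x
```

The per-bit premium is swamped by replication: even at 10x per bit, one shared copy beats 500 replicated copies 50 to 1 on this accounting.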
Everything that we did was actually from a system point of view, and now it's a world point of view. We actually load balance things across our data centers. We're now at 13 data centers. We have data centers in the United States, in Canada, in Europe, and in the Middle East. When you have world-scale distribution, you don't just make decisions at the data center level. We will actually have more instances of some models in some data centers, with different compile optimizations for input or output, based on what's going on in a geography. We may not even have an instance of a model in a particular data center; we may have it elsewhere and we can load balance that. And so we're optimizing at the world level, not at the data center level.
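That world-level routing idea can be sketched minimally. The function, the data-center names, the models, and the load numbers below are all hypothetical, invented purely to illustrate "route to a data center that hosts the model, falling back across geographies":

```python
# Hypothetical sketch of world-level load balancing as described: route a
# request to a data center that actually hosts the model, preferring the
# user's region, otherwise the least-loaded site anywhere. All names and
# numbers below are invented for illustration.

def route(model, user_region, datacenters):
    hosting = [dc for dc in datacenters if model in dc["models"]]
    if not hosting:
        raise LookupError(f"no instance of {model} deployed anywhere")
    local = [dc for dc in hosting if dc["region"] == user_region]
    candidates = local or hosting          # fall back to other geographies
    return min(candidates, key=lambda dc: dc["load"])

dcs = [
    {"name": "us-1", "region": "us", "models": {"kimi"}, "load": 0.9},
    {"name": "eu-1", "region": "eu", "models": {"kimi"}, "load": 0.2},
    {"name": "me-1", "region": "me", "models": set(),    "load": 0.1},
]
print(route("kimi", "me", dcs)["name"])  # no local instance, so: eu-1
```

The point of the sketch is the fallback: the nearest data center is not guaranteed to hold an instance of the model, so the scheduler has to be able to send the request elsewhere.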
>> What would you do if you weren't scared, Jonathan?
>> I'll rephrase that to: where could I increase risk in the business?
>> Yeah, same question.
>> Where we haven't: we could double our orders in our supply chain. Yes, we have a six-month supply chain, so we can respond to the market faster than anyone else. But
>> How much does demand outweigh supply?
>> Like I said, last week someone came to us and asked for five times our total capacity. Here's the only reason we don't just completely double down on
>> If you're not supply-constrained, why can't you just do that?
>> Because there are thresholds. So, for example, if we had double the capacity, we wouldn't have won that customer. They needed 5x. So it's not enough to have twice as much; we have to have enough. And so if we double the capacity, do we have enough for those customers?
>> And so the risk that you could take is to do what? Sorry, just specifically.
>> We could just double the rate at which we're building out supply. I mean, with this fundraise, we ended up raising more than twice what we were expecting to raise, and then we were 4x oversubscribed over what we did raise. And so we could have raised a lot more money. It would have been more dilutive, and I'm trying to be dilution-sensitive for investors and everyone else. But on the other hand, we could have just raised more money and built a ton of compute. The other advantage that we have versus anyone else is that our cost per token, especially at a given speed, is very advantageous. So we know that we can charge less than the rest of the market, which matters when you're trying to build these businesses, not because people are spend-conscious. If we lower what we charge by 50%, people are going to buy twice as much. They're spending as much as they're making, because whatever they spend increases the quality of the output.
>> Do you think about going public at all?
>> Our focus is purely on execution right now. Whether or not you go public, that's a completely different game than the one we're playing right now. Right now, all that matters is: can we satisfy the demand for compute?
>> Why do you think Cerebras decided to go public?
>> Well, they recently decided not to go
public.
>> That answers that question.
>> Dude, I could talk to you all day. I do want to do a quick-fire round. So I say a short statement, you give me your immediate thoughts. Does that sound okay?
>> Yeah.
>> What's the biggest misconception about
Nvidia today?
>> That Nvidia's software
is a moat.
>> CUDA lock-in is
>> Yeah. It's true for training, but it's not true for inference. I mean, we have 2.2 million developers on us now. That's how many have signed up.
>> Wow.
>> Yeah.
>> How many does CUDA have?
>> They claim six million.
>> If you were founding Groq today, with Nvidia at 4 trillion and the AI boom in full swing, what would you do differently?
>> I wouldn't do chips. That ship has already sailed. It takes too long to build a chip. The bet that we
>> Does it? So, for the chip providers coming out today, we are seeing new chip providers come out where they're raising a lot of money from good people.
>> It's too late.
>> Yeah. So, the reason that I decided to go into chips: I did the Google TPU, but also, before I left, I set a record on the best classification model, like ResNet-50, with someone in Google Brain. We did an experiment; we beat everything. And so I could have gone in on the algorithm side. And actually, when we were fundraising, I wasn't even 100% sure that I was going to do chips. I was thinking maybe we do something on the algorithm side, especially in formal reasoning. Which is good that I didn't. But the main motivation to go into chips was the moat, the temporal moat. So, a question we get asked by VCs a lot is, what prevents someone from copying what we're doing? And the answer to that is: if you copy what we do, you're three years behind us, because that's how long it takes to go from the design of a chip to a chip in production, if you execute perfectly.
I've done three chips now that are in production or ramping to production. All three worked on A0 silicon. Only 14% of chips that are taped out work the first time, on A0 silicon. So that means there's an 86% chance each time that you're going to have to respin it. When we built our V2 chip, we had actually already scheduled a respin for it, and we ended up not having to do it, because, to our shock, the first one worked. You shouldn't expect that. So that three years is if everything goes perfectly. Nvidia typically takes three to four years per chip, and they just have multiple being done at a time. Groq is now on a one-year cycle. So a year after our V2 is our V3, and a year after that is our V4.
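The 14% figure quoted above implies long odds on a streak. A quick sketch of the arithmetic, assuming independence between tape-outs (an assumption added here, not a claim from the conversation, and a generous one, since teams learn between chips):

```python
# Odds of three consecutive first-time-right (A0) chips, taking the quoted
# 14% industry success rate at face value and assuming independent tape-outs.

p_a0 = 0.14                   # quoted first-silicon success rate
p_respin = 1 - p_a0           # 86% chance any single chip needs a respin
p_three_straight = p_a0 ** 3  # ~0.27%

print(f"respin risk per chip: {p_respin:.0%}")
print(f"three A0 chips in a row: {p_three_straight:.2%}")
```

Under those assumptions, a three-chip A0 streak happens well under 1% of the time, which is the speaker's point about not budgeting for it.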
>> How do you evaluate the meteoric rise of Larry Ellison and Oracle?
>> Brilliant business decisions and the willingness to move fast. Most people right now keep asking themselves, is AI overheated? Should we double down on this? They just went for it. They're just aggressive, and that's what it takes to win. When everyone else is fearful, you should be greedy, and when everyone else is greedy, you should be fearful. Right now, there's a lot of fear around AI. What you're seeing, though, is there's a couple of greedy, really smart people, and they're making tons of money, and it looks like there's a lot of greed out there. It's just a handful of people that are moving fast.
>> Where should I be greedy and where should I be fearful? I'm an investor today, obviously.
>> Wherever there's a moat. You know, Hamilton Helmer, Seven Powers, right? Wherever you see a moat, you should be greedy.
>> Very few people have a moat.
>> Yeah. And especially at the stage that
you invest in. Yeah.
>> Yeah. So, you have to predict that
there's going to be a moat.
>> And if there is a moat, it's a billion valuation for a pre-seed.
>> I mean, there's a billion valuation for a pre-seed pre-moat. That's what you should call it: pre-moat. That's what the investors should denote it as. Pre-moat.
>> What have you changed your mind on in
the last 12 months?
>> Oh my gosh. I mean, it's not so much that I've changed my mind; it's that I've changed what percentage of our business doubles down where. Every month we become more focused. We say yes to fewer things, and what happens is the business just does better. So I would say I used to think that the most important thing was preserving optionality, and now I think it's focus. However, I think having that optionality early on was crucial, so that we could play where we would be most successful, and now it's about focus.
>> We've spoken a lot about OpenAI and Anthropic. Do you think Elon Musk is able to pull it off with Grok and X?
>> Yes, although it's probably going to be different. Whenever a new area emerges, a bunch of people think that they're competing, and they're not. All of these people creating foundation models think that they're competing for the exact same thing. What did Anthropic do that was brilliant? They decided to stop competing by doing everything, and to focus on coding. And that's worked great for them, right? So, if you look at xAI, they have a social network and they've integrated their chatbot with that. I'm not going to use that chatbot for solving deep analysis or deep research problems. I'm not going to use it for coding. Now, they do have a coding model, but they don't have coding distribution. Can they use that social distribution to get into coding? Maybe, but then they're not going to be as focused. So what are they doing? Eventually the markets will diverge. Mag 7: all of those companies have some overlapping business, but the primary business of each of those Mag 7 companies is different. If you do not differentiate, you die.
>> When you look at Google, Microsoft, and Amazon, you can buy one and you can sell one. Which do you buy? Which do you sell?
>> It depends on the time frame. So, in the short term, I think Microsoft is resetting a little bit because of the OpenAI relationship. Long term, they're probably going to do fine again. I think
>> Do you think that's material damage to them?
>> No, that's why I'm saying in the short term I think it's going to hit them, and then in the long term it's not.
>> Have they not done majestically well from that? They have the financial ownership of OpenAI, and then they have the flexibility to use Anthropic for most of the suite.
>> And they've deployed an enormous amount of compute. So if OpenAI diversifies and gets their compute elsewhere, they have that compute now. Compute is like gold, right? If you have it, you have AI. And then Amazon, I think, doesn't have AI DNA. If you compare them... so you didn't mention Meta, right? But Meta and Google always had the AI DNA, and Microsoft bought it with OpenAI, but that bought them time. Amazon still doesn't have that DNA, but they do have compute.
>> Final one. What are you most excited for when you look forward? I like to end on an element of positivity. What are you most excited for when you look forward over the next five to seven years?
>> I think the things that scare most people are what excite me. And what I mean by that is,
you know, everyone's afraid of what AI
is going to do. And I think there's a
good historical analogy here, which is
Galileo.
So, a couple hundred years ago, Galileo popularized the telescope, right? And he got in a lot of trouble for that. And the reason he got in so much trouble was that the telescope allowed us to see some truths, and allowed us to realize that the universe was larger than we imagined. And it made us feel really, really small.
And over time, we've come to realize
that while we may be small, the universe
is grand
and it's beautiful.
I think over time
we're going to realize that LLMs are the
telescope of the mind. That right now
they're making us feel really, really
small.
But in a hundred years, we're going to realize that intelligence is more vast than we could ever have imagined, and we're going to think that's beautiful.
>> Jonathan, dude, I always end up taking copious notes in our conversations. Thank you so much for doing this with me, man. So lovely to do it in the studio, and you've been fantastic.
>> Thank you.