
Agents Over Bubbles | Stratechery by Ben Thompson

By Stratechery

Summary

Topics Covered

  • Three Inflection Points Show Why AI Investment Is Justified
  • Agents Finally Verify Their Own Work Without Humans
  • Enterprise AI Wins Because Companies Pay for Productivity
  • Microsoft Abandoning Model-Agnostic Strategy for Agents
  • Rise of Agents Justifies Capex and Signals Durability

Full Transcript

Agents Over Bubbles was published on Monday, March 16th, 2026.

There's a weird paradox in terms of AI prognostication. On one hand, you don't want to be the one to completely dismiss the most terrifying doomsday scenarios. Who wants to be found out to be foolishly optimistic? At the same time, there's also pressure to give credence to the possibility that we are in a bubble and all of this hype and spending is going to go belly up. While I have argued against the former, I have very much been on board with the latter, making the case that bubbles can be good. Sitting here in March 2026, however, on the morning of Nvidia's GTC, I've come to a different conclusion. I don't think we're in a bubble, which paradoxically maybe is the truest evidence we are.

LLM Paradigms

Over the last couple of weeks, first in the context of Nvidia's earnings and then last week in the context of Oracle's, I've talked about three LLM inflection points.

ChatGPT

The first LLM inflection point was the November 2022 launch of ChatGPT, which hardly needs an explanation. Yes, transformer-based large language models were introduced in 2017, and the capabilities were both impressive and growing, but underappreciated. I started an interview series with Daniel Gross and Nat Friedman in October 2022 under the premise that there was an incredible new technology that was sorely lacking for product applications and startup energy. Needless to say, that was entirely flipped on its head just weeks later. ChatGPT opened the eyes of the world to what LLMs were capable of.

But the initial versions had two flaws that have stuck in many people's minds, particularly those convinced that we were in a bubble. The first flaw is that LLMs frequently got things wrong and, worse, would hallucinate when they didn't know the answer. This made LLMs feel like something of a parlor trick: amazing when it works, but not something that you can count on. The second was related to the first. Even in that flawed state, LLMs were tremendously useful, but you needed to have an idea of what to use them for, and you needed to proactively take care to manage mistakes and verify the output in case it was hallucinated.

o1

The second LLM

inflection point was the release of OpenAI's o1 model in September 2024. By that point, LLMs had improved tremendously, both thanks to new foundation models and also because of continued improvements in post-training. That meant that the stream of tokens that constituted an answer in ChatGPT or Claude was now much more likely to be right, and somewhat less likely to hallucinate. What made o1 different, however, was that it reasoned over its answer before delivering it to you. I explained in an update at the time: "The big challenge for traditional LLMs is that they are path dependent. While they can consider the puzzle as a whole, as soon as they commit to a particular guess, they are locked in and doomed to failure. This is a fundamental weakness of what are known as autoregressive large language models, which to date is all of them."

Reasoning models self-evaluate. They work through an answer and then consider if the answer is correct, or if they should consider other alternatives. To put it in terms of the weaknesses I identified above, they were internally proactive in terms of managing mistakes, reducing the burden on the user to continually and actively guide the LLM. And the results were remarkable. From my perspective, if the brilliance of ChatGPT was in making LLMs much more usable and useful, the brilliance of o1 was in making LLMs reliable and essential.

Opus 4.5

Anthropic released Opus 4.5 on November 24th, 2025, to relatively little fanfare. Then, at some point in December, Claude Code with Opus 4.5 suddenly seemed to be able to do things that were never possible previously. OpenAI released GPT-5.2 Codex around the same time, on December 18th, and it was similarly capable. People have been talking about agents for a while. Suddenly, however, both Claude and Codex were actually accomplishing tasks, some of which took hours, and doing them correctly.

That bit about the Opus 4.5 model's release date is interesting, however. The key thing about agentic workloads is that they are about more than the model, or using the model recursively like o1.

Rather, a critical component of making agentic workloads work is the harness, i.e. the software that actually controls the model. To put it another way, Claude Code and OpenAI's Codex actually abstract the user away from the model. You give instructions to an agent, which actually directs the model. Critically, the agent can also use other deterministic tools as well, which means that it can verify its results. To put it in the context of coding: in paradigm 1, an LLM would generate code. In paradigm 2, an LLM would think about the code it was generating and iterate towards a better answer. In this paradigm, an agent directs a model to generate code, then checks to see if the code actually works, and if it doesn't, tries again, all without the user needing to be involved.

In other words, many of the biggest flaws in the original ChatGPT have been substantially mitigated, at least for verifiable use cases like coding. LLMs are much more likely to be right the first time. They reason over their results to increase their chances. And now agents actively verify the results without humans needing to be in the loop. That leaves one flaw: actually figuring out what to use these for.
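The generate-verify-retry cycle described above can be sketched in a few lines. This is a toy illustration, not any real product's implementation: the stub model, the task, and the failing-then-passing behavior are all hypothetical, standing in for a real LLM API call.

```python
import subprocess
import sys
import tempfile

def run_checks(code, test):
    """Deterministic verification: execute the candidate code plus its
    test in a subprocess, report pass/fail along with any error output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n" + test + "\n")
        path = f.name
    proc = subprocess.run([sys.executable, path], capture_output=True, text=True)
    return proc.returncode == 0, proc.stderr

def agent_loop(task, model, test, max_attempts=3):
    """The harness drives the cycle: direct the model, check its work with
    a tool the model doesn't control, retry with the errors as feedback.
    No human in the loop."""
    feedback = ""
    for _ in range(max_attempts):
        code = model(task, feedback)         # paradigm 1: generation
        ok, errors = run_checks(code, test)  # paradigm 3: verification
        if ok:
            return code
        feedback = errors                    # the traceback informs the retry
    return None

# Stub standing in for a real LLM: wrong on the first try, corrected once
# it sees the failing test's output.
def stub_model(task, feedback):
    if not feedback:
        return "def add(a, b):\n    return a - b"  # buggy first draft
    return "def add(a, b):\n    return a + b"      # corrected draft

result = agent_loop("write add(a, b)", stub_model, "assert add(2, 3) == 5")
```

The point of the sketch is where the verification lives: in the harness, as a deterministic subprocess run, not in the model's own judgment of its answer.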

The Decreased Need for Agency

The reason I've been writing about these three inflection points for the last couple of weeks has been to explain why it is that the industry is so compute constrained, and why the massive investment in capex by the hyperscalers is justified. The first paradigm required a lot of compute for training, but inference, actually answering a question, was relatively efficient. You simply sent the user whatever the model spit out. The second paradigm dramatically increased the amount of computing needed for inference, for two reasons. First, generating an answer required a lot more tokens, because all of the reasoning required tokens in addition to the answer itself. Second, the fact that reasoning made the model so much more useful meant that they were used more, which drove increased token usage in its own right.

It's the third paradigm, however, that has truly tipped the scales in favor of capex not being speculative investment, but rather badly needed investment in meeting demand that far exceeds supply. First, generating an answer will often entail multiple calls to a reasoning model. Second, the agent itself needs compute, and that compute, along with the tools the agent uses, is better served by CPUs than GPUs. Third, agents are another step-function increase in usefulness, which means they are going to be used even more than reasoning models in a chatbot.

It's how this third point will be manifested that I think is underappreciated. After all, far more people use chatbots than use agents. And I would make the case that most people are not using chatbots as much as they should. It's been a question of agency.

To get the most from AI requires actually taking the initiative to use AI. I wrote in 2024's MKBHDs for Everything: "Large language models are intelligent, but they do not have goals or values or drive. They are tools to be used by, well, anyone who is willing and able to take the initiative to use them. I don't think either Brownlee or I particularly need AI, or, to put it another way, are overly threatened by it. The connection between us and AI, however, is precisely the fact that we haven't needed it. The nature of media is such that we could already create text and video on our own and take advantage of the internet to, at least in the case of Brownlee, deliver finishing blows to $230 million startups. How many industries, though, are not media, in that they still need a team to implement the vision of one person? How many apps or services are there that haven't been built, not because one person can't imagine them or create them in their mind, but because they haven't had the resources or team or coordination capabilities to actually ship them? This gets at the vector through which AI impacts the world above and beyond cost savings in customer support or whatever other obvious low-hanging fruit there may be. As the ability of large language models to understand and execute complex commands, with deterministic computing as needed, increases, so too does the potential power of the sovereign individual telling AI what to do. The internet removed the necessity and inherent defensibility of complex cost structures for media. AI has the potential to do the same for a far greater host of industries."

It's interesting to read that two years on, realize that I was writing about the latest paradigm shift well before it happened, and yet feel completely blown away by that paradigm shift at the same time. That's how big of a deal actually functional agents are. You can see them coming and yet still be amazed when they arrive.

And, as one must say with everything related to AI, in a form that is the worst they will ever be. It's the implications on agency, however, that are the most profound. Yes, you need agency to use agents, and yes, the number of people who will have that agency is probably far fewer than those who might use a chatbot. Of course, you can make the almost certainly accurate case that chatbots will become agent managers in their own right. But the more critical observation is that by abstracting humans away from direct model management, any one single human can control multiple agents. What this means in terms of compute, and by extension economic impact, is that it actually won't require that many people with agency to drastically increase the amount of compute that is actively utilized to create products with meaningful economic impact. In other words, the rise of agents doesn't just mean a dramatic increase in compute, but also a narrowing of the need for wide-scale adoption by humans for that demand to manifest. Yes, AI still needs agency. It just doesn't need agency from that many people for its impact to be profound.
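One way to picture this narrowing: a single person with agency writes the task list, and a harness fans the tasks out to as many agents as compute allows. A minimal sketch, with a hypothetical placeholder function standing in for a full agent session:

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(task):
    """Placeholder for a complete agent session (model calls plus tool
    use); a real harness would loop until the task's checks pass."""
    return f"done: {task}"

# One human, many agents: the person supplies intent, the pool supplies
# parallelism. Compute demand scales with tasks, not with people.
tasks = [
    "migrate the billing module",
    "add tests for the auth flow",
    "draft the release notes",
]
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    results = list(pool.map(run_agent, tasks))
```

Each entry in `tasks` would be hours of model and tool compute in practice, which is why a small number of people with agency can drive a very large amount of demand.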

Enterprise Economic Imperatives

Apple-focused media, in the wake of the recent MacBook Neo launch, latched on to comments from Asus CFO Nick Wu on the company's recent earnings call, describing the $599 computer as, quote, "a shock to the entire market," end quote. Equally interesting, however, was how Wu sought to downplay the Neo's potential effects on that market.

Actually, we heard about the MacBook Neo shipments coming online back in the second half of last year, so we made some internal preparations. But after the product officially released, we found the specs to have some limitations. For example, the memory is not upgradeable and it only has 8 GB of memory, so this may limit certain applications. So I think when Apple positioned the product, it's probably focused more on content consumption. This differs somewhat from mainstream notebook usage scenarios, because in that case the Neo feels more like a tablet, because tablets are mostly for content consumption.

This feels like a bit of a copout, given just how capable the Neo's processor is and how well macOS operates on 8 GB of RAM, thanks in part to Apple's deep integration of hardware and software. At the same time, Wu is tapping into something that is true, which is that most consumers mostly do just want to consume content, which, I would add, means he should be more worried about the Neo, not less.

This is why your favorite productivity application always ends up pivoting to the enterprise. It is companies who are willing to pay for productivity, because they are the ones actually paying for the workers who they want to be more productive. It's reasonable to expect this to apply to AI as well. The most compelling consumer applications of AI, at least in the near term, are Google and Meta's advertising businesses, which sit alongside content. By the same token, it was always unrealistic for OpenAI to think that it could convert more than a small percentage of consumers into subscribers. That's both why an ad model is essential, and also why that won't be enough to pay the bills. It's definitely the case that most people don't want to pay for AI. It remains to be seen if they want to use it enough to make the ad model work.

That is another way of saying that Anthropic got it right by focusing almost entirely on the enterprise market. Companies have a demonstrated willingness to pay for software that makes their employees more productive. And AI certainly fits the bill in that regard. What makes enterprise executives truly salivate, however, is the prospect of AI not simply eliminating jobs, but doing so precisely because that makes the company as a whole more productive.

It's always been the case, even in large companies, that a relatively small number of people actually move the needle and drive the company forward in meaningful ways. That drive, however, has been filtered through a huge apparatus filled with humans who accelerate the effort in some vectors and impede it in others. That apparatus makes broad impact possible, but it carries massive coordination costs. Agents, however, will tilt much more heavily towards pure acceleration, making those drivers of value much more impactful.

I'm sympathetic to the argument that the best companies will want to use AI to do more, not simply save money. The reality of large organizations, however, is that the positive impact of AI will not just be in eliminating jobs, but rather in replacing hard-to-manage and hard-to-motivate human cogs in the organizational machine with agents that not only do what they are told, but do so tirelessly and continuously until the job is done.

This only makes the argument that we are not in a bubble that much more compelling. First, all the weaknesses of LLMs are being addressed by exponential increases in compute. Second, the number of people who need to wield AI effectively for demand to skyrocket is decreasing. Third, the economic returns from using agents aren't just impactful on the bottom line, but the top line as well. In this context, is it any wonder that every single hyperscaler says that demand for compute exceeds supply, and that every single hyperscaler is, in the face of stock market skepticism, announcing capex plans that blow away expectations?

This is also why the impending wave of layoffs that are going to be credited to AI shouldn't be completely dismissed as a useful cover for correcting over-hiring decisions in the COVID era or right-sizing compensation structures in the wake of multiple contractions. That is all true. At the same time, it's worth considering that companies become bloated because that has long been the only way to scale, and it's hard to know at what point the diminishing returns that come from the drag of coordination costs and a sprawling workforce outweigh the benefits of the marginal employee. You only find that point when you have blown past it, and it's hard to go backwards. AI, however, not only gives the aforementioned excuse to undo that bloat, but also moves the right-size point significantly towards a much smaller workforce. More and more companies are not simply going to wonder if they hired too much for a pre-AI world, but also if they hired too much for a post-AI world. The most forward-looking and future-proof approach will likely be to cut more rather than less, with the hope that those who remain have no choice but to rebuild scale with agents. After all, if they don't, dramatically smaller companies built with AI from the beginning will soon be nipping at their heels, with both smaller cost structures and more capabilities that will structurally increase over time.

There's a good chance this is going to get ugly. I'm not advocating for this outcome; rather, I'm analyzing what is probably going to happen. The economic imperatives are going to be impossible to resist, and will fuel demand for even more compute over time, further supporting the case that this is no bubble.

Agents in the AI Value Chain

Another important bubble question is about the sky-high valuations of Anthropic and OpenAI. Sure, maybe all this stuff is real, but if models are a commodity, is there any profit to be made? Horace Dediu raises these questions at Asymco, and wonders if Apple is executing the most brilliant move in corporate history.

Here is where Apple's bet becomes genius. AI models are commoditizing faster than anyone predicted. Software and hardware both have tendencies to commodify. Protections exist, but they have to do with integration and distribution. DeepSeek built a model for $6 million that matches systems costing $100 million. Open source models now power 80% of startups seeking VC funding. The moat these companies are spending hundreds of billions to build is evaporating. Apple understood this before anyone else. It didn't build its own AI model. It licensed Google's Gemini for about $1 billion a year. Why spend $100 billion building a factory when outsourcing costs a billion? And if a better model appears next year, Apple just switches vendors. Apple didn't miss the AI revolution. It just bet that the winners won't be the ones who build the infrastructure. They'll be the ones who own the customer, and no one else on earth owns the best customers.

I think that nearly all of these assertions were defensible during the first LLM paradigm. It didn't take long for multiple base models to be more than good enough for what most people use LLMs for, like, say, cooking or basic medical advice, or as a therapist or companion. Moreover, it was reasonable to expect that models of this quality would soon be able to run locally. I made the case that this was Apple's opportunity myself, back when their own models, which they absolutely did try to build, contra Dediu, failed to ship. The reasoning paradigm, however, blew a significant hole in the local inference case. Not only do reasoning models require fast compute, given the number of tokens generated, but they also need exponentially more memory to accommodate much larger context windows, which is the biggest limitation of local models.

Apple makes incredible chips with a compelling unified memory architecture that makes basic inference more plausible for their devices than for anyone else's. There's also no scenario where capable reasoning models that are remotely competitive with cloud-based models are running locally in the foreseeable future.

It is agents, however, that may strike the fatal blow to Dediu's argument. Specifically, I noted above that what made Opus 4.5 compelling was not the model release itself, but changes to the Claude Code harness that made it suddenly dramatically more useful. What this means is that model performance isn't the only thing that matters. The integration between model and harness is where true agent differentiation is found. This is a very big deal when it comes to figuring out the future structure of the AI industry and where profits will flow, because profits flow away from modular parts of the value chain, which are commoditized, and flow towards integrated parts of the value chain, which are differentiated. Apple is, of course, the ultimate example of this. Its hardware is not commoditized because it is integrated with its software, which is why Apple can charge sustainably higher prices and capture nearly the entirety of the PC and smartphone sector's profits. It follows, then, that if agents require integration between model and harness, the companies building that integration, specifically Anthropic and OpenAI (Gemini is a strong model, but Google hasn't yet shipped a compelling harness), are actually poised to be significantly more profitable than it might have seemed as recently as late last year.

And, by the same token, companies who were betting on model commoditization may struggle to deliver competitive products. The canary in the coal mine in this regard is Microsoft. Microsoft once fancied itself an integrated AI provider, bragging on earnings calls about how its deep integration with OpenAI would mean sustainably differentiated infrastructure. A month later, OpenAI nearly imploded, and Microsoft pivoted, talking increasingly about models as commodities and a core AI strategy that entailed building infrastructure around models that themselves would be interchangeable and abstracted away from Microsoft's customers.

Fast forward to last week, however, when Microsoft revealed how they will handle the potential business impact of AI reducing seats, which is a bit of a problem for their seat-based business model. The company is going to bundle AI into a new higher-tiered enterprise offering, E7, which is going to cost twice as much, $99 per seat per month, as the formerly top-of-the-line E5. That's a big increase, which Microsoft needs to justify with AI that actually makes those seats more productive. And the product they launched with the new bundle was Copilot Cowork. If Cowork sounds familiar, it's because this is basically the enterprise version of Claude Cowork, a GUI-fied version of Claude Code that the company released earlier this year. There are important differences with the Microsoft version, including the fact that the latter runs in the cloud and is grounded in your organizational data, with all the permission and access policies that go with it.

What is crucial, however, is that Copilot Cowork is not model agnostic. Cowork is an agent, which means it needs both a model and a harness, and those are two integrated pieces, not modular components. The implications of this are significant. Microsoft is admitting, at least for now, that delivering a truly compelling agentic product that enterprises are willing to pay for means abandoning their stated goal of being model agnostic. That, by extension, raises the possibility that models are not and will not be commodities, because agents require more than models.

This certainly raises questions about Apple's decision to merely license Gemini and build a harness themselves in the form of the new Siri. Microsoft decided that they couldn't deliver a compelling product by going that route. What has Apple done to inspire faith that they can do a better job? If anything, the company's saving grace is the point that Dediu ended with. Consumers may simply not care that much about agents, in which case Apple will be fine with good enough, even as Microsoft, with enterprise customers who do care, realizes it needs to share more margin than it might want to with Anthropic.

What matters in terms of this article, however, is that if agents are making Anthropic and OpenAI the point of integration in the value chain, then the bubble arguments, that these companies are overvalued or that the massive investments other companies are making on their behalf in data centers are unwarranted, may not be correct.

I must, in the end, address my opening parenthetical. I've long maintained that there is no need to be worried about a bubble as long as everyone is worried about a bubble. It's the moment when caution is flung to the wind, and assurances are made that this is definitely not a bubble, that we might actually be in one. And, well, I think the rise of agents means we are not in a bubble. The capex is warranted, and Anthropic and OpenAI look more durable than ever. If my declaring there is no bubble means there is one, then so be it.

For more analysis like this, please like and subscribe, visit stratechery.com, and listen to the Sharp Tech podcast. Also, check out the Asianometry channel on YouTube to learn more about the technology changing our world.
