Agents Over Bubbles | Stratechery by Ben Thompson
By Stratechery
Summary
Topics Covered
- Three Inflection Points Show Why AI Investment Is Justified
- Agents Finally Verify Their Own Work Without Humans
- Enterprise AI Wins Because Companies Pay for Productivity
- Microsoft Abandoning Model-Agnostic Strategy for Agents
- Rise of Agents Justifies Capex and Signals Durability
Full Transcript
Agents Over Bubbles was published on Monday, March 16th, 2026.
There's a weird paradox in terms of AI prognostication. On one hand, you don't want to be the one to completely dismiss the most terrifying doomsday scenarios; who wants to be found out to be foolishly optimistic? At the same time, there's also pressure to give credence to the possibility that we are in a bubble, and that all of this hype and spending is going to go belly up. While I have argued against the former, I have very much been on board with the latter, making the case that bubbles can be good. Sitting here in March 2026, however, on the morning of Nvidia's GTC, I've come to a different conclusion: I don't think we're in a bubble, which, paradoxically, maybe is the truest evidence we are.
LLM Paradigms

Over the last couple of weeks, first in the context of Nvidia's earnings and then last week in the context of Oracle's, I've talked about three LLM inflection points.

ChatGPT

The first LLM inflection point was the November 2022 launch of ChatGPT, which hardly needs an explanation. Yes, transformer-based large language models were introduced in 2017, and the capabilities were both impressive and growing, but underappreciated. I started an interview series with Daniel Gross and Nat Friedman in October 2022 under the premise that there was an incredible new technology that was sorely lacking for product applications and startup energy. Needless to say, that was entirely flipped on its head just weeks later: ChatGPT opened the eyes of the world to what LLMs were capable of.

But the initial versions had two flaws that have stuck in many people's minds, particularly those convinced that we were in a bubble. The first flaw is that LLMs frequently got things wrong and, worse, would hallucinate when they didn't know the answer. This made LLMs feel like something of a parlor trick: amazing when it works, but not something that you can count on. The second was related to the first: even in that flawed state, LLMs were tremendously useful, but you needed to have an idea of what to use them for, and you needed to proactively take care to manage mistakes and verify the output in case it was hallucinated.

o1

The second LLM
inflection point was the release of OpenAI's o1 model in September 2024. By that point, LLMs had improved tremendously, both thanks to new foundation models and also because of continued improvements in post-training. That meant that the stream of tokens that constituted an answer in ChatGPT or Claude was now much more likely to be right, and somewhat less likely to hallucinate. What made o1 different, however, was that it reasoned over its answer before delivering it to you. I explained in an update at the time:

"The big challenge for traditional LLMs is that they are path dependent. While they can consider the puzzle as a whole, as soon as they commit to a particular guess, they are locked in and doomed to failure. This is a fundamental weakness of what are known as autoregressive large language models, which to date is all of them."

Reasoning models self-evaluate: they work through an answer and then consider if the answer is correct, or if they should consider other alternatives. To put it in terms of the weaknesses I identified above, they were internally proactive in terms of managing mistakes, reducing the burden on the user to continually actively guide the LLM. And the results were remarkable: from my perspective, if the brilliance of ChatGPT was in making LLMs much more accessible and useful, the brilliance of o1 was in making LLMs reliable and essential.

Opus 4.5
Anthropic released Opus 4.5 on November 24th, 2025, to relatively little fanfare. Then, at some point in December, Claude Code with Opus 4.5 suddenly seemed to be able to do things that were never possible previously. OpenAI released GPT-5.2 Codex around the same time, on December 18th, and it was similarly capable. People have been talking about agents for a while; suddenly, however, both Claude and Codex were actually accomplishing tasks, some of which took hours, and doing them correctly.

That bit about the Opus 4.5 model's release date is interesting, however: the key thing about agentic workloads is that they are about more than the model, or using the model recursively like o1. Rather, a critical component of making agentic workloads work is the harness, i.e. the software that actually controls the model. To put it another way, Claude Code and OpenAI's Codex actually abstract the user away from the model: you give instructions to an agent, which actually directs the model. Critically, the agent can also use other deterministic tools as well, which means that it can verify its results. To put it in the context of coding: in paradigm one, an LLM would generate code. In paradigm two, an LLM would think about the code it was generating and iterate towards a better answer. In this paradigm, an agent directs a model to generate code, then checks to see if the code actually works, and if it doesn't, tries again, all without the user needing to be involved.

In other words, many of the biggest flaws in the original ChatGPT have been substantially mitigated, at least for verifiable use cases like coding. LLMs are much more likely to be right the first time; they reason over their results to increase their chances; and now agents actively verify the results without humans needing to be in the loop. That leaves one flaw: actually figuring out what to use these for.
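The generate-check-retry loop described above is easy to sketch in code. To be clear, this is a toy illustration, not any real product's harness: `run_agent`, `check_add`, and `stub_model` are all hypothetical names, and the "model" here is a canned stub that returns buggy code on its first try.

```python
def run_agent(model, verify, task, max_attempts=3):
    """Minimal agent-harness sketch: ask the model for a candidate,
    check it with a deterministic tool, and feed any failure back
    until the work verifies or the attempt budget runs out."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        candidate = model(task, feedback)   # generation (paradigms one and two)
        ok, report = verify(candidate)      # deterministic check (paradigm three)
        if ok:
            return candidate
        feedback = f"attempt {attempt} failed: {report}"
    raise RuntimeError("agent exhausted its attempts")

def check_add(candidate):
    """Deterministic verifier: execute the candidate and test its behavior."""
    namespace = {}
    try:
        exec(candidate, namespace)
        assert namespace["add"](2, 3) == 5
        return True, ""
    except Exception as exc:
        return False, repr(exc)

attempts = []
def stub_model(task, feedback):
    """Canned stand-in for an LLM: wrong on the first call, right afterwards."""
    attempts.append(feedback)
    if len(attempts) == 1:
        return "def add(a, b):\n    return a - b"  # buggy first draft
    return "def add(a, b):\n    return a + b"      # corrected retry

code = run_agent(stub_model, check_add, "write add(a, b)")
```

The user only ever talks to `run_agent`; the deterministic verify step is what lets mistakes be caught and retried without a human in the loop.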
The Decreased Need for Agency

The reason I've been writing about these three inflection points for the last couple of weeks has been to explain why it is that the industry is so compute constrained, and why the massive investment in capex by the hyperscalers is justified. The first paradigm required a lot of compute for training, but inference, actually answering a question, was relatively efficient: you simply sent the user whatever the model spit out. The second paradigm dramatically increased the amount of computing needed for inference, for two reasons. First, generating an answer required a lot more tokens, because all of the reasoning required tokens in addition to the answer itself. Second, the fact that reasoning made the models so much more useful meant that they were used more, which drove increased token usage in its own right.

It's the third paradigm, however, that has truly tipped the scales in favor of capex expenditure not being speculative investment, but rather badly needed investment in meeting demand that far exceeds supply. First, generating an answer will often entail multiple calls to a reasoning model. Second, the agent itself needs compute, and that compute, and the tools the agent uses, is better handled by CPUs than GPUs. Third, agents are another step function increase in usefulness, which means they are going to be used even more than reasoning models in a chatbot.

It's how this third point will be manifested that I think is underappreciated. After all, far more people use chatbots than use agents, and I would make the case that most people are not using chatbots as much as they should. It's been a question of agency.
To get the most from AI requires actually taking the initiative to use AI. I wrote in 2024's MKBHDs For Everything:

"Large language models are intelligent, but they do not have goals or values or drive. They are tools to be used by, well, anyone who is willing and able to take the initiative to use them. I don't think either Brownlee or I particularly need AI, or, to put it another way, are overly threatened by it. The connection between us and AI, however, is precisely the fact that we haven't needed it. The nature of media is such that we could already create text and video on our own, and take advantage of the internet to, at least in the case of Brownlee, deliver finishing blows to $230 million startups. How many industries, though, are not media, in that they still need a team to implement the vision of one person? How many apps or services are there that haven't been built, not because one person can't imagine them or create them in their mind, but because they haven't had the resources or team or coordination capabilities to actually ship them? This gets at the vector through which AI impacts the world above and beyond cost savings in customer support or whatever other obvious low-hanging fruit there may be. As the ability of large language models to understand and execute complex commands with deterministic computing as needed increases, so too does the potential power of the sovereign individual telling AI what to do. The internet removed the necessity and inherent defensibility of complex cost structures for media; AI has the potential to do the same for a far greater host of industries."

It's interesting to read that two years on, realize that I was writing about the latest paradigm shift well before it happened, and yet feel completely blown away by that paradigm shift at the same time. That's how big of a deal actually functional agents are: you can see them coming and yet still be amazed when they arrive.
And, as one must say with everything related to AI, in a form that is the worst they will ever be.

It's the implications on agency, however, that are the most profound. Yes, you need agency to use agents, and yes, the number of people who will have that agency is probably far fewer than those who might use a chatbot. Of course, you can make the almost certainly accurate case that chatbots will become agent managers in their own right. But the more critical observation is that, by abstracting humans away from direct model management, any one single human can control multiple agents. What this means in terms of compute, and by extension economic impact, is that it actually won't require that many people with agency to drastically increase the amount of compute that is actively utilized to create products with meaningful economic impact. In other words, the rise of agents doesn't just mean a dramatic increase in compute, but also a narrowing of the need for widescale adoption by humans for that demand to manifest. Yes, AI still needs agency; it just doesn't need agency from that many people for its impact to be profound.

Enterprise Economic Imperatives

Apple-focused media, in the wake of the recent MacBook Neo launch, latched on to comments from Asus CFO Nick Wu on the company's recent earnings call describing the $599 computer as "a shock to the entire market". Equally interesting, however, was how Wu sought to downplay the Neo's potential effects on that market:
"Actually, we heard about the MacBook Neo shipments coming online back in the second half of last year, so we made some internal preparations. But after the product officially released, we found the specs to have some limitations. For example, the memory is not upgradeable, and it only has 8 GB of memory, so this may limit certain applications. So I think when Apple positioned the product, it probably focused more on content consumption. This differs somewhat from mainstream notebook usage scenarios; in that case, the Neo feels more like a tablet, because tablets are mostly for content consumption."

This feels like a bit of a cop-out, given just how capable the Neo's processor is and how well macOS operates on 8 GB of RAM, thanks in part to Apple's deep integration of hardware and software. At the same time, Wu is tapping into something that is true, which is that most consumers mostly do just want to consume content, which I would add means he should be more worried about the Neo, not less. This is why your favorite productivity application always ends up pivoting to the enterprise: it is companies who are willing to pay for productivity, because they are the ones actually paying for the workers who they want to be more productive.

It's reasonable to expect this to apply to AI as well. The most compelling consumer applications of AI, at least in the near term, are Google and Meta's advertising businesses, which sit alongside content. By the same token, it was always unrealistic for OpenAI to think that it could convert more than a small percentage of consumers into subscribers. That's both why an ad model is essential, and also why that won't be enough to pay the bills. It's definitely the case that most people don't want to pay for AI; it remains to be seen if they want to use it enough to make the ad model work.
That is another way of saying that Anthropic got it right by focusing almost entirely on the enterprise market. Companies have a demonstrated willingness to pay for software that makes their employees more productive, and AI certainly fits the bill in that regard. What makes enterprise executives truly salivate, however, is the prospect of AI not simply eliminating jobs, but doing so precisely because that makes the company as a whole more productive. It's always been the case, even in large companies, that a relatively small number of people actually move the needle and drive the company forward in meaningful ways. That drive, however, has been filtered through a huge apparatus filled with humans who accelerate the effort in some vectors and impede it in others. That apparatus makes broad impact possible, but it carries massive coordination costs. Agents, however, will tilt much more heavily towards pure acceleration, making those drivers of value much more impactful.

I'm sympathetic to the argument that the best companies will want to use AI to do more, not simply save money. The reality of large organizations, however, is that the positive impact of AI will not be in eliminating jobs, but rather in replacing hard-to-manage and hard-to-motivate human cogs in the organizational machine with agents that not only do what they are told, but do so tirelessly and continuously until the job is done.

This only makes the argument that we are not in a bubble that much more compelling. First, all the weaknesses of LLMs are being addressed by exponential increases in compute. Second, the number of people who need to wield AI effectively for demand to skyrocket is decreasing. Third, the economic returns from using agents aren't just impactful on the bottom line, but the top line as well. In this context, is it any wonder that every single hyperscaler says that demand for compute exceeds supply, and that every single hyperscaler is, in the face of stock market skepticism, announcing capex plans that blow away expectations?
This is also why the impending wave of layoffs that are going to be credited to AI shouldn't be completely dismissed as a useful cover for correcting overhiring decisions in the COVID era, or right-sizing compensation structures in the wake of multiple contractions. That is all true. At the same time, it's worth considering that companies become bloated because that has long been the only way to scale, and it's hard to know at what point the diminishing returns that come from the drag of coordination costs and a sprawling workforce outweigh the benefits of the marginal employee. You only find that point when you have blown past it, and it's hard to go backwards. AI, however, not only gives the aforementioned excuse to undo that bloat, but also moves the right-size point significantly towards a much smaller workforce. More and more companies are not simply going to wonder if they hired too much for a pre-AI world, but also if they hired too much for a post-AI world. The most forward-looking and future-proof approach will likely be to cut more rather than less, with the hope that those who remain have no choice but to rebuild scale with agents. After all, if they don't, dramatically smaller companies built with AI from the beginning will soon be nipping at their heels, with both smaller cost structures and more capabilities that will structurally increase over time. There's a good chance this is going to get ugly.
I'm not advocating for this outcome.
Rather, I am analyzing what is probably going to happen. The economic imperatives are going to be impossible to resist, and will fuel demand for even more compute over time, further supporting the case that this is no bubble.

Agents in the AI Value Chain

Another important bubble question is about the sky-high valuations of Anthropic and OpenAI. Sure, maybe all this stuff is real, but if models are a commodity, is there any profit to be made? Horace Dediu raises these questions at Asymco, and wonders if Apple is executing the most brilliant move in corporate history:
"Here is where Apple's bet becomes genius. AI models are commoditizing faster than anyone predicted. Software and hardware both have tendencies to commodify. Protections exist, but they have to do with integration and distribution. DeepSeek built a model for $6 million that matches systems costing $100 million. Open source models now power 80% of startups seeking VC funding. The moat these companies are spending hundreds of billions to build is evaporating. Apple understood this before anyone else. It didn't build its own AI model. It licensed Google's Gemini for about $1 billion a year. Why spend $100 billion building a factory when outsourcing costs a billion? And if a better model appears next year, Apple just switches vendors. Apple didn't miss the AI revolution. It just bet that the winners won't be the ones who build the infrastructure.
They'll be the ones who own the customer, and no one else on earth owns the best customers."

I think that nearly all these assertions were defensible during the first LLM paradigm. It didn't take long for multiple base models to be more than good enough for what most people use LLMs for, like, say, cooking or basic medical advice, or as a therapist or companion. Moreover, it was reasonable to expect that models of this quality would soon be able to run locally; I made the case that this was Apple's opportunity myself, back when their own models, which they absolutely did try to build, contra Dediu, failed to ship. The reasoning paradigm, however, blew a significant hole in the local inference case: not only do reasoning models require fast compute, given the number of tokens generated, but they also need exponentially more memory to accommodate much larger context windows, which is the biggest limitation of local models.
Apple makes incredible chips with a compelling unified memory architecture that makes basic inference more plausible for their devices than anyone else's. There's also no scenario where capable reasoning models that are remotely competitive with cloud-based models are running locally in the foreseeable future.

It is agents, however, that may strike the fatal blow to Dediu's argument. Specifically, I noted above that what made Opus 4.5 compelling was not the model release itself, but changes to the Claude Code harness that made it suddenly dramatically more useful. What this means is that model performance isn't the only thing that matters: the integration between model and harness is where true agent differentiation is found. This is a very big deal when it comes to figuring out the future structure of the AI industry and where profits will flow, because profits flow away from modular parts of the value chain, which are commoditized, and flow towards integrated parts of the value chain, which are differentiated. Apple is, of course, the ultimate example of this: its hardware is not commoditized because it is integrated with its software, which is why Apple can charge sustainably higher prices and capture nearly the entirety of the PC and smartphone sector profits. It follows, then, that if agents require integration between model and harness, the companies building that integration, specifically Anthropic and OpenAI (Gemini is a strong model, but Google hasn't yet shipped a compelling harness), are actually poised to be significantly more profitable than it might have seemed as recently as late last year.
And by the same token, companies who were betting on model commoditization may struggle to deliver competitive products.

The canary in the coal mine in this regard is Microsoft. Microsoft once fancied itself an integrated AI provider, bragging on earnings calls about how its deep integration with OpenAI would mean sustainably differentiated infrastructure. A month later, OpenAI nearly imploded, and Microsoft pivoted, talking increasingly about models as commodities, and a core AI strategy that entailed building infrastructure around models that would themselves be interchangeable and abstracted away from Microsoft's customers. Fast forward to last week, however, when Microsoft revealed how it will handle the potential business impact of AI reducing seats, which is a bit of a problem for its seat-based business model. The company is going to bundle AI into a new higher-tier enterprise offering, E7, which is going to cost twice as much, $99 per seat per month, as the formerly top-of-the-line E5. That's a big increase, which Microsoft needs to justify with AI that actually makes those seats more productive. And the product it launched with the new bundle was Copilot Cowork. If Cowork sounds familiar, it's because this is basically the enterprise version of Claude Cowork, a GUI-fied version of Claude Code that Anthropic released earlier this year. There are important differences with the Microsoft version, including the fact that the latter runs in the cloud and is grounded in your organizational data, with all the permission and access policies that go with it. What is crucial, however, is that Copilot Cowork is not model agnostic: Cowork is an agent, which means it needs both a model and a harness, and those are two integrated pieces, not modular components. The implications of this are significant.
Microsoft is admitting, at least for now, that delivering a truly compelling agentic product that enterprises are willing to pay for means abandoning its stated goal of being model agnostic. That, by extension, raises the possibility that models are not and will not be commodities, because agents require more than models. This certainly raises questions about Apple's decision to merely license Gemini and build a harness itself in the form of the new Siri: Microsoft decided that it couldn't deliver a compelling product by going that route, so what has Apple done to inspire faith that it can do a better job? If anything, the company's saving grace is the point that Dediu ended with: consumers may simply not care that much about agents, in which case Apple will be fine with good enough, even as Microsoft, with enterprise customers who do care, realizes it needs to share more margin than it might want to with Anthropic.

What matters in terms of this article, however, is that if agents are making Anthropic and OpenAI the point of integration in the value chain, then the bubble arguments, that these companies are overvalued, or that the massive investments other companies are making on their behalf in data centers are unwarranted, may not be correct.

I must, in the end, address my opening parenthetical. I've long maintained that there is no need to be worried about a bubble as long as everyone is worried about a bubble; it's the moment when caution is flung to the wind and assurances are made that this is definitely not a bubble that we might actually be in one. And, well, I think the rise of agents means we are not in a bubble: the capex is warranted, and Anthropic and OpenAI look more durable than ever. If my declaring there is no bubble means there is one, then so be it.

For more analysis like this, please like and subscribe, and visit stratechery.com and listen to the Sharp Tech podcast. Also, check out the Asianometry channel on YouTube to learn more about the technology changing our world.