2025 State of Foundation Models
By Innovation Endeavors
Summary
Topics Covered
- Self-Supervised Learning Scales Data Exponentially
- Frontier Models Depreciate in Weeks
- Inference-Time Scaling Unlocks New Frontier
- Build AI Systems Not Single Models
- AI Reshapes Organizations to Generalists
Full Transcript
Hi everyone, my name is Davis Treybig. I work at Innovation Endeavors; I'm an investor on the team here. We're an early-stage venture capital firm that primarily invests in very technical founding teams solving hard problems in data science and engineering. Today we're excited to give you an overview of the state of foundation models in 2025, and we'll aim to give you a holistic view of everything that's happening and how we got here. In terms of what we'll cover today, we'll start with a brief history of what took us here over the last five years or so. We'll then talk a lot about the model layer and what's happening there. From there, we'll move into the application layer and talk about where we're seeing key use cases of foundation models, as well as tips, tricks, and observations on what it looks like to build foundation-model-based products. After that, we'll move into market structure, market dynamics, and some of the economics of the foundation model category right now, and we'll finish with some observations on what we might expect moving forward. So with that said, let's get into it.
So as mentioned, let's start by setting the stage. My goal here is to give you a quick overview of what happened over the last five years to get us to where we are today. There were really two key technical insights that ushered in this technology wave. The first was a data insight: the technique of self-supervised learning, which is a way to scale data in machine learning. The key idea is actually quite simple. You look at a bunch of latent data that already exists, for example on the web; in this case, sentences. And all you do is split that data up in different ways. Here I have an example of a sentence that perhaps you split in half. When you do this, you implicitly create an input and labeled-output pair. If your task for a model is "given the input, predict the output," you've created an implicit piece of labeled data without requiring any human annotation, human labor, or the other things that were traditionally bottlenecks in scaling machine learning data. With this technique, it was suddenly possible to create massive amounts of implicitly labeled data.
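The sentence-splitting trick just described can be sketched in a few lines. This is an illustrative toy (the function name and the half-split heuristic are my own), not any lab's actual data pipeline:

```python
# A minimal sketch of self-supervised pair creation: split each raw
# sentence into an input prefix and a target suffix, yielding implicit
# (input, label) training pairs with no human annotation.

def make_pairs(sentences, split_ratio=0.5):
    """Split each sentence into an input prefix and a target suffix."""
    pairs = []
    for s in sentences:
        words = s.split()
        if len(words) < 2:
            continue  # need at least one word on each side of the split
        cut = max(1, int(len(words) * split_ratio))
        prefix = " ".join(words[:cut])
        suffix = " ".join(words[cut:])
        pairs.append((prefix, suffix))  # an implicit labeled example
    return pairs

corpus = ["the cat sat on the mat", "foundation models scale with data"]
for x, y in make_pairs(corpus):
    print(f"input: {x!r} -> target: {y!r}")
```

At web scale, this same idea, predicting held-out text from its context, yields billions of training pairs essentially for free.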
The second key insight was an architectural one: attention. If you've heard the word transformer, it most often refers to a model that uses this attention architecture. While I won't go into all the detail, the key insight here was about how to scale compute: the attention architecture is highly parallelizable, and this made it much more efficient to scale compute to very large degrees, especially on GPUs, without requiring a huge amount of time or cost.
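As a rough sketch of why attention parallelizes so well: the core operation is a couple of dense matrix multiplies over all token positions at once, exactly what GPUs are built for. Here is a minimal NumPy version of scaled dot-product attention (single head, no masking or learned projections):

```python
# A sketch of the attention operation at the heart of the transformer.
# Every position attends to every other position in one batched matrix
# multiply, which is why the computation parallelizes so well on GPUs.
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                 # pairwise similarities, (n, n)
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                            # weighted mix of value vectors

rng = np.random.default_rng(0)
n, d = 4, 8                                       # 4 tokens, 8-dim embeddings
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Note there is no recurrence here: unlike an RNN, nothing has to wait for the previous token to finish, so all positions are computed in parallel.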
And so as you can see, with these two techniques we had a way to scale data and a way to scale compute, and so we started scaling models. What researchers began to observe is that as you scaled models to larger and larger degrees, you started to see emergent behavior. Here I have a couple of graphs of different behaviors you might ask a model to exhibit. On the left, for example, is how well a model performs certain arithmetic tasks; in the middle is how a model performs certain natural-language-understanding tasks. The x-axis is how much compute was used, which tracks how large the model is, and the y-axis is the accuracy on that task. What you can see is that for a lot of these tasks, performance does not improve at all for a long time; it's basically at zero. Then, at a certain scale, performance jumps up. This was really surprising, right? The model seems not to learn for a long time, and then all of a sudden it can do something it couldn't do before. Researchers were surprised and intrigued by this and started pushing scaling further. And
so from that first transformer paper, there was an insane increase in model scale over the next five or six years. In fact, this graph visualizes that: in three years there was a 15,000x increase in the size of frontier models. If you compare that to Moore's law, the green line here, which doubled only every two years yet propelled the semiconductor industry forward for the last half century, you can get a sense of why this started to be exciting. And
what this led us to, as we continued scaling, was essentially the fastest adoption of a new technology of all time. ChatGPT is now at almost 500 million weekly active users, and almost a billion people use AI on a monthly basis nowadays. On the right here, I have a fun graph showing how long it took different technologies to reach 100 million users. Electricity took 46 years, television took 26, the internet took seven, and ChatGPT took only 60 days. And what's interesting
is that we have now seen not only adoption increase in this space, but revenue as well. I have a couple of illustrative examples here. GitHub Copilot reached $400 million ARR only three years after launch. Now, maybe that's a slightly unfair example, because GitHub obviously had a lot of existing distribution. But then you look at some of the newer startups, like Midjourney and Cursor, which hit between $100 and $200 million ARR in just a year or two, and critically, with somewhere between 20 and 40 employees each. This is
unprecedented in the history of technology, and it's a lot of where the excitement around AI comes from. What we also see is that the technical metrics that matter in this space all follow exponential curves. A couple of examples: the context window of frontier models, which is essentially how much data they can reason about when making an inference, has gone up somewhere between 100x and 500x over the last year and a half, depending on how you measure it.
The cost per token for a GPT-4-level model, that is, holding quality fixed, has dropped by over 1,000x over the last year and a half. And the compute used to train frontier models, which essentially correlates with their size and what it takes to make them what they are, has gone up 1,000x over the same period. These are just a couple of examples, but what's interesting about this space is that basically every metric follows the same kind of super-exponential curve.
This is a fun variation on what that means. There are a bunch of ways you can benchmark foundation models; you can think of these as quizzes or exams for getting a model to try to do something. For example, you might come up with a science reasoning exam, a grade-school math exam, or a general reasoning exam. I've plotted a lot of these in the graph here, and what you can see is that the LLM rate of improvement is so significant that it essentially beats all benchmarks almost as soon as we can come up with them. They all get saturated. So one of the most interesting things about the space is that we can hardly even come up with ways to effectively measure large language models, because of how quickly they're improving. Indeed, they can now score almost perfectly on even professional-level exams in areas like mathematics, science, and philosophy.
Another interesting viewpoint on how dramatic the rate of improvement is comes from looking at how long a task, measured in the time it would take a human, an LLM can reliably complete. This is a graph showing, on the x-axis, the model release date, and on the y-axis, the duration of tasks that a model can reliably do at at least a 50% success rate. What you can see is that in 2019 and 2020, models could often only do tasks that might take a human a few seconds, maybe ten seconds. This is doubling every seven months, and indeed LLMs can now reliably do tasks that might take a human an hour or multiple hours. If you play this out, it's likely that LLMs will be able to reliably automate tasks that take days or even months over the next couple of years, which is really unbelievable, right? And here's a
couple of examples of what that reasoning-capability improvement looks like in practice. On the left, I have a graph from a paper called "Towards Conversational Diagnostic AI," where they compared fine-tuned large language models against doctors on various diagnostic tasks: a patient comes in presenting with certain conditions; what should you do next? What you actually see is that LLMs now outperform doctors on many diagnostic tasks. On the right, I have a math example: LLMs can now solve geometry problems better than almost all humans on earth, including the best mathematicians. This is from a paper called AlphaGeometry. These are just fun examples, but what's very clear is that as we push scaling further, LLMs are effectively becoming the best in the world at almost all mainstream subject areas where we've traditionally considered humans especially strong. And while most of
what I've talked about thus far has been oriented toward language, all of these same laws and scaling curves apply to other modalities. Here's just one example from the image diffusion space. A couple of years ago, on the left, the Imagen model from Google DeepMind was the best in the world; it was the frontier. As you can see, it looks like maybe a high school kid's drawing. On the right, we have a more recent example from a startup called Visual Electric, and what you can see is that it's essentially indistinguishable from high-end photography. So all these same laws and scaling curves are applying to other modalities, which I'll get into more in a second. The main thing I want you to take away from all of this is that we figured out how to scale models. We've pushed that to the max over the last five years, and we're starting to see not only real revenue and real companies being built on top of it, but, more critically, capabilities that far exceed what I think even the smartest researchers thought possible. And so that takes us to where we are today. The rest of the presentation will talk about what's happening right now and what we might expect over the next couple of years. Let's start with the model layer, because that's really the most critical. So
first, let's talk about cost for a second. The training costs for frontier foundation models are unbelievable. A leading model now conservatively costs over $300 million to train, and that's not including any associated labor or data costs. Here you can see a graph showing how these costs have increased over time. GPT-3, the model that set up the ChatGPT moment, was trained in 2020 and cost only about $5 million to train. Since then, the cost of these models has gone up in a very consistent exponential fashion, from $10 million to $100 million to $200 million, and now over $300 million for frontier models. What's interesting,
though, is that in some sense these are the fastest-depreciating assets of all time. Frontier models typically depreciate, or become commodity, on just a 6-to-12-month timescale. Here's a specific example. GPT-4 was released in March 2023; it cost about $100 million to train, and it was of course a closed-source model from OpenAI. DeepSeek-VL, a model of very similar quality, was released exactly a year later, was open source, and cost less than $10 million to train. The graph on the right compares DeepSeek against GPT-4 on a number of mainstream benchmarks, and indeed it performs almost identically. This is really the story of the foundation model space: a leading lab will spend $300, $400, $500 million to train a new model, and within a year there's an open-source version that's just as good. That creates very interesting market dynamics in the model space. Playing off of that, what we see is that open source continues to converge with closed source. So what you see
here is a graph comparing performance on a bunch of benchmarks for closed-source models, the light blue on top, versus open-source models, the dark blue on the bottom. What you can see is that earlier on, in 2023 and thereabouts, there was some degree of divergence. But as time has gone on, the lag between the best closed-source models and the best open-source models has gotten tighter and tighter. In some ways this is a different lens on the point from the last slide, and it gives you a sense of why some of the model providers are trying so hard to figure out how to go beyond being just a model company and become more of an application-layer company.
Another interesting metric: this is a graph of how long top models stay in the top model set. There's a model proxy that a lot of developers use called OpenRouter, which they use to make calls to different model providers, and if you look at the data from OpenRouter, it's really interesting. This is a histogram: for a given new model that enters OpenRouter's top five, how long does it stay in the top five? What you can see on the right is that there are a couple of models that last 20 or 30-plus weeks, but the median time a model stays in the top five is just three weeks. Think about that: you spend $300 or $400 million to develop a new model, and after a couple of weeks there's something better and people have moved on. There's really no precedent for this in the history of technology.
So far, I've only talked about the cost of compute to train models. But what's also important to consider is the data required to train them, and data budgets are insane as well. I'll give you a couple of examples. DeepMind spends over a billion dollars a year on data annotation and labeling. OpenAI spends about three billion in total on training and data jointly, a huge portion of which is data. For Llama 3 in particular, Meta spent over $125 million just on post-training data; again, not compute, just data. And on a more micro level, if you're a professional in an area like law or healthcare, if you're a doctor, OpenAI will actually pay you somewhere between $2,000 and $3,000 for a single reasoning trace, which is really crazy. If you combine all of this, I have on the right a graph that gives you a roughly illustrative spend for a frontier model. You might spend somewhere between $150 and $300 million training the base model, probably somewhere between $50 and $150 million on post-training, and then the data itself is an additional $50, $100, $150 million. All in, you're very quickly hitting $500 million-plus for frontier models.
Through a different lens, something else that's interesting to explore right now is a bit of a shift away from the idea of simply scaling parameter count to the max. This is a graph of the number of parameters in frontier models. What you can see is that, playing off the graph I showed you earlier, the number of parameters went up at a super-exponential rate, but more recently it has come down a little. The reason for this is actually kind of interesting: I think it reflects a move from this being a research-dominated space to an application-dominated space. Large models are more efficient to train: you can use less money to make them more powerful. But they're much harder to serve to users, because they take longer and cost a lot more compute and money per inference. So more recently there's been a lot more effort to over-train models: to use fewer parameters but train for a much longer time. This makes training less efficient, but it makes serving a lot more efficient. It's likely this trend will continue: you want small-to-medium-sized models that have huge amounts of data distilled into them.
This plays into a broader trend that's perhaps the most important in the AI research world right now: pre-training as we know it is coming to an end. That initial curve I showed you, of just scaling the base model as far as you can, is running out of road, and the reason is mostly a data one. We had this technique of scouring all the data on the web, turning it into implicitly labeled data, and training on that. The problem is that there's only one internet, and we're kind of running out of data. Here you have a slide from a presentation by Ilya Sutskever, one of the most famous AI researchers in the world, who is essentially saying exactly this: we keep growing compute, but we're running out of data. So the key question in large language models right now is what comes next: how do we keep scaling beyond just pre-training? There are a couple of ideas. One path, which a lot of labs are taking and which is certainly important, is using more synthetic data rather than real data. This matters, but it's not the only thing. Second, we can build more complex systems: maybe the models don't get better, but we combine different models in different ways to build powerful products. That's definitely an important direction, but it doesn't really touch the research side. And the last, which is what people are most excited about right now, is inference-time scaling, the so-called reasoning models. So let's spend a little time talking about those. A lot of researchers think that inference-time compute, or reasoning models, is really the new frontier. The
idea is actually really simple. Here's a basic visual. Imagine you're asking a model a complicated question: for example, what's the implication of the new Canadian prime minister for foreign exchange rates? What you do is train the model not to answer right away, but instead to think for a long period of time. I've visualized this on the right, where the model develops an internal monologue: it talks to itself, thinks to itself, for a very long time before it answers. In this example, it may think for five minutes, coming up with a plan and identifying the different things it needs to consider, and then, after synthesizing all of that, it gives you the answer. To you as the user, it looks like it wrote a very short answer, but it actually output thousands, if not tens of thousands, of tokens.
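The hidden-monologue loop just described can be sketched in a few lines. Everything here is illustrative: `toy_model` is a stand-in for a real LLM call, and the five "thinking" steps stand in for the thousands of hidden tokens a real reasoning model would emit before its visible answer.

```python
# An illustrative sketch of inference-time "thinking": the model emits a
# long hidden reasoning trace, then a short visible answer. Only the
# answer is shown to the user; the trace stays internal.

def toy_model(prompt, step):
    # Stand-in for an LLM: early steps refine a plan, the last answers.
    if step < 5:
        return f"<think step {step}: consider factor {step}>"
    return "ANSWER: rates likely see modest short-term volatility."

def generate_with_reasoning(prompt, max_think_steps=5):
    trace = []
    for step in range(max_think_steps):
        trace.append(toy_model(prompt, step))       # hidden tokens
    answer = toy_model(prompt, max_think_steps)     # visible tokens
    return answer, trace

answer, trace = generate_with_reasoning("Effect of the new PM on FX rates?")
print(len(trace), "hidden reasoning steps")  # the user never sees these
print(answer)
```

The key cost shift is visible even in the toy: the model does far more generation work per question, trading inference compute for answer quality.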
What's interesting is that this approach of thinking before answering appears to be a new type of scaling law. On the left I have a really famous graph from OpenAI that conveys a lot of what I discussed earlier: as you train models longer with more data (the x-axis), the quality of those models goes up in a reliable fashion. That is the story of the last five or six years. On the right I have a different graph, showing on the x-axis that the longer the model thinks, the better the quality, in a reliably scalable fashion. This excites researchers a lot, because it means there's now a second exponential curve to play off of: think longer, get better answers. And here's an example of this:
small reasoning models can outperform massively larger models if they have enough time to think. Look at the orange line on top: this is a small three-billion-parameter model, but it's trained to be a reasoning model. The dotted line above it is a 70-billion-parameter model that is not trained to be a reasoning model. What you can see is that when the orange model doesn't have enough time to think, it's way worse than the 70-billion-parameter model. But if it thinks for long enough, it actually surpasses the 70-billion-parameter model, in spite of that model being 20 to 30 times larger. And so this gives you maybe a more intuitive sense of why this is so exciting. So how do you develop reasoning models? There are really two
regime models? Right? There's really two key ways. The first is that you just
key ways. The first is that you just generate a lot of this form of reasoning data. You find a lot of examples of
data. You find a lot of examples of people outlining their thought process in complex domains. You might pay for this data. You might synthetically
this data. You might synthetically generate it in areas like mathematics or you can kind of train a special type of verifier or reward model that can guide a complex reasoning trace and you use
that as training data to train the models. The second approach is more of a
models. The second approach is more of a systems approach uh where you use a search technique at inference time. And
so in this case when the models generate an answer it generates part of an output. You have a second model that
output. You have a second model that says go more in this direction or think more in this direction and they kind of go in a loop right um and so the output of the first is you basically have a
model that's thinking to itself for a really long period of time in a large trace of tokens. That's kind of the example that I showed you earlier. And
in the second, you can think about it more as there's a secondary control system at inference time that's mediating the back and forth between a model and a verifier and that lets it think for a long period of time. Both of
these can work and both are interesting though I would say most models probably look like the former right now. So let
me give you an example. Some of you may have used o1 pro. What's actually happening there is that you take a base reasoning model trained with that first method I described, sample four different generations from it, and then have a different model, a verification model, say "I think this is the best of those sampled outputs." To you as a user this is all hidden, but you end up with a really, really good answer that's much better than what you'd get from the base model. As you can see, the last couple of examples have touched on this idea of a verifier model, or reward model. This is a specialized type of large language model that is taught to verify things.
There are actually two types of verifiers. First, procedural verifiers. These aren't large pre-trained models, but domain-specific ways of verifying a problem. For example, in code, you can compile and run the code; in math, you can use theorem provers that show whether a proof is valid or inconsistent. Then, on the right, you have learned verifiers, where you take a base large language model and train it to do this form of verification or reward. Procedural verifiers tend to be more accurate in their domain, but they're definitionally domain-specific. Learned verifiers are definitionally more general, but typically less accurate in a given domain. So a lot of the exploration in the ML world right now is how to combine these two approaches, or whether you can build very high-quality generalist verifiers. Either way, this idea of verification and reward is becoming absolutely essential to developing these reasoning models.
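As a rough sketch of how these pieces fit together, here is a toy best-of-N loop paired with a procedural verifier, in the spirit of the o1-pro-style setup described above. The sampler and the arithmetic domain are invented for illustration; real systems sample from an actual reasoning model and verify with tools like compilers or theorem provers.

```python
# A toy best-of-N sampler plus a procedural verifier. The "model" here
# just draws from a pool of candidate arithmetic answers; the verifier
# is exact in its domain, the way a compiler or theorem prover is.
import random

def sample_candidates(question, n=4):
    """Stand-in for sampling n generations from a base reasoning model."""
    correct = eval(question)  # toy setting only: question is e.g. "17 * 23"
    noise = [correct - 1, correct + 1, correct + 10]
    return [str(random.choice([correct] + noise)) for _ in range(n)]

def procedural_verifier(question, candidate):
    """Domain-specific check: re-derive the answer and compare exactly."""
    return float(candidate) == eval(question)

def best_of_n(question, n=8):
    for cand in sample_candidates(question, n):
        if procedural_verifier(question, cand):
            return cand       # first candidate that passes verification
    return None               # no sampled candidate verified

random.seed(0)
print(best_of_n("17 * 23"))
```

Swapping `procedural_verifier` for a learned reward model gives you the more general (but less exact) variant discussed above.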
So let's talk for a second about context windows. A lot of people are excited by the idea that recent models have, in theory, huge context windows; there are a number of models out there claiming 1 million, even 10 million, token context windows. As a recap, this is how much data the model can consider when it makes an inference. And yes, it's great that context windows are growing dramatically, but there's a bit of a gotcha in this space. Here I have an example from the Llama 4 model, which claims a 10 million token context window. But if you actually look at the Llama 4 paper, what they say is that they mostly trained with context at only 256k tokens, and then they show a very, very simple eval of retrieving just a single piece of data from the context. The problem is that for most real-world questions, you need to reason over many pieces of data in the context in a complicated way. So not only is this eval not very reflective of real-world use cases, but if you never actually had training data that exercises a 10 million token context window, you have no idea whether the model can make use of one. So there's this interesting dichotomy: context windows are growing, and that's great, but good founders building products in the space can't actually use them the way you might think.
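The single-fact retrieval eval criticized here, often called "needle in a haystack," is easy to sketch, which is part of the point: it is a very weak test. Everything below is illustrative, and `ask_model` is a placeholder for a real LLM call.

```python
# A sketch of the single-fact "needle in a haystack" retrieval eval:
# bury one fact in a long filler context and check whether the model
# can repeat it back. Passing this says little about reasoning over
# many pieces of context at once.

def build_context(needle, filler_sentence, n_filler=1000, position=500):
    filler = [filler_sentence] * n_filler
    filler.insert(position, needle)   # bury the fact mid-context
    return " ".join(filler)

def needle_eval(ask_model, needle="The secret code is 7431.",
                question="What is the secret code?"):
    context = build_context(needle, "The sky was a pale shade of grey.")
    answer = ask_model(context + "\n\n" + question)
    return "7431" in answer           # pass/fail on one retrieval

# A trivial "model" that just searches the prompt, to show the harness runs:
print(needle_eval(lambda prompt: "7431" if "7431" in prompt else "unknown"))
```

A more honest long-context eval would scatter many interdependent facts and ask a question that requires combining them, which is exactly what the simple version above never tests.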
Another interesting aspect of the model category right now is tokenization. To give some context: when a model takes the input language you give it, it has to figure out how to turn that language into a set of discrete tokens. On the right here I have some examples. Say you give the word "egg." to a model. You've got to split that up in some way. Some models will split it so that "egg" is one token and the period is another; other models will say "e" is a token, then "gg" is a token, then the period is a token. There are a lot of different ways to do it. The problem is that when you tokenize words in this way and then ask the model things directly related to the structure of the language, models get really confused about how to answer. Many of you have likely seen that, while these models are brilliant, in many cases they get basic arithmetic questions wrong. That's because if you feed a model something like 3.11 or 2.978, it'll split that up in a way that keeps it from seeing a single numeric value. On the left I have a funny blog post from Andrej Karpathy, another famous scientist in the space, which basically says tokenization is at the heart of all weirdness in LLMs. Why can't they spell words? Tokenization. Why can't they do arithmetic? Tokenization. What is the real root of all suffering? Tokenization. So there's a lot of desire to figure out how to fix tokenization in models to help them improve on these language-manipulation tasks, but I think it's still not well understood exactly how you might do that.
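To make the "egg" and decimal examples concrete, here is a toy greedy tokenizer. The vocabulary is invented for illustration and does not match any real model's tokenizer, but the failure mode is the same: the split fragments structure the model then has to reason about.

```python
# A toy greedy longest-match tokenizer (not any real model's vocabulary)
# to illustrate why token splits obscure structure: "9.11" becomes
# pieces the model must stitch back together before it can compare
# magnitudes, which is where arithmetic mistakes creep in.

VOCAB = ["9.1", "2.9", "egg", "e", "gg", "9", "1", ".", "78", "7", "8"]

def tokenize(text):
    tokens, i = [], 0
    while i < len(text):
        # Greedily take the longest vocabulary entry matching at position i.
        match = max((v for v in VOCAB if text.startswith(v, i)),
                    key=len, default=text[i])
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("9.11"))   # ['9.1', '1']  -- not one numeric value
print(tokenize("egg."))   # ['egg', '.']
```

Note how "9.11" splits as "9.1" plus "1": to the model, the number never exists as a single unit, which is Karpathy's point about arithmetic and spelling failures.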
interesting research direction right now is the world of mechanistic interpretability. So a big question in
interpretability. So a big question in foundation models is you have these huge models, you can ask them any question you can think of. Um how are they coming up with the answers that they give you, right? And can we understand the thought
right? And can we understand the thought process maybe to help better understand when the models are hallucinating versus when they're giving you a well thought through answer, right? And so Anthropic and a number of other model labs have
actually started to push this idea of mechanistic interpretability a lot, which is can we start to analyze how a neural network activates to understand what is its thought process when it gives us an answer. And so here's
actually a really cool example that Anthropic recently gave. And what they did is they identified a way to extract the neurons from the uh neural network that's being activated and to identify
which combinations of neurons in which order represent a given concept. And so
here is a visual of one example of this.
They identified a set of neurons that they thought pertained to the Golden Gate Bridge in San Francisco. And what they showed is that any input given to the model that included or referenced the Golden Gate Bridge, or even related concepts like San Francisco, in some way activated this set of neurons. Taking it a step further, they then started to ask questions that had nothing to do with the Golden Gate Bridge while artificially amping up the neurons they thought related to it. And interestingly, when you do that, the model will answer the question, but always in some way related to the Golden Gate Bridge. You give it a math question, and it'll answer with an example involving the Golden Gate Bridge. This is broadly starting to be described as the idea of model steering. And while the Golden Gate Bridge examples may be fun or silly, imagine you have a code-focused model and you understand the set of neurons that relate to logical thinking, arithmetic, things like that. You might want to bump those neurons up when you query the model with coding questions. This is still early research, but I think it's a really exciting area.
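The steering intervention described above can be sketched in a few lines. This is a toy numpy model with random weights and a made-up "concept vector", not Anthropic's actual method: the point is only to show that adding a concept direction to a hidden layer shifts the output toward that concept.

```python
import numpy as np

# Toy sketch of activation steering (hypothetical weights, not a real model).
# A "concept vector" is added to a hidden layer's activations at inference
# time, biasing the model's output toward that concept.

rng = np.random.default_rng(0)

D_HIDDEN, D_VOCAB = 16, 8
W_in = rng.normal(size=(D_HIDDEN, D_HIDDEN))
W_out = rng.normal(size=(D_VOCAB, D_HIDDEN))

# Pretend direction in activation space that represents a concept
concept_vector = rng.normal(size=D_HIDDEN)
concept_vector /= np.linalg.norm(concept_vector)

def forward(x, steering_strength=0.0):
    """One hidden layer -> logits, optionally steered toward the concept."""
    h = np.tanh(W_in @ x)
    h = h + steering_strength * concept_vector  # the steering intervention
    return W_out @ h

x = rng.normal(size=D_HIDDEN)
base = forward(x)
steered = forward(x, steering_strength=5.0)

# The shift in the logits points in the concept's output direction
concept_logits = W_out @ concept_vector
shift = steered - base
cos = shift @ concept_logits / (np.linalg.norm(shift) * np.linalg.norm(concept_logits))
print(round(float(cos), 3))  # 1.0: the output moved toward the concept
```

In a real transformer the same idea applies at a chosen layer of the residual stream, with the concept direction found by interpretability tooling rather than sampled at random.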
So beyond language, the other big direction that matters in the space right now is multimodality. Over the last couple of years, vision language models, or VLMs, have really gained steam. If you use products like Claude or ChatGPT now, you can input an arbitrary combination of text, image, video, and audio, and the model will synthesize it all together and give you an output. That's cool, and the way it works is actually pretty simple: you embed all of these modalities into a single shared latent space, and then you feed that to a fairly standard language model, which gives you language as output. What's earlier is the idea of an omnimodal model. The difference here is to imagine a model that can not only take any combination of these inputs but also produce any combination of those outputs, in an interleaved, structured fashion. On the right here I have an extract from a paper from Meta that shows someone asking a question about some birds, and you can see that the answer interleaves pictures of birds, text about birds, more pictures of birds, and so on. Building models like this is still not well understood and is a bit tricky; it's really hard to get data that reflects this structure well. But this, in a lot of respects, is what you might expect to happen over the next couple of years in omnimodality.
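The "single shared latent space" recipe for VLMs can be sketched concretely. All the shapes and weights below are hypothetical toys; a real system learns the projections jointly with the language model.

```python
import numpy as np

# Minimal sketch of the VLM recipe described above: each modality gets its own
# encoder, everything is projected into one shared latent space, and the
# resulting token sequence is handed to an ordinary language model.
# All weights here are random toys; real systems learn them jointly.

rng = np.random.default_rng(42)
D_MODEL = 32  # shared latent dimension

# Per-modality projections into the shared space (hypothetical feature dims)
W_text  = rng.normal(size=(D_MODEL, 128))   # text embedding dim 128
W_image = rng.normal(size=(D_MODEL, 768))   # image patch feature dim 768
W_audio = rng.normal(size=(D_MODEL, 64))    # audio frame feature dim 64

def embed(features, W):
    """Project one modality's feature vectors into the shared latent space."""
    return features @ W.T  # (n_tokens, d_in) -> (n_tokens, D_MODEL)

text_tokens  = embed(rng.normal(size=(5, 128)), W_text)    # 5 text tokens
image_tokens = embed(rng.normal(size=(9, 768)), W_image)   # 9 image patches
audio_tokens = embed(rng.normal(size=(4, 64)),  W_audio)   # 4 audio frames

# One interleaved sequence: the language model downstream doesn't care which
# modality a token came from, since they all live in the same space.
sequence = np.concatenate([image_tokens, audio_tokens, text_tokens], axis=0)
print(sequence.shape)  # (18, 32): ready for a standard transformer decoder
```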
There's also a lot of work right now on alternative architectures. I talked earlier about the transformer architecture, and that's really still the basis of most of these models, but there are some really cool explorations in the field of other directions you might want to go. Really briefly, some examples. There's a concept known as state space models, where you take attention but relax it a little so that you can better handle very large context windows. This matters a lot in some domains, like audio, and there's a pretty cool company called Cartesia pursuing that direction.
Another architectural trend is what's known as flow matching models. This is a generalization of the diffusion model that allows for more efficient learning than diffusion has traditionally allowed; Stability AI is pushing this direction pretty hard. Inductive moment matching is another interesting idea: a diffusion variant that makes better use of predicting where you need to diffuse to, which lets you generate results much faster than you might otherwise expect. Luma, which is one of the leading image and video companies, is exploring that.
And then finally, there's some really cool work on discrete diffusion models, where you take diffusion, which has traditionally been reserved for images and video, and apply it to language. That's a bit non-intuitive, because language is a very discrete space and diffusion has traditionally been considered a continuous process. But it turns out it works, and Inception is a cool company working in that direction.
Touching on image models a little bit more: I gave you that example earlier of how quality has increased a lot, but what I want to emphasize is that it's not just that quality has gotten better. Just as we've seen a lot more sophistication in what language models can do, there's also a lot more precision and control in image models today. Here are a couple of examples. On the left, many of you probably saw this Ghiblify trend with OpenAI's image generation, where you can upload an image, ask for it to be turned into a Studio Ghibli style image, and the model will do this Ghibli-fication of the image. What's interesting is that this is actually a really complicated use case. If you think about it, you're giving the model an image and implicitly saying: maintain all the structure, all the components, all the characters, and just change the style. A couple of years ago, an image model couldn't have gotten even close to this.
You could have achieved it, but you would have had to train a bunch of specialized models and combine them in a really complicated way. Now, it just works. On the right, I have another example of this. A couple of years ago, if you asked for any kind of image that had text inside of it, the result would be a disaster. Now, you can not only get text in images, but the text can fit into the style of the image, which is really remarkable. And so, just as we've seen this improvement in the reasoning capabilities of language models, we've seen similar quality improvements in areas like images and, of course, video.
On the topic of video, I really think that we're about to hit the ChatGPT moment for video. This is an example from Google's Veo models, which are their newer video models. What's remarkable is that the quality is starting to become indistinguishable from real human-shot video. For example, in that dog video in the top right, you can see the water reflections and the light refracting in the water, things that are crazy for a video model to get right. And so we're starting to see a ton of startups being built around this now, when a year or two ago it was still in that uncanny valley moment.
Robotics is another area where there's a lot of exciting work happening. This is a video from Physical Intelligence, which is one of the robotic foundation model companies. Basically, they asked the robot: can you make a certain type of roast beef and cheese sandwich? And what you can see is that the robot correctly assembles the sandwich, even though it has never been trained specifically on this task and has never seen this precise environment before. The reason it can do this is that it's making use of vision language models that have such a latent understanding of the world, and of how the world operates, that the robot can come up with a plan even in an environment it has never seen. If you have any background in robotics, this is crazy, because even just a couple of years ago you had to do so much task-specific and environment-specific training for any type of application or use case. And this is why people are really excited about robotics right now.
There's also some cool work in an area known as world models. So I'll play this video, and what I want you to pay attention to is that it's not only a video being generated: you can see in the bottom right of the video that there are keys getting pressed on a keyboard, kind of like a video game. What's actually happening here is that someone is controlling the character with a keyboard and mouse, and the character is moving through the world just like in a video game. But it's not a video game. There's no 3D model.
There's no Unity or Unreal Engine. It's just doing frame-by-frame video prediction, but each next frame is conditioned on the current controls. And so what's cool is that you can actually create dynamic video games from video prediction models in a way where the video game is consistent and has physics concepts. It's really crazy. Someone has actually released a version of Minecraft that works this way, and it doesn't work perfectly, but it plays far better than you might expect. For now, these techniques are mostly being used to generate data for areas like robotics, but it's very likely that a lot of entertainment and media in the future will have this kind of dynamic generation rather than being all static, pre-rendered, pre-compiled, things like that.
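The control-conditioned prediction loop described above can be sketched abstractly. The "model" here is a random linear map standing in for a learned video network; the point is the loop structure, where each frame is predicted from the previous frame plus the player's current input.

```python
import numpy as np

# Toy sketch of the world-model loop described above: each next frame is
# predicted from the current frame plus the player's current control input.
# The "model" here is random weights standing in for a learned network.

rng = np.random.default_rng(7)

FRAME_DIM, CONTROL_DIM = 64, 4  # flattened pixels; [up, down, left, right]
W_frame   = rng.normal(scale=0.1, size=(FRAME_DIM, FRAME_DIM))
W_control = rng.normal(scale=0.1, size=(FRAME_DIM, CONTROL_DIM))

def predict_next_frame(frame, controls):
    """Next-frame prediction conditioned on the current controls."""
    return np.tanh(W_frame @ frame + W_control @ controls)

# Roll the world forward: no game engine, just repeated conditional prediction.
frame = rng.normal(size=FRAME_DIM)
keys_pressed = [np.array([1, 0, 0, 0]),   # "up"
                np.array([1, 0, 0, 0]),   # "up"
                np.array([0, 0, 1, 0])]   # "left"
for controls in keys_pressed:
    frame = predict_next_frame(frame, controls)

print(frame.shape)  # (64,): the latest generated frame
```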
There's also a lot of progress right now on audio, voice, and speech. Some of you may have tried the recent music generation products; they are extremely good. In fact, if I heard some of those songs on Spotify, I would think they were real songs made by humans. Audio and voice cloning is also really good. If you use a product like ElevenLabs, you can upload maybe 30 seconds of your voice and get a basically perfect speech clone of yourself. Along that dimension, it's likely that in the future voice actors will just license their voice as an API rather than doing voice acting directly. What's a little bit newer are voice-to-voice models. You can think of these as language models that operate in the voice or sonic space rather than the language space. This is still very early. Most voice agent startups will still take audio, transcribe it to text, reason on the text with a large language model, and then synthesize the result back to voice. But if we can get voice-to-voice working, it'll unlock a lot more use cases, because it will be much lower latency. Phonic is one cool company in that space.
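The cascaded pipeline that most voice agents use today can be sketched as three stages. All three stage functions below are hypothetical stand-ins: in a real system each would call a speech-recognition model, a language model, and a speech-synthesis model respectively.

```python
# Sketch of the cascaded voice-agent pipeline described above
# (ASR -> LLM -> TTS). The three stage functions are hypothetical stand-ins,
# not real API calls.

def transcribe(audio: bytes) -> str:
    """Stand-in ASR: pretend the audio decodes to a fixed utterance."""
    return "what are your opening hours"

def reason(text: str) -> str:
    """Stand-in LLM: pretend the model answers the transcribed question."""
    return f"You asked: '{text}'. We are open 9am to 5pm."

def synthesize(text: str) -> bytes:
    """Stand-in TTS: pretend the reply is rendered back to audio."""
    return text.encode("utf-8")

def voice_agent_turn(audio_in: bytes) -> bytes:
    # Each hop adds latency; a true voice-to-voice model would collapse
    # all three stages into one, which is why people are excited about it.
    text = transcribe(audio_in)
    reply = reason(text)
    return synthesize(reply)

audio_out = voice_agent_turn(b"\x00\x01")  # fake audio bytes
print(audio_out.decode("utf-8"))
```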
The last model subject area that I wanted to talk about is the life sciences. I'll start with an example. There's a pretty cool model called Evo 2 from the Arc Institute. They describe it as a DNA foundation model, and the actual idea here is pretty simple. As a recap, genomic sequences are sequences of nucleotides, A, G, T, and C, in an order that varies with the organism. And just like language, you can take a given genomic sequence and split it up: get an input and get an output, just like we did with language earlier. Then, very similar to before, you take a language model base and you just say, hey, predict the output given the input, but apply it only to genomic sequences. And you end up with a model that's very similar to a language model but trained only on genomes.
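That "split it up" step can be sketched directly: turning a genomic sequence into (context, next-token) training pairs, exactly as you would for text, except the vocabulary is just the four nucleotides. The vocabulary mapping below is an arbitrary choice for illustration.

```python
# Minimal sketch of the "split it up" step described above: turning a genomic
# sequence into (context, next-nucleotide) pairs for next-token prediction.
# The integer IDs are an arbitrary illustrative vocabulary.

VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3}

def next_token_pairs(sequence: str):
    """Yield (context, next_nucleotide) pairs for next-token prediction."""
    tokens = [VOCAB[ch] for ch in sequence]
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

pairs = next_token_pairs("GATTACA")
# First pair: context [G] -> predict A; last: context G,A,T,T,A,C -> predict A
print(pairs[0])
print(pairs[-1])
```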
What's interesting is that if you do this, you get some really useful behavior. So how might you use something like this? Well, one use case is mutation effect prediction. The idea here is that because these models predict probabilities of the next token, if you change a genomic sequence, you can look at how that changes those probabilities. And if the probabilities all go way down, it's an indication that this is a really weird sequence that might not exist in nature and might not be biologically viable. An analogy: if I gave a language model the sentence "I went to the store and bought an elephant," it would be really confused and not know what to say next. And so you can use this to understand biologically viable sequences. Another use case is
biological feature discovery. You can take all those same ideas that I showed you in the Golden Gate Bridge example and apply them to these genomic models, to identify biologically relevant features or concepts that are influencing how the model makes a prediction. And last, you can do something more like guided genome design, where the genomic foundation model proposes possible sequences and you combine it with a secondary model that, for example, does biological function prediction, and you run them in a loop: generate a bunch of candidate genomic sequences, predict the likely biological function of each, pick the best ones, and repeat, back and forth. All of this is still pretty early and researchy, but it's cool, and it's likely to progress very quickly, just like we've seen with language models. And this is just one example; there are actually a ton of use cases of foundation models in the sciences right now. For example: given a function that you want, how do you predict a protein that might produce that function? Given a protein, how might you predict the geometry, or the way that protein folds, in a biological sample, which is basically how proteins affect the biological world? Given a perturbation of a cell, how might that affect the cell's expression? And then all the way into other science domains: given past weather, predict the future weather; or given a set of atoms and their coordinates in space, how might you predict the properties of the materials they create? I would say the market maturity in a lot of these science categories is still very early. One particular problem in the sciences broadly is that the data is much noisier and much sparser, and we have a lot less of it. But there's a lot of progress in these domains, and I would expect that over the next couple of years many of them become much more mainstream and really go from research to real-world applications.
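The mutation effect prediction idea from earlier can be sketched with a toy likelihood model. The bigram table below is made-up and stands in for a real DNA foundation model like Evo 2; the point is the scoring logic, where a large likelihood drop flags a mutated sequence as "weird".

```python
import math

# Toy sketch of mutation effect prediction: score a sequence by the
# probability a next-token model assigns to it, then compare wild type
# against a mutant. The bigram table is made-up, standing in for a real
# DNA foundation model.

# P(next nucleotide | current nucleotide), illustrative numbers only
BIGRAM = {
    "A": {"A": 0.1, "C": 0.2, "G": 0.6, "T": 0.1},
    "C": {"A": 0.5, "C": 0.1, "G": 0.1, "T": 0.3},
    "G": {"A": 0.6, "C": 0.1, "G": 0.1, "T": 0.2},
    "T": {"A": 0.2, "C": 0.1, "G": 0.1, "T": 0.6},
}

def log_likelihood(seq: str) -> float:
    """Sum of log P(next | previous) over the sequence."""
    return sum(math.log(BIGRAM[a][b]) for a, b in zip(seq, seq[1:]))

wild_type = "GAGAGA"   # high-probability transitions under this toy model
mutant    = "GGGGGG"   # G->G transitions are rare here

ll_wt = log_likelihood(wild_type)
ll_mut = log_likelihood(mutant)
# A big likelihood drop flags the mutant as "weird", the analogue of the
# model being confused by "bought an elephant" in the language example.
print(ll_wt > ll_mut)  # True
```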
And so that kind of gives you an overview of everything that's going on in the model space right now. From there, I want to go into the application layer of AI, and I'll start by talking about use cases. There are obviously a lot of use cases for foundation model based companies, so I won't be able to cover them all comprehensively, but my goal here is to give you a sense of the broad categories where we see the most work happening. To start, I would say that the marquee use case for large language models is search and information synthesis. If you've used a product like Perplexity, or even ChatGPT or Glean, I would put them in this category: you're using large language models to take a lot of information in the world and synthesize it. They're really good at that. This is kind of the next-generation Google use case. But what you may be less familiar with is that there are literally thousands of domain-specific versions of this. You see a lot of versions of it in investing, where a certain type of investor or analyst might want to analyze a lot of information in a domain to help come up with some financial analysis, prediction, or investment recommendation. You see it in the legal domain, where basically the whole job of lawyers is to analyze unstructured information like case law and precedent. You see it in construction, where construction workers want to be able to understand the blueprints, and instead of calling the project manager, an LLM can just tell them what's going on and what they need to do today. You also see it in areas like healthcare: there's a really cool company called OpenEvidence that helps doctors digest all of the clinical research in a domain as part of how they're treating or diagnosing a patient. These are just a couple of examples, but I think this is likely the strongest use case for large language models, and the one with the most companies showing very strong product market fit.
The second key use case that you might have heard about is software engineering. You've probably heard of products like Cursor, Windsurf, Augment, or GitHub Copilot. The degree to which AI is disrupting software engineering is really hard to convey. First of all, this is the fastest-growing market of all time. Cursor is the fastest-growing SaaS company ever; they hit $100 million in annual revenue in less than two years.
And if you look at the market in aggregate, it's over a billion dollars a year in just a span of two to three years, which is unprecedented. But more importantly, what's crazy here is that when you speak to software engineers who've used these tools, they really feel like this is the biggest change to software engineering since maybe the invention of the compiler. And so we're really excited about this space. What's interesting is that we're starting to see a lot of startups go beyond just the copilot. More specifically, we're seeing LLMs start to touch the entire software development life cycle. We see LLM-enabled companies in areas like code review, documentation, code migration, prototyping, testing, and QA. Basically, pick any subcategory of software engineering or the software development life cycle, and you're starting to see profoundly innovative companies using LLMs to rethink that space. We generally suspect that the entire developer tools ecosystem is going to be fundamentally rethought in a world of large language models.
Building on the idea of software engineering copilots, this idea of copilots and agents also applies to basically all other forms of specialized, and particularly high-skilled, knowledge work. So we see Cursor-style products in all of these other domains, including areas like PCB engineering, game development, electrical engineering, accounting, 3D design, and mechanical engineering. In all these spaces, you have someone who is a really high-skilled professional, typically designing, building, and testing really complicated systems, whether they're writing code, working in a CAD tool, or designing a chip. And in all these cases, you can build a bunch of copilot-style workflows that dramatically leverage and accelerate that high-skilled knowledge worker. And so we think all of what's happening in software engineering is going to apply to these other domains as well.
Creative expression is another area where there's obviously huge impact. We've touched a little bit on the image models and the video models; just a couple of examples here. We see a lot of excitement in areas like video and animation from companies like Runway. I now know of multiple examples in Hollywood of feature films being developed and fully produced via generative models, which is crazy. You see it in many forms of vertical design. For example, Visual Electric is a cool company in brand design: you can upload, say, a picture of an object you want to show, and instead of doing high-end photography on that product, it just creates a perfect render of it for you. You also see it in areas like 3D design. These are just a couple of examples, but I would say that basically every area of design work and creative work is being rethought in some way by generative AI.
And then there's a lot of other cool stuff that I can't go into in depth, but I'll give you a taste of it. There's a lot in verticalized writing. Think of when you need an immigration document or want to submit a defense bid: there are a lot of these markets where you need to write a really complicated, specific piece of content, and LLMs are really good at that. Second, education, coaching, and companionship. Speak, for example, is a cool language learning product, and there are so many areas where LLMs are probably better than even the best teachers in the world at teaching you something, guiding you through something, or maybe even giving you therapy, and there's a lot happening in that space. Voice agents, which I touched on a little earlier, are really exciting. There are so many domains in the world where there's no digitization and there are no APIs, but if an AI can call people, you can suddenly automate things you could never automate before. So Fair Health is a cool company that's using voice agents to help patients navigate care post-hospital. If they need to schedule certain follow-up appointments, they need to understand which doctors in their area take their insurance and have availability at the right time.
Traditionally, the only way to do that was to call them all one by one; now AI can automate it for you. There's also a lot of work in what I would describe as tier-one labor automation. There are many jobs, for example customer success or certain top-of-funnel sales roles, where essentially there's someone a little bit lower-skilled who needs to analyze a ton of information, and all they're doing is deciding which pieces of information to escalate to someone more senior. In all these spaces, what we're seeing is that the first line of defense is typically getting automated by AI, because the job is more about analyzing a lot of information and getting coverage than it is about avoiding mistakes. Drop Zone is one example of a company in security automation that's partially automating that first-line security analyst in this sort of way. LLMs are also very good at translation; in fact, the transformer was originally invented to improve machine translation. And there are a lot of use cases where what you're doing is basically translation, but in a more nuanced way. Light Table, as an example, is a company in the construction space exploring this: say you're designing blueprints or architectural prints for a new thing you want to construct. Eventually you're going to have to go through review: does it adhere to all the state codes, the regional codes, things like that? Traditionally, that didn't happen until five or six months later, when you'd realize there was a mistake and have to go back to the drawing board for another six months.
What if instead an LLM or AI could understand all those rules you need to follow and give you a first pass right when you're creating the thing? This idea of shifting compliance left, or instant compliance, is a really good use case, and there are a lot of AI products in that category. LLMs, as I've touched on, are also good at rethinking semi-structured systems of record: think, for example, CRM. We think that in all of these categories there are going to be AI-native products that replace the incumbents. There are also some cool second-order effects of AI. Profound is one interesting company in this space: traditionally, a huge industry was spun up around search engine optimization, that is, how do I make sure I show up correctly on Google? Well, a lot of people aren't googling things anymore; they're going to ChatGPT. And so the new question a given brand or marketer needs to answer is: do I show up on ChatGPT, and how do I show up on ChatGPT? Profound is a company that's basically search engine optimization for ChatGPT-style products. And this is one example of a broader trend: as AI so substantially changes the way we work, there are going to be a lot of new needs in that world. Finally, one
last thing I'll touch on: there's some really cool work in synthetic data. LLMs are very good at impersonating people, and there are a lot of cases where you might traditionally have done a lot of user interviews or surveys, where instead you can now just ask an LLM: hey, pretend you're this type of user, what would you say? What's interesting is that I've now seen a number of cases, even with very large brands, where these forms of synthetic surveys exactly mirror a real survey. And so there's going to be a lot more work in these kinds of market-research categories. So that's a quick overview of the big use cases where we see a lot of activity in the foundation model space. From here, I want to talk a little more about building foundation model based products and some of the trends we see there. To start, if I were to describe the overarching arc of products in the space over the last couple of years, it really went from model, to retrieval augmented generation, to agents. So let's go through that in a little more depth. In the early days, you had very simple products that basically just made a single large language model call to do something like generate a little text or summarize something. Notion AI is a good example of that: you could query the AI in the Notion product and ask it to summarize a paragraph.
Useful, but very simple. What you then saw was a bunch of products that started to combine a model with a lot of data. This is often described as retrieval augmented generation: when the user enters a query, you first search for and find a lot of pertinent data, you give that data to the LLM along with the user's query, and then you output an answer. Most of the early copilot or software engineering style products did this. For example, in the code space, you might find the relevant pieces of the codebase and give those to the model alongside the user query. What we've seen more recently is the idea of combining not only model plus data, but adding tools. And this is really how I'd describe an agent: a model determining, given a user's query, what do I need to do? What tools should I use? How should I use them? And what data should I retrieve? I would argue that a lot of the newer deep research style products fit this description. And indeed, if you look at the new startups being founded, I would say almost all the interesting ones today fit more into this agent category.
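The retrieval augmented generation pattern described above can be sketched end to end. The retriever here is a toy keyword-overlap scorer, and `call_llm` is a hypothetical stand-in for a real language model API.

```python
# Minimal sketch of retrieval augmented generation: search for pertinent
# data, then hand it to the model alongside the user's query.
# `call_llm` is a hypothetical stand-in, not a real API.

DOCUMENTS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Shipping is free for orders over $50.",
    "Support is available by email 24/7.",
]

def retrieve(query: str, docs: list, k: int = 1) -> list:
    """Rank documents by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; just echoes the grounded context."""
    return "Answer based on: " + prompt.split("Context: ")[1]

def rag_answer(query: str) -> str:
    # 1) search for pertinent data, 2) give it to the model with the query
    context = " ".join(retrieve(query, DOCUMENTS))
    prompt = f"Question: {query}\nContext: {context}"
    return call_llm(prompt)

print(rag_answer("what is the refund policy"))
```

A production system would swap the keyword scorer for embedding similarity over a vector index, but the model-plus-data shape is the same.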
So let's talk about agents a little bit more. I think agents are a little ill-defined, but if I were to give one succinct definition, I would say that agents are models using tools in a loop. The idea is actually pretty simple: you give a query to a large language model. That large language model has access to some environment, and it can use a lot of tools. Given the query, it comes up with a plan and a step: it might call an API, search the web, or read some files. It then executes that step and analyzes what happened. Did things change? Did I succeed? Did I fail? And it keeps looping over and over until it thinks it has completed the task the user asked it to do. What's really interesting is that even just a year ago, this kind of architectural pattern honestly didn't work at all, but now it works really, really well. And so again, this gives you a sense of how fast the space is moving. Some of the most common tools that you see agents use include searching files, writing code, calling APIs, searching the web, or using a browser. But the
number of tools you might use can go far beyond this, right? And so let me give you an example um of what this looks like in reality, right? A lot of leading agent startups will actually recurse
somewhere between 50 to 100 times for a single user query. So there's a cool company called Basis. It's an AI accounting startup. You can think of the
accounting startup. You can think of the form factor of the product as pretty simply. Imagine any question you might
simply. Imagine any question you might ask an accountant to do with a spreadsheet. You can instead ask basis
spreadsheet. You can instead ask basis to do. And so maybe I'm asking it
to do. And so maybe I'm asking it something simple. I want to help
something simple. I want to help reconcile this month's collections with last month's revenue. Right? And to
answer this, what it's actually doing is chaining 30 to 60 large language model calls, including planning, retrieving data, writing and running code, browsing the internet, manipulating the spreadsheet, and also accessing all the
other accounting tools that you might access. And what's crazy is that this
access. And what's crazy is that this works really, really well. And so this gives you a sense of kind of the sophistication that now exists in AI products where we're now having very
complicated systems, not just calling a model once.
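The plan, act, observe loop described above can be sketched in a few lines. Everything here is an illustrative assumption: `call_llm` is a stub standing in for a real model API, and the tool names are made up, not any specific vendor's.

```python
# A minimal sketch of the plan -> act -> observe agent loop.
# call_llm and the TOOLS registry are hypothetical stand-ins.

def call_llm(prompt: str) -> dict:
    """Stand-in for a real model call; returns a parsed action."""
    # A real agent would hit a model API here and parse its response.
    return {"action": "done", "result": "task complete"}

TOOLS = {
    "search_web": lambda q: f"results for {q!r}",
    "read_file": lambda path: f"contents of {path}",
    "run_code": lambda src: f"output of {src!r}",
}

def run_agent(task: str, max_steps: int = 100) -> str:
    history = [f"Task: {task}"]
    for _ in range(max_steps):                    # leading agents loop 50-100 times
        step = call_llm("\n".join(history))       # plan the next step
        if step["action"] == "done":              # model thinks the task is finished
            return step["result"]
        tool = TOOLS[step["action"]]              # pick a tool...
        observation = tool(step.get("input", ""))  # ...execute it...
        history.append(f"Observed: {observation}")  # ...feed the result back in
    return "gave up after max_steps"

print(run_agent("reconcile collections with revenue"))
```

The key design point is the feedback edge: every tool result is appended to the history the model sees on the next iteration, which is what lets it recover from failed steps.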
And so, as I touched on, generalist agents are still probably not here. There are a couple of startups that have tried to be the everything assistant, the everything agent that can do anything for you. That's a really hard problem. But there are now a number of startups that are vertical-specific, constrained agents that work really, really well. For example, I would put Lovable and a lot of the coding products in this category. Agent products in areas like customer support, like Sierra, also work really well, and this is going to continue over the next couple of years.

What's interesting about agents in particular is that if you query users of a lot of the leading agent products, you get very polarized reactions. Some of you may have heard of Devin, which is trying to be an autonomous software engineering agent. If you look online, you find a lot of people who say Devin is horrible: I tried it, it didn't work at all, I always had to correct it, I would never use it again. But you also find a lot of people who say Devin is actually the most productive software engineer at my company and produces more PRs than anyone else. So how do you reconcile this? What I'm starting to observe is that in most agent categories, there's a bit of a learning curve. It's actually difficult for a human to understand: when do I use an agent, how do I use an agent, what are my expectations around how to review the agent? Not dissimilar from a first-time manager who's never managed people before. So it's interesting to see how agent products try to educate users and better define that human-agent relationship.
So, as you saw with that accounting example, one thing I want to get across is that in most good AI products nowadays, the teams think more about systems than models. I showed you that Basis example where really it's a system: there's a control flow, there's a model, there's a bunch of tools, there might be multiple models. And it's the whole system in totality that creates the user experience. What I observe is that in most good LLM products, you often want to break the problem down into this more systems-level approach. So I'll give you a more specific example. Imagine I have a product that helps people understand certain political questions or discussions. For example, I might ask it: what are the best arguments for and against the claim that social media hurts democracy? Now, you could definitely give this question to a large language model and you'd probably get a pretty good answer. But let's say you really wanted to optimize for this type of question. What you might do instead is break the problem down. Maybe I start by splitting this up: I want to generate arguments for the claim, and I also want to generate arguments against it. Maybe I have two large language models that are prompted or fine-tuned specifically to be really creative and generate a lot of ideas, so I get some top answers for and some top answers against. Then maybe I have a different set of models that are trained to be critics. They're really negative; they like to poke holes in things and describe why they won't work. I pass those hypotheses to the critic models, who identify the top answer for and the top answer against. And then maybe I have a final LLM that I fine-tune separately to be more of a judge, analyzing fairness, thinking holistically about these two options, and ultimately giving you a final answer. What I can tell you for sure is that the second architecture almost certainly generates better results than the first. And this is how a lot of good agent teams and AI startups actually think about solving problems. It's much more of a systems problem.

Here's a quote that illustrates that. We might think of OpenAI as the company that's most all-in on models, but this is a quote from the chief product officer of OpenAI. What he's saying is: we actually use ensembles of models much more than people might think. We might have 10 different problems and solve them with 20 different model calls, all of which are different specialized fine-tunings. This is invisible to you as a user, but it goes back to my example of o1 Pro earlier: it's much more of a system under the hood, with many different models being called in complicated ways, even if to you as a user it feels like you queried a model once. So an interesting question then is: how do you design these systems?
There's an interesting graph here from a paper called "Large Language Monkeys," which basically shows that if you take a low-end model and just keep asking it the same question over and over, and then have some kind of voting or judging mechanism for aggregating across those answers, then if you ask that one bad model enough times, it will beat much better models. This is a very simple idea of a system: I just query one model a hundred or a thousand times and vote on the answer.
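That repeated-sampling-plus-voting idea fits in a few lines. Here, `sample_model` is a hypothetical stub for a cheap, noisy model that is right only 40% of the time; with enough samples, the majority vote still recovers the right answer.

```python
# Sketch of the "Large Language Monkeys" idea: sample one weak model many
# times and aggregate by majority vote. sample_model is a hypothetical stub.
import random
from collections import Counter

def sample_model(question: str) -> str:
    # A weak model: correct only 40% of the time, wrong answers split evenly.
    return "42" if random.random() < 0.4 else random.choice(["41", "43", "44"])

def answer_by_voting(question: str, n_samples: int = 1000) -> str:
    # Ask the same weak model n_samples times and keep the most common answer.
    votes = Counter(sample_model(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

random.seed(0)
print(answer_by_voting("What is 6 * 7?"))  # the correct answer dominates the vote
```

The single-call accuracy is 40%, but because the errors are spread across several wrong answers, the voted answer is correct with near certainty at a thousand samples. That gap is exactly the performance increase the paper's graph shows.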
But even with that simple system, you get huge performance increases. And there are a lot of other systems paradigms you might see in the space. You might query the model many times, like I just described. You might fan out answers and find the most common one. You might break the problem down into substeps and fan out, pick the best, fan out, pick the best. There are different ideas, and in some ways I think the design space is still underexplored, but in many ways I think this is the future of where a lot of AI systems will go. What's interesting is that there are starting to be frameworks that make it easier to explore this stuff, because at a certain point it's probably impossible for a human to come up with all the ways you might design an AI system. Take the graph on the right here. It's an illustrative system: maybe for a given query I break the problem down into three steps, and for each step I generate over a thousand answers, have some heuristic for picking the best one, and then go to the next step. A human is likely not going to come up with that, but frameworks can permute many different combinations of models and pick the best one for you. DSPy and Ember are two interesting examples of this.
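The generate, critique, judge decomposition from the political-arguments example earlier can be sketched as a pipeline of specialized model calls. Every function here is a hypothetical stub standing in for a separately prompted or fine-tuned model; the point is the shape of the system, not the stubs.

```python
# Hypothetical sketch of the generate -> critique -> judge system described
# earlier. Each function stands in for a distinct, specialized model.

def generate(stance: str, claim: str, n: int = 5) -> list[str]:
    # A creative model prompted to brainstorm n arguments for one stance.
    return [f"{stance} argument {i} re: {claim}" for i in range(n)]

def critique(arguments: list[str]) -> str:
    # A critic model that tears arguments apart and keeps the strongest.
    # Stand-in heuristic: pick the longest candidate.
    return max(arguments, key=len)

def judge(best_for: str, best_against: str) -> str:
    # A judge model that weighs both sides and writes the final answer.
    return f"For: {best_for}\nAgainst: {best_against}"

claim = "social media hurts democracy"
best_for = critique(generate("for", claim))        # fan out, then filter
best_against = critique(generate("against", claim))
print(judge(best_for, best_against))               # single final answer
```

Note that the user sees one answer; the fan-out, the critics, and the judge are all invisible, which is exactly the "ensembles under the hood" point from the OpenAI quote.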
I touched on context windows earlier and how sometimes they're a little misleading. I want to talk a little more about them, because another big question when you're building AI products is: does search or information retrieval matter in a world where I can just stuff all the data into the context window? I think what's very likely true is that even as context windows continue to increase in size, search and retrieval are not going away, and I'll give you some basic examples of why. From a quality perspective, the graph on the left compares a retrieval-augmented generation (RAG) system in green, which uses search plus AI, with a long-context model where all the data fits in the context window. What you can see is that even in cases where all the data fits in the context window, and in fact even in cases where you're only using about a fifth of the context window (the left side of this graph), the search-plus-AI system is still dramatically higher quality. There's also a cost consideration. On the right, comparing a RAG system to a long-context model for a certain set of tasks: if you were running the RAG system for a day, it might be $78; running the long-context model instead would be $1,500. And then if you look at latency, which is the other metric that matters a lot, let's assume you want to analyze a million tokens of context. You can either search over that million tokens or put it all in the model. Doing this with search takes about 600 milliseconds; doing it with a long-context model takes over a minute. So in all these cases, on quality, cost, and latency, you're still one if not two orders of magnitude better off using the more complicated system than just the context window. While for really simple use cases you can maybe get away with just context, it's very likely that search and information retrieval will remain a critical aspect of most sophisticated AI products.
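The cost gap follows from simple arithmetic: per query, a long-context approach pays for the whole corpus while RAG pays only for the retrieved chunks. The prices and sizes below are illustrative assumptions, not any vendor's actual rates.

```python
# Back-of-the-envelope arithmetic for why retrieval stays cheaper than
# context stuffing. All numbers here are illustrative assumptions.
PRICE_PER_1K_TOKENS = 0.01      # hypothetical input-token price

corpus_tokens = 1_000_000       # "stuff everything in the window" approach
retrieved_tokens = 2_000        # RAG approach: only the top-k relevant chunks

long_context_cost = corpus_tokens / 1000 * PRICE_PER_1K_TOKENS
rag_cost = retrieved_tokens / 1000 * PRICE_PER_1K_TOKENS

print(f"long context: ${long_context_cost:.2f}/query")  # $10.00
print(f"rag:          ${rag_cost:.2f}/query")           # $0.02, ~500x cheaper
```

The same proportionality drives latency: prefill time grows with the tokens the model must read, so reading 2,000 retrieved tokens instead of a million is what turns a minute into hundreds of milliseconds.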
I then want to touch on product and design. My argument would be that there is still a huge amount of room for AI companies to differentiate on product and design alone, even if they don't do anything different technically. I'll give you an example. Many of you have likely used AI note-taking products, and there are a lot of them, like Fireflies, Assembly, etc. If you had asked me two or three years ago, is there room for a new AI note-taking app, I would have said no way, absolutely not, I would never invest in that, because it felt like a commodity category to me, to be honest. And then Granola came out. I'm not an investor in Granola; I just love the product. They fundamentally rethought what it meant to be an AI note-taking app in a world of large language models, as opposed to just applying LLMs to the existing AI note-taking thesis. The result is that I, and almost everyone I talked to, immediately used the product and thought, wow, that's a hundred times, a thousand times better than what existed before, for what was really just a product and design innovation. And I don't mean to diminish Granola: it was very hard and very complicated to build such an elegant, beautiful product. If anything, what I'm saying is that I think there's a lot more room for design-oriented founders in the AI space, because the technology is already good enough to reinvent many of these categories. The question is just: can we rethink the assumptions in those categories?
Bridging off of that, I really think the UX design patterns for foundation-model-based products are still really early. On the left here, I have an image from Cursor. If you haven't used it: when you're using these codegen tools, you can pick from a dropdown that shows you 10-plus models. I can use o1-mini, I can use a mini 128k variant, I can use Claude 3.5 Sonnet, etc. In my mind, this is kind of crazy, because the product is asking the user to be an expert in evaluating 10-plus models that are changing all the time, every three weeks, for all these different use cases. That complexity should be solved by the company, not the user. In a lot of ways, this reminds me of older patterns in traditional technology waves: in the early days of mobile, you had to pick your preferred network type; in the early internet, you had to pick your preferred media codec for a video player. We see those patterns as crazy today, but I really think the same idea applies to model-picker UIs, and it emphasizes how early we are in the design patterns for building AI-based products.
Another really interesting thing to consider for people building AI-based products is how to balance building something for users today versus letting the models get better and solve your problems implicitly. I'll give you an example. Over the last couple of years, there were tons of products built around the idea of fine-tuning for image generation. Say you want to take a picture of yourself and give it a certain style or structure. The only way to do that a couple of years ago was to fine-tune a specialized model for that use case. So the way all these products worked is: you sign up, you upload a bunch of your own images, you wait for them to train a model just for you, and after 15 minutes you can use your model. Behind the scenes, that company had to build a lot of infrastructure around fine-tuning per customer, storing a model per customer, and so on. Then the recent image version of ChatGPT emerges, and guess what? You can do in-context learning natively without doing anything. No fine-tuning, no custom model, nothing. In the blink of an eye, all of the product form factor, all of the workflow, and most of the infrastructure these AI image generation products had built became completely obsolete. And this is the risk of being a founder right now: you can spend so much time solving a problem today, but it turns out all that work becomes technical and product debt in just a year or two. The really good founders in this space think endlessly about this question, and often there's no right answer, because you need to solve users' problems today; you can't wait forever. But how you balance this is one of the most interesting questions in building applied AI startups right now.
From there, I wanted to talk a little more about the tool side of things, which I touched on earlier for agents. Some of you may have heard of Model Context Protocol (MCP). It is really emerging as the open ecosystem standard for tool use. Just like HTTP emerged as the common way we access websites, there was a desire for a standard way to expose services to agents. The way it works is pretty simple. You have what are known as MCP clients; think of something like Claude or ChatGPT, which might want to access tools. And then you have MCP servers, which are basically services that expose themselves, or their APIs, as tools that an agent can use reliably. So in this example, I might be using Claude and have it attached to servers for Gmail, Figma, and Blender, and as I use Claude, the agent can determine when and how to use those tools. This was released late last year and is now supported pretty officially by basically every major player in the space. So while things might still change, it's likely this will be the de facto standard way that agents use tools. And the reason this matters: it's going to make it much easier to build agents in the future, because you won't have to build custom integrations into every single API you might want to use. Here's a really cool example of this. Someone hooked up Blender, a 3D modeling tool, to Claude. All they're doing is talking to Claude in the Anthropic web UI, and on the right, Claude is controlling and designing a 3D scene for them, even though that person knows nothing about 3D modeling or how to use Blender. In the end, they get a really good result. This is just one cool example of how powerful it is to connect tools and systems to these AI models.

Speaking of tools, one thing that's becoming increasingly clear as I talk to founders is that the interface for tool use matters a lot. This is from a research talk that came out of Replit. Consider a coding agent that can access a few basic tools: it can edit files, search files, view files, and manage its context a little. What you see on the bottom is that really subtle changes in the way a tool is defined massively impact the quality of the agent. For example, if the search tool both summarizes the search results and shows them, you get much higher quality than if all it does is show the results. Similarly, and this is a little non-intuitive: the best file viewer didn't show all the files, and it didn't show 30 files; it showed 100 files, a weird in-the-middle value. So it's becoming clear that optimizing tool use matters a lot for improving the quality of agents. And what this is actually leading to is that, in spite of MCP becoming a standard, I'm still talking to a lot of startups who say: MCP is a nice starting point, but for my agent I really need to build first-class integrations, optimized for my agent, for each of the tools. This is a quote from a Series A agent startup I work with, who were literally like: our agent was over 10x better once we stopped using standard MCP and started building deep custom integrations with the different tools. So I think there will continue to be this tension for a while: you can expose a naive MCP server, but it's not going to work as well as building a first-class integration, and it's going to be interesting to see how that evolves.
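To make the client/server split concrete, here is a schematic sketch of the two message shapes involved: what a server advertises when a client asks for its tools, and what a client sends to invoke one. The real protocol is JSON-RPC 2.0 over stdio or HTTP; this shows only the payload shape, with a hypothetical `send_email` tool, not the exact wire format.

```python
# Schematic sketch of MCP's tool listing and tool call payloads.
# The send_email tool and its fields are illustrative assumptions.
import json

# What a server might advertise in response to a tools/list request:
tools_list = {
    "tools": [{
        "name": "send_email",
        "description": "Send an email via Gmail",
        "inputSchema": {                 # JSON Schema describing the arguments
            "type": "object",
            "properties": {
                "to": {"type": "string"},
                "subject": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["to", "body"],
        },
    }]
}

# What a client (e.g. Claude) might send as a tools/call request:
tool_call = {
    "name": "send_email",
    "arguments": {"to": "a@b.com", "body": "hi"},
}

print(json.dumps(tools_list, indent=2))
```

The schema is what lets the agent decide on its own when and how to call the tool; the tension the startups describe is that a generic schema like this often carries less guidance than a hand-built, agent-specific integration.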
One other thing I wanted to touch on on the product side is the idea of personality. The paper on the right gives you an interesting hint at this. What these researchers from Stanford basically showed is: take a base model that has had none of the work we've done over the last couple of years to make it better at following instructions, answering questions, and reasoning; just take that really primitive base model, ask it to do creative tasks, and compare it to the frontier models with all their post-training and optimization. The base model is better. Why? Because we spend so much time trying to get these models to be good at answering questions, following orders, and doing what we say that they lose some of their creative freedom in the process. But there are a lot of use cases where you care much more about creative freedom than about someone just answering your question correctly. So there are inherent trade-offs in personality, just like we see in humans, and I think it's underexplored how you emphasize different types of personality in different use cases. In a lot of creative expression categories, like design, you probably care more about creativity and randomness than correct answers. In areas like education, you probably want a better balance, with the model being more of an authority and telling you when you're wrong, rather than being a sycophant and doing whatever you say. In areas like therapy, you may want a model that's not so focused on answering questions, but on asking them. I think this is a really underexplored area of product development in the space, and over time it wouldn't surprise me to see more variants of model personalities from the large providers.
Finally, I wanted to briefly touch on the fact that the infrastructure ecosystem for building foundation-model-based apps has matured massively over the last couple of years: everything from running inference to managing data to doing evals and observability to embeddings tools. A couple of years ago, you basically had nothing; you had to do it all from scratch. Now there's such a foundation that it's much, much easier to build really good products really quickly. And this is also part of what's accelerating startup growth in this category.
So that covers the product side. I want to finish by talking about market structure and market dynamics, and then I'll end with a little bit on what might come next. First, a remarkable statistic: in 2024, 10% of all venture dollars went to foundation-model-based companies. What you can see on the right is that in 2020, 2021, even 2022, it essentially rounded to zero, and now we're at 10% after just a couple of years. It wouldn't surprise me if in 2025 it's even higher. This gives you a sense of how much excitement, maybe even hubris and over-excitement, exists in the foundation model category. At the same time, I think this is partially justified: these foundation model vendors and startups are not only at billion-plus run rates, they are accelerating their revenue growth at that run rate. On the left is some revenue data from OpenAI showing that they'll probably end 2025 at about $12 to $15 billion in revenue. Anthropic more than doubled its revenue, from $1 billion to over $2 billion, in a single quarter. So there's a method to the madness, and there's no precedent for this degree of growth at this scale.
What's also interesting in the model category is that some of the model players are pushing really hard to be application companies, really consumer product companies, rather than API companies. Here you have a graph showing the relative percentage of revenue from chatbot subscriptions versus API revenue. OpenAI is nearly 70 to 80% chatbot subscriptions now, whereas Anthropic is still predominantly API revenue. I think this is a reflection of what I discussed much earlier in the presentation: the model layer continues to commoditize, and open source continues to be a serious threat. Even though the revenue growth of these players is crazy, if you stay just a model provider, you have very little revenue durability and stickiness; it's very easy to switch a model API. But if you build a consumer brand and a consumer subscription, that looks really different. And indeed, what you see is that all the leading model companies are pushing to be app-layer companies. For example, both Anthropic and OpenAI have hired former consumer product people to lead product; in fact, both of their product leads used to work at Instagram. You can see this in the acquisitions these players are making: OpenAI recently announced it was acquiring the AI coding company Windsurf, because coding is a really good application category in AI. There are even rumors that some of these players are trying to develop a novel social media platform built on AI, and there was the recent news that OpenAI acquired a consumer hardware device company. So it's very likely this trend continues, and that the only way to succeed at large scale in the model layer is to bundle or vertically integrate with the application layer.
Another reason for this is that the big cloud vendors are finally catching up. Two years ago, Google was a disaster, in spite of the fact that transformers had actually come out of Google; but they increasingly seem unstoppable. This is a graph of all the models out there, with quality on the y-axis and cost on the x-axis. This cost-quality curve is really the key thing that matters in large language models. And what you can see is that the entire Pareto frontier is now owned by Google: for any trade-off between cost and quality, Google has the best model. Now, things change quickly, so this could change, but it gives you a sense of the concern I would have if I were running a big model company that wasn't Google. Google can offer this as a loss leader, Google has crazy economies of scale, and Google can treat this as an on-ramp to the rest of their cloud infrastructure. I think this is also part of why you see the model companies pushing so hard into the application layer.
Another really interesting question in my mind: there's starting to be all this investment into foundation model companies that are not purely digital (image, text, video). For example, I have a bunch of robotics data here, and these companies are raising a lot of money at valuations similar to the LLM and image companies. The question is: will they be able to defy gravity in their revenue growth like we saw in images and text, or will they ultimately be limited by the fact that these physical domains have operational and hardware concerns that are very difficult to get around? I think this is really the billion-dollar question for the people investing hundreds of millions of dollars in these physical foundation-model-based startups.
If you look at the application layer, you see a similar dichotomy: yes, the valuations are really, really high, but there's also unprecedented revenue growth. On the right I have a graph from Redpoint that basically shows that in 2023 to 2024, the average revenue multiple of AI startups was dramatically higher than that of non-AI startups, but their average growth rate was also much higher. And on the left I have a number of examples of companies with unprecedented revenue growth in the application category; Bolt, for example, went from $0 to $20 million in just 60 days. So there remains a method to the madness, but the question in a lot of these cases is: how durable is the revenue, how defensible is it, how sticky is it? Another view on this: if you look at just the most well-known AI-native applications with real revenue figures published in the public domain, AI-native applications are now at over a billion-dollar-plus run rate. And this is a vastly conservative estimate; there are many startups I know with very high revenue that are not on this list. So the revenue is real here, make no mistake about it.
The other interesting thing is that, especially a year or two ago, there was a big question of whether large language models and AI are more of a sustaining innovation that helps incumbents. In theory, it's not that hard to add AI model calls to existing products, so maybe this means that big companies like Salesforce, Adobe, etc. will just win. What advantage does the startup have? I think what's now empirically true is that, in fact, the incumbents do not have the advantage in this category. And I think that's mostly because building really good AI products, like what we saw with the Granola example, is so much more than just slapping a model call on; you really have to reinvent the workflow. Even in categories where the big company has had a ton of resources and gone all-in on AI, for example GitHub Copilot in coding and Adobe Firefly in creative expression, you have seen the startups absolutely dominate, and I would expect this to continue to be true. I really think it's a startup's game for the most part in the AI world right now.
And touching on this question of revenue growth, I think there's a huge risk of the novelty effect in AI startups right now. This is a graph of the revenue curve of a consumer startup, Lensa, one of those early AI image style-transfer startups. What you saw is that they very quickly got to a huge amount of revenue and then just as quickly dropped off. We see a lot of startups with this effect, mostly because a lot of people are curious about AI and want to try it, but that doesn't mean they're going to stick around. So as an investor, or as an employee, you've got to be really careful about what's defensible and will be maintained over time versus what's not.
Overall, if I were to summarize: what's happening here is real, but the market also feels really bubbly. This is just one example that I found kind of humorous. There was a French foundation model startup called H, and they raised a $220 million seed basically just on an idea. Within three months of founding the company, three of the key co-founders left. So there's just a lot of interesting behavior in the market right now that makes it simultaneously super exciting and super bubbly, and really hard to navigate as a result. And I think at the end of the day, the people who win for sure are the GPU ecosystem and companies like Nvidia, because no matter what happens, we're going to keep needing exponential growth in compute hardware.
So I want to finish with just a little bit of discussion on what might come next, playing things out a couple of years, right? Um, the first is that it's very clear that operating as an AI-native company is going to look fundamentally different. So here I have an extract from a recent memo that the CEO of Shopify sent to everyone at his company. And basically what it said is that using AI is a fundamental expectation, and if you don't use it, you know, you're going to get fired. You have no way that you can contribute here if you don't know how to use AI tools.
And um I really think that companies that don't do this are going to fall massively behind because when you infuse large language models and these research tools into everything you do as a
company, the degree to which you can be more efficient is really unbelievable.
Um, and you can see this in startups today, right? There's a number of these startups that are 10 people, 20 people, 30 people hitting $50-plus million ARR, who are not only building AI products but infusing AI into the way that they work, right? And so Gamma is an example of a company that's just a couple of years old. They've raised less than $15 million, and they haven't used most of it. They're only 30 people, and yet they're over $50 million ARR, right? And so this is what incumbents are competing with. And so if you as a company don't learn how to use AI quickly, you're probably going to get screwed. Um,
related to this, it's really interesting to see how much the composition of teams is changing. So I have two representative quotes here, right? I was talking to a VP of product at a kind of pre-IPO startup recently, and what he was saying is, "I really don't see a difference between designers and product managers in our company anymore," right?
Because AI has made a lot of the functional design skills less useful.
And so really it's just a question of who has taste and who understands the customer, right? Um, similarly, I was talking to the CMO of a public tech company recently, and she was saying that AI has completely changed how she thinks about hiring. If you think about the traditional marketing org, you probably have someone who knows how to edit videos in a tool like Premiere. You probably have someone who knows how to make motion graphics in a tool like After Effects, etc., etc. Well, AI kind of obviates the need for a lot of these functional skills, because these generative models can do a lot of it for you. And so the result is you probably just want to hire generalists now who know how to use AI. And so she had completely changed how she thought about hiring and told her team, you can't hire specialists anymore. And so this again gives you a sense of how dramatic the shift is going to be in companies if they really leverage and make use of modern AI tools.
Related to this, I think that learning to manage fleets of AI workers is going to be a new skill that's really not that different from managing people. I was talking to the CTO of one of the top codegen startups recently, and he was saying, "I haven't written a new line of code myself in three months. I just manage agents." Now, he's maybe an extreme example because he's trying to dogfood his own product, but I really see this starting to take shape, where a lot of knowledge workers are thinking more about how do I use AI rather than how do I do the work myself. And I think it's interesting to start to see this design pattern emerging of the agent inbox. I think a lot of the future is going to be in something like a project management tool or an email tool, where you're actually interacting maybe more with AI agents that are communicating with you than you are with your co-workers or with other humans.
Um, another interesting shift that's happening is that, as a result of the rise of agents and AI, a lot of products are actually going to start to be designed more for AI as the consumer rather than the human. So, I'll give you two examples of this. There's a thing called a cursor rules file, which is basically a way of defining certain instructions that a coding agent can understand when it's answering questions. What you're starting to see is that a lot of the biggest developer tools companies, like Cloudflare in this example, are actually spending just as much time writing cursor rules files for their APIs as they are writing traditional documentation that a human might read, because if AIs are more responsible for coding, it actually matters more to make sure your APIs are accessible to the agents rather than to the humans. Right?
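To make this concrete: a cursor rules file is just a plain-text or markdown file of instructions that the coding agent reads before it generates code. The sketch below is purely illustrative, for an imaginary REST API, and is not Cloudflare's actual rules file:

```markdown
# .cursorrules — hypothetical example for an imaginary REST API

- Authenticate every request with an `Authorization: Bearer <token>` header.
- Use the v2 endpoints (`/api/v2/...`); treat the v1 endpoints as deprecated.
- Paginate list responses with the `cursor` query parameter, never `page`.
- On an HTTP 429 response, back off exponentially before retrying.
```

The point is that these instructions are written for the agent to consume, in the same way that traditional API reference docs are written for a human developer.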
Um, similarly, on the right I have a quote from a database company called Neon, where the CEO is basically saying that now over 80% of their database instances are created by agents, not humans, right? And so I really think there's going to be this huge shift in the world, especially at the lower levels of infrastructure, where everything you're doing as a product is serving AI, not serving humans. And so where will value be destroyed, right? It's hard to perfectly answer this question, but here are a couple of rough lines of thinking. First, I think there's a huge shift from outsourced labor to in-house labor, right? If you think about a lot of corporations, traditionally I might pay external consultants or services firms to do, you know, various functional things that maybe I don't have the expertise to do myself, and so I need to pay someone else. Obviously, working with third-party service contractors is not a good experience in many cases. And as AI democratizes access to a lot of functional skills, those value chains are going to shift, and it's going to be more about software empowering teams than services doing human labor for them. Um, second,
I touched on this a little bit with the CMO quote, but I generally think there's going to be a dramatic shift from specialists to generalists. And so tools and companies and services that were very oriented towards serving specialists may be hit, and I think there's going to be a lot of new opportunities to build new types of tools that serve these more generalist user personas that are going to emerge with AI. Third is that I think a lot of middle management is going to get eroded. You know, so many of these jobs are just about mediating communication, taking information from here and moving it over there. Honestly, a lot of the traditional corporate America jobs are frankly irrelevant in a world where LLMs can automate almost all forms of communication and knowledge transfer. And so I think we're going to move to a world of much flatter organizations, with people who are all doers, doing the work with AI. And so tools and services and companies that are reliant on this super-layered middle management hierarchy, I think, will become a lot less relevant. One example of this is that I think tools for project managers will not really be useful anymore, because project managers as a job will likely go extinct. Right?
I touched on this a little bit, but I think incumbents in the categories that are directly in the line of fire of AI, like companies in the creative tools space or companies in the CRM space, are in real trouble. I think if they don't make large acquisitions in these spaces, it'll probably be impossible for them to adapt. And then finally, I think the companies that don't go through that organizational pain like we saw with Shopify, of kind of adapt or die, will end up dying anyway, right? Because they will be outcompeted by much leaner companies that can move at a much faster rate. And so I'll close with kind of an interesting question, right? Is AGI close? Um, one of the funniest observations for me is that it's the smartest AI researchers in the world who actually think it is. And so there's this classic Dunning-Kruger curve, where if you talk to people who know nothing about AI, you know, they're like, "Oh my god, AGI is about to be here." And you talk to the person who knows a little bit, and it's like, "Uh, these are just statistical models. You know, how smart can they be? Like, they're useful, but they're not going to change the world." But then you talk to the people who've actually trained these systems, built these systems, and seen how fast they're improving, and they also think AGI is here in three years. And so I don't know if I know the answer, but I find this kind of observation really interesting.
And so with that, hopefully you've gotten an overview of basically everything that's happening in the AI space today. I really appreciate you taking the time. Um, if you have more questions or you want to follow up, please email me. My email is here. And if you want to see the full presentation with a lot more detail, that's just in the link where you would have seen this. And so, thank you so much.