
2025 State of Foundation Models

By Innovation Endeavors

Summary

Topics Covered

  • Self-Supervised Learning Scales Data Exponentially
  • Frontier Models Depreciate in Weeks
  • Inference-Time Scaling Unlocks New Frontier
  • Build AI Systems, Not Single Models
  • AI Reshapes Organizations to Generalists

Full Transcript

Hi everyone, my name is Davis Treybig. I work at Innovation Endeavors; I'm an investor on the team here. We're an early-stage venture capital firm that primarily invests in very technical founding teams solving hard problems in data science and engineering. Today

we're excited to give you an overview of the state of foundation models in 2025. We'll aim to give you a holistic overview of everything that's happening and how we got here. In terms of what we'll cover today, we'll start with a brief history of what took us here over the last five years or so. We'll then talk a lot about the model layer and what's happening there. From there, we'll move into the application layer, where we'll talk about key use cases of foundation models, as well as tips, tricks, and observations on what it looks like to build foundation-model-based products. From there, we'll move into market structure, market dynamics, and some of the economics of the foundation model category right now, and we'll finish with some observations on what we might expect moving forward. And so, with that said, let's get into it.

So, as mentioned, let's start by setting the stage. My goal here is to give you a quick overview of what happened over the last five years to get to the point where we are today. There were really two key technical insights that ushered in this current technology wave. The first was a data insight: the technique of self-supervised learning, a way to scale data in machine learning.

And so the key idea is actually quite simple. You look at a bunch of latent data that exists, for example, on the web; in this case, sentences. And all you do is split up that data in different ways. So here I have an example of a sentence that perhaps you split in half. What you can see is that when you do this, you create an implicit input and labeled-output pair. And so then, if your task for a model is "given an input, predict the output," you've created an implicit piece of labeled data without requiring any degree of human annotation, human labor, or the other things that were traditional bottlenecks in scaling machine learning data. And so with this technique, it was suddenly very possible to create massive amounts of implicitly labeled data. The second key technique, or observation, was an architectural one: the attention architecture. If you've heard the word transformer, most often it's referring to a model that uses this attention architecture. And while I won't go into all the detail, the key insight here was one of how to scale compute:

specifically, the attention architecture is highly parallelizable, and this made it much more efficient to scale up compute to very large degrees, especially on top of GPUs, without requiring a huge amount of time or cost.
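The self-supervised splitting idea described above can be sketched in a few lines. This is a toy illustration of turning raw sentences into (input, target) pairs, not any lab's actual data pipeline:

```python
# Toy sketch of self-supervised data creation: split raw sentences so
# that each prefix becomes an input and the remainder becomes the
# implicit label -- no human annotation required.

def make_pairs(sentence: str) -> list[tuple[str, str]]:
    """Split a sentence at every word boundary into (input, target) pairs."""
    words = sentence.split()
    pairs = []
    for i in range(1, len(words)):
        prefix = " ".join(words[:i])   # what the model sees
        target = " ".join(words[i:])   # what it must learn to predict
        pairs.append((prefix, target))
    return pairs

corpus = ["the cat sat on the mat"]
pairs = [p for s in corpus for p in make_pairs(s)]
# e.g. one free training example: ("the cat sat", "on the mat")
```

Every sentence on the web yields many such pairs for free, which is why this technique removed the data bottleneck.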

And so, as you can see, with these two techniques we had a way to scale data and a way to scale compute, and so we started scaling models. What researchers started to observe is that as you scaled models to larger and larger degrees, you started to see emergent behavior. So here I have a couple of graphs of different behavioral traits you might ask of a model. On the left, for example, we have how well a model performs certain arithmetic tasks. In the middle, we have a graph of how a model might perform certain natural-language understanding tasks. The x-axis here is how much compute, i.e., how large the model is, and the y-axis is the accuracy on that task. And what you can see is that for a lot of these tasks, for a long time, performance does not improve at all. It's basically at zero. And then, at a certain scale, the performance jumps up. And so this was really surprising, right? The model seems to not learn for a long time, and then all of a sudden it can do something that it couldn't do before. And so researchers were surprised by this, intrigued by it, and started pushing scaling further. And

so from that first Transformers paper, there was an insane increase in model scale over the next five to six years. In fact, this graph visualizes that: in three years, there was a 15,000x increase in the scale, or size, of frontier models. If you compare that to Moore's law, the green line here, which was doubling only every two years but propelled the semiconductor industry forward for the last half-century, you can get a sense of why this started to be exciting. And

what this led us to, as we continued scaling, was essentially the fastest rate of adoption of a new technology of all time. ChatGPT is now at almost 500 million weekly active users. Almost a billion people use AI on a monthly basis nowadays. And on the right here, I have a fun graph. It shows how long it took different technologies to reach 100 million users. For example, electricity took 46 years, television took 26 years, the internet took seven, and it took ChatGPT only 60 days. And what's interesting

is that we have now seen not only technology adoption increase in this space, but revenue as well. So I have a couple of illustrative examples here. GitHub Copilot reached $400 million ARR only three years after launch. Now, maybe that's a slightly unfair example, because GitHub obviously had a lot of existing distribution. But then you look at some of these newer startups, like Midjourney and Cursor, which hit between $100 and $200 million ARR in just a year or two, and critically, with somewhere between 20 and 40 employees each. This is unprecedented in the history of technology and is a lot of where the excitement around AI is coming from. What we also see is that if you look at the technical metrics that matter in this space, they're also all following exponential curves. So, a

couple of examples here. The context window of frontier models, which is essentially how much data they can reason about when making an inference, has gone up somewhere between 100x and 500x over the last year and a half, depending on how you measure it. The cost per token for a GPT-4-level model: if you fix quality over the last year and a half, the cost for that given quality has dropped by over 1,000x. And then if you look at how much compute is used to train frontier models, which essentially correlates with their size or the amount of compute put into making them what they are, that has also gone up 1,000x over the last year and a half. And these are just a couple of examples. But what's interesting about this space is that basically every metric follows the same kind of super-exponential curve.

This is a fun variation on what that means, right? So there's a bunch of ways you can benchmark foundation models; you can think of these as like quizzes or exams for getting a model to try to do something. So, for example, you might come up with a science reasoning exam, a grade-school math exam, or a general reasoning exam. I've plotted a lot of these in the graph here. And what you can see is that the LLM rate of improvement is so significant that it essentially beats all benchmarks almost as soon as we can come up with them. They all get saturated, right? And so one of the most interesting things about the space is that we can hardly even come up with ways to effectively measure large language models because of how quickly they're improving. And indeed, they can now score almost perfectly on even professional-level exams in areas like mathematics, science, and philosophy. Another interesting viewpoint on how dramatic the rate of improvement is, is to look at how long of a task, measured in human time, an LLM

can reliably achieve. So this is a graph showing, on the x-axis, the model release date, and on the y-axis, the duration of tasks that a model can reliably do with at least a 50% success rate. And what you can see is that in 2019 and 2020, models could often only do tasks that might take a human somewhere between a couple of seconds and maybe ten seconds. This is doubling every seven months, and indeed LLMs can now reliably do tasks that might take a human an hour or multiple hours. If you play this out, it's likely that LLMs will be able to reliably automate tasks that take days or even months over the next couple of years, which is really unbelievable, right? And here are a couple of examples of what that reasoning-capability improvement looks like in practice. On the left here, I have a graph from a paper called "Towards Conversational Diagnostic AI," where they were comparing fine-tuned large language models versus doctors on various diagnostic tasks. Right? A patient comes in presenting with certain conditions. What should you do next?

What you actually see is that LLMs now outperform doctors on many diagnostic tasks. On the right here, I have a more math-oriented example. LLMs can now solve geometry problems better than almost all humans on earth, including the best mathematicians. This is from a paper called AlphaGeometry. And so these are just fun examples, but what's very clear is that as we push the scaling further, LLMs are now effectively becoming the best in the world at almost all mainstream subject areas where we've traditionally considered humans especially strong. And while most of what I've talked about thus far has been oriented towards language, all of these same laws and scaling curves apply to other modalities. And so here's just

one example from the image diffusion space. A couple of years ago, on the left, the Imagen model from Google DeepMind was the best in the world. It was the frontier. As you can see, it kind of looks like a high-school kid's drawing or something like that. On the right, we have a more recent example from a startup called Visual Electric, and what you can see is that it's essentially indistinguishable from high-end photography. And so all these same laws and scaling curves are applying to other modalities, which I'll get into more in a second. And so the main thing I want you to take away from all of that is that we figured out how to scale models. We've pushed that to the max over the last five years, and we're starting to see not only real revenue and real companies being built on top of that, but, more critically, capabilities that so far exceed what I think even the smartest researchers thought possible. And so, with that, that takes us to where we are today. And the rest of the presentation is going to talk more about what's happening right now and what we might expect over the next couple of years. Let's start with the model layer, because

that's really the most critical. So first, let's talk about cost for a second. The training costs for frontier foundation models are unbelievable. A leading model conservatively now costs over $300 million to train, and I'm not including any labor or data costs associated with that. Here you can see a graph showing how these costs have increased over time. GPT-3, the model that eventually led to the ChatGPT moment, was trained in 2020 and cost only about $5 million to train. Since then, the cost of these models has gone up in a very consistent exponential fashion, from $10 million to $100 million to $200 million, and now over $300 million for frontier models. What's interesting, though, is that in some sense these are the fastest-depreciating assets of all time. Frontier models are typically depreciating, or becoming commodity, on just a 6-to-12-month timescale. So here

I have a specific example. GPT-4 was released in March 2023; it cost about $100 million to train and was, of course, a closed-source model from OpenAI. DeepSeek-VL, a model of very similar quality, was released almost exactly a year later, was open source, and cost less than $10 million to train. The graph on the right here compares DeepSeek versus GPT-4 on a number of mainstream benchmarks, and what you can see is that indeed it performs almost identically.

And this is really the story of the foundation model space, right? A leading lab will now spend $300, $400, $500 million to train a new model, but within a year there's an open-source version that's just as good. And so that creates very interesting market dynamics in the model space. Playing off of that, what we see is that open source continues to converge with closed source in this space. What you see here is a graph comparing performance on a bunch of benchmarks for closed-source models, the light blue on top, versus open-source models, the dark blue on the bottom. What you can see is that earlier on, in 2023 and thereabouts, there was some degree of divergence between these models. But as time has gone on, the time lag between the best closed-source models and the best open-source models has gotten tighter and tighter. And so in some ways this is a different lens on the point I expressed on the last slide, and it gives you a sense of why some of the model providers are trying so hard to figure out how to go beyond being just a model company and become more of an application-layer company.

Another interesting metric: this is a graph of how long top models stay in the top model set. There's a model proxy called OpenRouter that a lot of developers use to make calls to different model providers. And if you look at the data from OpenRouter, it's really interesting. This is a histogram of, for a given new model that enters OpenRouter's top five, how long it stays in the top five. What you can see here on the right is that there are a couple of models that last 30-plus weeks or 20-plus weeks, but the median time a model stays in the top five is just three weeks. So think about that: you spend $300 to $400 million to develop a new model, and after a couple of weeks there's something better and people have moved on. There's really no precedent for this in the history of technology.

So far, I've only talked about the cost of compute to train models. But what's also important to consider is the data required to train them, and data budgets are also insane. I'll give you a couple of examples. DeepMind spends over a billion dollars a year on data annotation and labeling. OpenAI, jointly for training and data, spends about $3 billion in total, a huge portion of which is data. For Llama 3 in particular, Meta spent over $125 million just on post-training data. Again, not compute, just data. And on a more micro level, if you are a professional in an area like law or healthcare, if you're a doctor, OpenAI will actually pay you somewhere between $2,000 and $3,000 for a single reasoning trace, which is really crazy. And so if you combine this all up, I have on the right here a graph that gives you a roughly illustrative spend for a frontier model. You might be spending somewhere between $150 and $300 million for training the base model, probably somewhere between $50 and maybe $150 million for post-training, and then the data itself is an additional, you know, $50, $100, $150 million. All in, you're very quickly hitting $500 million plus for frontier models.

On a different lens, something else that's interesting to explore right now is that there's a bit of a shift away from the idea of simply scaling parameter count to the max. This is a graph of the number of parameters of frontier models. What you can see is that, playing off that graph I showed you earlier, the number of parameters went up at a super-exponential rate, but more recently has come down a little bit. The reason for this is actually kind of interesting. I think it reflects a move from this being more of a research-denominated space to more of an application-denominated space. Large models are more training-efficient: you can use less money to make them more powerful. But they're much harder to serve to users, because they take longer and cost a lot more money to compute anything. And so more recently, there's been a lot more effort to over-train models: to use fewer parameters, but train for a much longer amount of time. This makes training less efficient, but it makes serving a lot more efficient. And so it's likely this trend will continue: you want small-to-medium-sized models that have huge amounts of data distilled into them.

So this plays into a broader trend, perhaps the most important trend in the AI research world right now, and that's the end of pre-training as we know it. That initial curve I showed you, of just scaling the base model as far as you can, is kind of coming to an end. And the reason for this is mostly a data one. We had this technique of scouring all the data on the web, turning it into implicitly labeled data, and then training on that. The problem is that there's only one internet, and we're kind of running out of data. And so here you have a slide from a presentation by Ilya Sutskever, one of the most famous AI researchers in the world, who is essentially saying exactly this: we keep growing compute, but we're running out of data. And so the key question in large language models right now is: what comes next? How do we keep scaling beyond just pre-training? There are a couple of ideas for what you might be able to do. One path, which a lot of labs are taking and which is certainly important, is using more synthetic data rather than real data. This is important, but it's not the only thing. Second, we can build more complex systems: maybe the models don't get better, but we combine different models in different ways to build powerful products. That's definitely an important direction, but it doesn't really touch the research side. And the last, which is what people are most excited about right now, is inference-time scaling, or so-called reasoning models. So let's spend a little bit of time talking about those. A lot of researchers think that inference-time compute, or reasoning models, are really the new frontier. The idea of these is actually really simple.

So here's a basic visual. Imagine you're asking a model a complicated question; for example, what's the implication of the new Canadian prime minister for foreign exchange rates? What you do is train the model to not answer right away, but to instead think for a long period of time. I've visualized this here on the right, where the model develops an internal monologue: it talks to itself, or thinks to itself, for a very long time before it answers. And so in this example, it may think for five minutes, coming up with a plan, identifying different things it needs to consider, and then, after it synthesizes all of that, it gives you the answer.

And so to you as a user, it looks like it wrote a very short answer, but it actually output thousands, if not tens of thousands, of tokens. What's interesting is that this approach of thinking before answering seems to be a new type of scaling law. On the left here, I have a really famous graph from OpenAI that conveys a lot of what I discussed earlier in this presentation: as you train models for a longer period of time with more data (that's the x-axis), the quality of those models goes up in a reliable fashion. That is the story of the last five to six years. On the right, I have a different graph. It shows, on the x-axis, that the longer you think, the better the quality, in a reliably scalable fashion. And so this excites researchers a lot, because it means there's now a second exponential curve we can start to play off: this idea that if you think longer, you get better answers. And so here's an example of this.

Small reasoning models can outperform massively larger models if they have enough time to think. If you look at the top here, look at the orange line. This is a small 3-billion-parameter model, but it's trained to be a reasoning model. The dotted line on top is a 70-billion-parameter model that is not trained to be a reasoning model. What you can see is that when the orange model doesn't have enough time to think, it's way worse than the 70-billion-parameter model. But if it thinks for long enough, it actually surpasses the 70-billion-parameter model, in spite of the fact that that model is 20 to 30 times larger. And so this gives you maybe a more intuitive sense of why this is so exciting. So how do you develop reasoning models? There are really two key ways. The first is that you generate a lot of this form of reasoning data: you find a lot of examples of people outlining their thought process in complex domains. You might pay for this data, you might synthetically generate it in areas like mathematics, or you can train a special type of verifier, or reward model, that can guide a complex reasoning trace, and you use

that as training data to train the models. The second approach is more of a systems approach, where you use a search technique at inference time. In this case, when the model generates an answer, it generates part of an output; you have a second model that says "go more in this direction" or "think more in this direction," and they go in a loop. The output of the first approach is that you basically have a model that's thinking to itself for a really long period of time in a large trace of tokens; that's the example I showed you earlier. In the second, you can think of it more as a secondary control system at inference time that mediates the back-and-forth between a model and a verifier and lets it think for a long period of time. Both of these can work and both are interesting, though I would say most models probably look like the former right now. So let

me give you an example of this. Some of you may have used o1 pro. What's actually happening there is that you take a base reasoning model trained with that first method I talked about, then you sample four different generations from that base reasoning model, and then you have a different model, a verification model, that says, "I think this is the best of those sampled outputs." To you as a user, this is all hidden, but you end up with a really, really good answer that's much better than what you might get from a base model. And so, as you can see, bridging off of this, I've touched in the last couple of examples on this idea of a verifier model, or reward model. This is a specialized type of large language model that is taught to verify things.

There are actually two types of verifiers. First, you can think of procedural verifiers. These aren't large pre-trained models, but domain-specific ways of verifying a problem. For example, in code, you can compile the code; in math, you can use theorem provers that show you whether or not a proof is valid. And then, on the right, you have learned verifiers, where you basically take a base large language model and train it to do this form of verification or reward. The procedural verifiers tend to be more accurate in their domain, but they're definitionally domain-specific. The learned verifiers are definitionally more general, but typically less accurate in a given domain. And so a lot of the exploration right now in the ML world is how to combine these two approaches, or whether you can build very high-quality generalist verifiers. Either way, this idea of verification and reward is becoming absolutely essential to developing these forms of reasoning models.

So let's talk for a second about context windows. A lot of people are excited by the idea that these recent models have huge context windows, in theory. There are a number of models out there that claim to have 1-million, even 10-million-token context windows. As a recap, this is how much data the model can consider when it makes an inference. And so yes, it's great that context windows are growing dramatically, but there's a bit of a gotcha in this space. Here I have an example from the Llama 4 model, which claims to have a 10-million-token context window. But if you actually look at the paper for Llama 4, what they say is that they mostly only trained with context at 256k tokens, and then they show a very, very simple eval of only retrieving a single piece of data from the context. The problem is that in reality, for most real-world problems and questions, you need to reason over many pieces of data in the context in a complicated way. And so not only is this eval not very reflective of real-world use cases, but of course, if you never actually had training data that makes use of a 10-million-token context window, you have no idea if the model can actually make use of it. And so there's this interesting dichotomy in the space: context windows are growing, and that is great, but good founders, good people building products in the space, can't actually use them the way you might think.

Another interesting aspect of the model category right now is tokenization. To give some context: when a model takes the input language you give it, it has to figure out how to turn that language into a set of discrete tokens. On the right here, I have some examples. Let's say you give the word "egg" to a model; you've got to split that up in some way. Some models will split it so that "egg" is one token and the period is another; other models will say "e" is a token, then "gg" is a token, then the period is a token. There are a lot of different ways to do it. The problem is that when you tokenize words in this way and then ask the model things that are directly related to the structure of the language, the models get really confused about how to answer the question.
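To make that concrete, here's a toy greedy tokenizer with a tiny, made-up vocabulary (real BPE vocabularies are learned and much larger, but the failure mode is the same): once a number is split into pieces, nothing tells the model it was a single value.

```python
# Toy greedy longest-match tokenizer over a tiny, made-up vocabulary.
VOCAB = ["egg", "e", "g", "gg", "3", "1", "11", ".", " "]

def tokenize(text: str) -> list[str]:
    tokens, i = [], 0
    while i < len(text):
        # Greedily take the longest vocab entry matching at position i;
        # fall back to the raw character if nothing matches.
        match = max((v for v in VOCAB if text.startswith(v, i)),
                    key=len, default=text[i])
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("egg."))   # ['egg', '.']
print(tokenize("3.11"))   # ['3', '.', '11'] -- the number is fractured
```

From the model's perspective, "3.11" arrives as three unrelated symbols, which is why comparisons like "is 3.11 bigger than 3.9?" go wrong so often.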

So many of you have likely seen that while these models are brilliant, in many cases they get basic arithmetic questions wrong. That's because if you feed a model something like 3.11 or 2.978, it'll split that up in a way such that the model doesn't understand it's a single numeric value. And so I have a funny post from Andrej Karpathy, another famous scientist in the space, on the left, who basically says tokenization is at the heart of all weirdness in LLMs. Why can't they spell words? Tokenization. Why can't they do arithmetic? Tokenization. What is the real root of all suffering? Tokenization. So there's a lot of desire to figure out how to fix tokenization in models, to help them improve on these kinds of language manipulation tasks, but I think it's still not well understood exactly how you might do that. Another really

Another really interesting research direction right now is the world of mechanistic interpretability. A big question in foundation models is: you have these huge models, you can ask them any question you can think of, so how are they coming up with the answers that they give you? And can we understand the thought process, maybe to help better understand when the models are hallucinating versus when they're giving you a well-thought-through answer? Anthropic and a number of other model labs have started to push this idea of mechanistic interpretability a lot, which is: can we analyze how a neural network activates to understand what its thought process is when it gives us an answer?

Here's a really cool example that Anthropic recently gave. They identified a way to extract the neurons in the neural network that are being activated, and to identify which combinations of neurons, in which order, represent a given concept. Here is a visual of one example of this. They identified a set of neurons that they thought pertained to the Golden Gate Bridge in San Francisco. And what they showed is that any input given to the model that included or referenced the Golden Gate Bridge, or even related concepts like San Francisco, in some way activated this set of neurons. Taking it a step further, they then started to ask questions that had nothing to do with the Golden Gate Bridge, but would artificially amp up the neurons they thought related to it. Interestingly, when you do that, the model will answer the question while always answering in some way related to the Golden Gate Bridge. You give it a math question, and it'll give you an example related to the Golden Gate Bridge to answer it.

This is broadly starting to be described as the idea of model steering. And while the Golden Gate Bridge examples may be fun or silly, you might imagine that, say, you have a code-focused model and you understand the set of neurons that relate to logical thinking, arithmetic thinking, things like that. You might want to bump those neurons up when you query the model with questions related to coding. This is still early research, but I think it's a really exciting area.
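A heavily simplified sketch of the steering mechanic described above. The real work operates on learned features inside a large transformer; this 4-dimensional toy, with invented numbers, only shows the add-a-direction idea:

```python
# Toy "hidden state": 4 numbers. Pretend the direction [0, 0, 1, 1] is a
# "Golden Gate Bridge" feature found by interpretability tooling.
FEATURE = [0.0, 0.0, 1.0, 1.0]

def steer(hidden: list[float], strength: float) -> list[float]:
    """Add the feature direction to the activation, scaled by `strength`."""
    return [h + strength * f for h, f in zip(hidden, FEATURE)]

def feature_score(hidden: list[float]) -> float:
    """Dot product: how strongly the feature direction is activated."""
    return sum(h * f for h, f in zip(hidden, FEATURE))

h = [0.2, -0.1, 0.05, 0.0]          # activation for an unrelated prompt
print(feature_score(h))              # near zero: feature mostly off
print(feature_score(steer(h, 5.0)))  # large: feature artificially amped up
```

Amping the feature up before the model continues generating is what nudges every answer toward the concept that feature represents.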

So beyond language, the other big direction that matters in the space right now is multimodality. Over the last couple of years, vision language models, or VLMs, have really gained steam. If you use any of the products like Claude or ChatGPT now, you can input an arbitrary combination of text, image, video, and audio, and it'll synthesize it all together and give you an output. The way that works is actually pretty simple: you embed all of these modalities in a single latent space, and then you feed that to a pretty standard language model to produce language output. What's earlier is the idea of an omnimodal model. The difference here is: imagine a model that can not only take any combination of these inputs, but output any combination of those outputs in an interleaved, structured fashion. On the right here I have an extract from a paper from Meta that was basically showing a question being asked about some birds. What you can see is that the answer interleaves pictures of birds, answers about birds, more pictures of birds, and so on. Building models like this is still not well understood and is a bit tricky. It's really hard to get data that reflects this well. But this, I think, in a lot of respects is what you might expect to happen over the next couple of years in omnimodality.
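The "embed everything into one latent space" recipe can be sketched with stand-in encoders. In a real VLM the encoders are learned networks (e.g. a vision transformer for images); everything below is invented for illustration:

```python
# Stand-in encoders: each maps its modality into the same 3-dim space.
# In a real VLM these are learned networks producing token sequences.
def embed_text(text: str) -> list[list[float]]:
    return [[float(ord(c) % 7), 0.0, 1.0] for c in text[:3]]

def embed_image(pixels: list[int]) -> list[list[float]]:
    return [[float(p) / 255.0, 1.0, 0.0] for p in pixels[:3]]

def multimodal_sequence(text: str, pixels: list[int]) -> list[list[float]]:
    """Concatenate embedded tokens from both modalities into one sequence,
    which a standard language model backbone then consumes as usual."""
    return embed_text(text) + embed_image(pixels)

seq = multimodal_sequence("hi there", [0, 128, 255])
print(len(seq))     # 6 tokens total, text and image interleavable
print(len(seq[0]))  # every token lives in the same 3-dim latent space
```

Because all modalities land in one shared space, the backbone doesn't need to know which tokens came from text and which from pixels.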

There's also a lot of work right now in alternative architectures. I talked earlier about the transformer architecture, and that's really still the basis of most of these models, but there are some really cool explorations in the space now of other directions you might want to go. So, really briefly, some examples. There's a concept known as state space models, where you take attention but relax it a little bit so that you can better handle very large context windows. This matters a lot in some domains like audio, and there's a pretty cool company called Cartesia that's pursuing that direction.

Another architectural trend is what's known as flow matching models. This is a generalization of a diffusion model that allows for more efficient learning than diffusion has traditionally allowed. Stability AI is pushing this direction pretty hard. Inductive moment matching is another interesting idea: a diffusion model that makes better use of predicting where you need to diffuse to, so it produces results much faster than you might otherwise expect. Luma, which is one of the leading image and video companies, is exploring that. And then finally, there's some really cool work in discrete diffusion models, where you take diffusion, traditionally reserved for images and videos, and actually apply it to language. That's a bit non-intuitive, because language is a very discrete space and diffusion has traditionally been considered a continuous process. But it turns out it works. Inception is a cool company working in that direction.

Touching on image models a little bit more: I gave you that example earlier of how quality has increased a lot. But what I wanted to emphasize is that it's not just that quality has gotten better. Just as we've seen a lot more sophistication in what language models can do, there's also a lot more precision and control in image models today. Here are a couple of examples. On the left, many of you probably saw this Ghiblify trend with OpenAI, where you can upload an image, ask it to turn it into a Studio Ghibli style image, and it will do this kind of Ghiblification of the image. What's interesting is that this is actually a really complicated use case. If you think about it, you're giving it an image and implicitly saying: I want you to maintain all the structure, all the components, all the characters, and just change the style. A couple of years ago, an image model couldn't have gotten even close to this. You could have achieved it, but you'd have had to train a bunch of specialized models and combine them in a really complicated way. Now, it just works. On the right, I have another example. A couple of years ago, if you asked for any kind of image that had text inside of it, it would be a disaster. Now you can not only get text in images, but the text can fit the style of the image, which is really remarkable. And so again, just as we've seen this improvement in the reasoning capabilities of language models, we've seen similar quality improvements in areas like images and, of course, video.

On the topic of video, I really think we're about to hit the ChatGPT moment for video. This is an example from Google's Veo models, which are their newer video models. What's remarkable is that the quality is starting to become imperceptible from a real human video. For example, in that dog video in the top right, you basically saw the water reflections, the light refracting in the water, things that are crazy for a video model to do. And so we're actually starting to see a ton of startups being built around this now, when a year or two ago it was still in that uncanny valley moment.

Robotics is another area where there's a lot of exciting work happening. This is a video from Physical Intelligence, which is one of the robotics foundation model companies. Basically, they asked the robot: can you make a certain type of roast beef and cheese sandwich? And what you can see is that the robot correctly assembles the roast beef and cheese sandwich. However, the robot has never been trained specifically to do this task, and it's never seen this precise environment before. The reason it can do this is that it's making use of vision language models that have such a latent understanding of the world, and how the world operates, that the robot is able to come up with a plan even in an environment it has never seen. If you have any background in robotics, this is crazy, because even just a couple of years ago you had to do so much task-specific and environment-specific training for any type of application or use case. And this is why people are really excited about robotics right now.

There's also some cool work in an area known as world models. I'll play this video, and what I want you to pay attention to is that it's not only a video being generated: you can see in the bottom right of the video that there are keys getting pressed on a keyboard, kind of like a video game. What's actually happening here is that someone is controlling the character with a keyboard and mouse, and the character is moving through the world just like a video game. But it's not a video game. There's no 3D model. There's no Unity or Unreal Engine. It's just doing frame-by-frame video prediction, but each next frame is conditioned on the current controls. And so what's cool is that you can actually create dynamic video games from video prediction models in a way where the video game is consistent and has physics concepts. It's really crazy. Someone's actually released a version of Minecraft that works this way, and it doesn't work perfectly, but it actually plays far better than you might expect. For now, these techniques are mostly being used to generate data for areas like robotics, but it's very likely that a lot of entertainment and media in the future has this kind of dynamic generation rather than being all static, pre-rendered, pre-compiled, things like that.
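The control-conditioned generation loop can be sketched like this. The stand-in `predict_next_frame` replaces the actual video model, and the "frame" is just a position on a line; everything here is invented for illustration:

```python
# Toy world model: the "frame" is just a character position on a line.
# A real world model predicts pixels; the conditioning structure is the same.
def predict_next_frame(frame: int, control: str) -> int:
    """Next frame conditioned on the current frame and the pressed key."""
    return frame + {"right": 1, "left": -1}.get(control, 0)

def run_world_model(initial_frame: int, controls: list[str]) -> list[int]:
    """Autoregressive rollout: each step feeds the latest frame back in,
    together with whatever key the player is pressing right now."""
    frames = [initial_frame]
    for key in controls:
        frames.append(predict_next_frame(frames[-1], key))
    return frames

print(run_world_model(0, ["right", "right", "left", "jump"]))
# [0, 1, 2, 1, 1]
```

The key property is that the player's input enters the loop at every step, so the "game" responds interactively even though nothing is pre-rendered.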

There's also a lot of progress right now in audio, voice, and speech. Some of you may have tried the recent music models; they are extremely good. In fact, if I heard some of these songs on Spotify, I would think they were real songs made by a human. Audio and voice cloning is also really good. If you use a product like ElevenLabs, you can upload maybe 30 seconds of your voice and get basically a perfect speech clone of yourself. Along these dimensions, it's likely that in the future voice actors just license their voice as an API rather than actually doing voice acting directly. What's a little bit newer are voice-to-voice models. You can think about these kind of like a language model that operates in the voice space, or the sonic space, not in the language space. This is still very early. Most voice agent startups will still take audio, transcribe it to text, reason on the text with a large language model, and then synthesize it back to voice. But if we can get voice-to-voice working, it'll unlock a lot more use cases, because it will be much lower latency. And so Phonic is one cool company in that space.
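The cascade that most voice agents use today can be sketched with stub stages. Each function below is a placeholder for a real ASR, LLM, and TTS system; the point is the three hops, each adding latency, that a voice-to-voice model would collapse into one:

```python
def transcribe(audio: bytes) -> str:   # ASR stage (stub for a real model)
    return audio.decode("utf-8")

def reason(text: str) -> str:          # LLM stage (stub for a real model)
    return f"You said: {text}"

def synthesize(text: str) -> bytes:    # TTS stage (stub for a real model)
    return text.encode("utf-8")

def voice_agent_turn(audio_in: bytes) -> bytes:
    """Cascade: speech -> text -> LLM -> text -> speech.
    A voice-to-voice model would map audio_in to audio_out directly."""
    return synthesize(reason(transcribe(audio_in)))

print(voice_agent_turn(b"book a follow-up appointment"))
```

Every stage boundary is a serialization point where latency accumulates, which is why collapsing the cascade matters so much for conversational use.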

The last model subject area I wanted to talk about is the life sciences. I'll start with an example. There's a pretty cool model called Evo 2 from the Arc Institute. They describe it as a DNA foundation model, and the actual idea here is pretty simple. As a recap, genomic sequences are sequences of certain nucleotides (A, G, T, and C) in a particular order depending on the gene. And just like language, you can take a given genomic sequence and split it up: get an input and get an output, just like we did with language earlier. Then, very similar to before, you take a language model base and you just say, hey, predict the output given the input, but only apply it to genomic sequences. You end up with a model that's very similar to a language model but trained only on genomes.

What's interesting is that if you do this, you get some really interesting behavior. So how might you use something like this? Well, one use case is mutation effect prediction. The idea here is that because these models predict probabilities of the next token, if you change a genomic sequence, you can look at how that changes the probabilities of the next token. If all the probabilities go way down, it's an indication that this is a really weird sequence that might not exist in nature and might not be biologically viable. Maybe an analogy: if I had the sentence "I went to the store and bought an elephant," language models would be really confused and not know what to say next. And so you can use this to identify biologically viable sequences.

Another use case is biological feature discovery. You can take all those same ideas that I showed you in the Golden Gate Bridge example and apply them to these genomic models to identify biologically relevant features or concepts that are influencing how the model makes a prediction.

And last, you can do something more like guided genome design, where the genomic foundation model proposes possible sequences and you combine it with a secondary model that, for example, does biological function prediction, and you run them in a loop: generate a bunch of possible genomic sequences, predict the likely biological function of each of those sequences, pick the best ones, and repeat back and forth. All of this is still pretty early and researchy, but it's cool, and it's likely to progress very quickly, just like we've seen with language models.

And this is just one example in the sciences, but there are actually a ton of use cases of foundation models in the sciences right now. For example: given a function that you want to have, how do you predict a protein that might create that function? Given a protein structure, how might you predict the geometry, or the way that protein folds, in a biological sample, which is basically how proteins affect the biological world? Given how you might perturb a cell, how might that affect the expression of the cell? And then all the way into other science domains: given past weather, predict the future weather; or given a set of atoms and their coordinates in space, how might you predict the properties of the materials that they create? I would say the market maturity in a lot of these science categories is still very early. One particular problem in the sciences broadly is that the data is much noisier, much sparser, and we have a lot less of it. But there's a lot of progress in these domains, and I would expect that over the next couple of years many of them become much more mainstream and really go from research to real-world applications.

And so that gives you an overview of everything that's going on in the model space right now. From there, I want to go more into the application layer of AI, and I'll start by talking about use cases. There are obviously a lot of use cases for foundation model based companies, so I won't be able to cover them all comprehensively, but my goal here is to give you a sense of the broad categories where we see the most work happening.

To start, I would say that the marquee use case for large language models is search and information synthesis. If you've used a product like Perplexity, or even ChatGPT or Glean, I would put them in this category: you're using large language models to take a lot of information in the world and synthesize it. They're really good at that. So this is kind of that next-generation Google use case. But what you may be less familiar with is that there are literally thousands of domain-specific versions of this. You see a lot of versions of this in investing, where a certain type of investor or analyst might want to analyze a lot of information in a domain to help come up with some financial analysis or prediction or investment recommendation. You see it in the legal domain, where basically the whole job of lawyers is to analyze unstructured information like case law, precedent, things like that. You see it in construction, where construction workers want to be able to understand the blueprints, and instead of calling the project manager, can an LLM just tell them what's going on and what they need to do today? You also see it in areas like healthcare. There's a really cool company called OpenEvidence that helps doctors understand all of the clinical research in a domain as part of how they're helping or diagnosing a patient. These again are just a couple of examples, but I think this is likely the strongest use case for large language models, and the one with the most companies with very strong product-market fit.

The second key use case that you might have heard about is software engineering. You've probably heard of products like Cursor, Windsurf, Augment, or GitHub Copilot. The degree to which AI is disrupting software engineering is really hard to convey. First of all, this is the fastest-growing market of all time. Cursor is the fastest-growing SaaS company ever; they hit $100 million in annual revenue in less than two years. And if you look at the market in aggregate, it's an over-a-billion-dollar-a-year market in just a span of two to three years, which is unprecedented. But more importantly, what's crazy here is that when you speak to software engineers who've used these products and tools, they really feel like it's the biggest change to software engineering since maybe the invention of the compiler. And so we're really excited about this space. What's interesting is that we're starting to see a lot of startups that go beyond just the copilot. More specifically, we're seeing LLMs start to touch the entire software development life cycle. We see LLM-enabled companies in areas like code review, documentation, code migration, prototyping, testing, and QA. Basically, pick any subcategory of software engineering or the software development life cycle, and you're starting to see profoundly innovative companies using LLMs to rethink that space. We generally suspect that the entire developer tools ecosystem is going to be fundamentally rethought in a world of large language models.

Building on the idea of software engineering copilots, this idea of copilots and agents also applies to basically all other forms of specialized, and particularly high-skilled, knowledge work. We see Cursor-style products in all of these other domains, including areas like PCB engineering, game development, electrical engineering, accounting, 3D design, and mechanical engineering. In all these spaces, you have someone who is a really high-skilled professional, typically designing, building, and testing really complicated systems, whether they're writing code, working in a CAD tool, or designing a chip. And in all these cases, you can build a bunch of copilot-style workflows that dramatically leverage and accelerate that high-skilled knowledge worker. So we think all of what's happening in software engineering is going to apply to these other domains as well.

Creative expression is another area where there's obviously huge impact. We've touched a little bit on the image models and the video models. Just a couple of examples here: we see a lot of excitement in areas like video and animation from companies like Runway. I now know of multiple examples in Hollywood of feature films that are being fully produced via generative models, which is crazy. You see it in many forms of vertical design. For example, Visual Electric is a cool company in brand design: you could upload, say, a picture of an object that you want to show, and instead of doing high-end photography on that product, it just creates a perfect render of it for you. You also see it in areas like 3D design. These are just a couple of examples, but I would say that basically every area of design work and creative work is being rethought in some way by generative AI.

And then there's a lot of other cool stuff that I can't go into in depth, but I'll give you a taste of it. There's a lot in verticalized writing. Think of wanting to get an immigration document or submit a defense bid: there are a lot of these markets where you need to write a really complicated, specific piece of content, and LLMs are really good at that.

Second, education, coaching, and companionship. Speak, for example, is a cool language learning product, but there are so many areas where LLMs are probably better than even the best teachers in the world at teaching you something, guiding you through something, or maybe even giving you therapy. And there's a lot happening in that space.

Voice agents, which I touched on a little earlier, are really exciting. There are so many domains in the world with no digitization and no APIs, but if an AI can call people, you can suddenly automate things you could never automate before. Fair Health is a cool company that's using voice agents to help patients navigate care post-hospital. If they need to schedule certain follow-up appointments and figure out which doctors in their area take their insurance and have availability at the right time, traditionally the only way to do it was to call them all one by one. Now AI can automate that for you.

There's also a lot of work in what I would describe as tier-one labor automation. There are many jobs, for example customer success or certain top-of-funnel sales jobs, where essentially someone relatively low-skilled needs to analyze a ton of information, and all they're doing is deciding which pieces of information to escalate to someone more senior. In all these spaces, what we're seeing is that the first line of defense is typically getting automated by AI, because the job is more about analyzing a lot of information and getting coverage than it is about avoiding mistakes. Dropzone is one example of a company in security automation that's partially automating that first-line security analyst in this sort of mechanism.

LLMs are also very good at translation. In fact, the transformer was originally invented to improve translation. And there are a lot of use cases where what you're doing is basically translating something, but in a more nuanced way. Light Table, as an example, is a company in the construction space exploring the following: let's say you're designing blueprints or architectural prints for a new thing you want to construct. Eventually, you're going to have to go through review. Does it adhere to all the state codes, the regional codes, things like that? Traditionally, that didn't happen until five or six months later; you'd realize there's a mistake and have to go back to the drawing board for another six months. What if instead LLMs could understand all those rules you need to follow and give you a first pass right when you're creating the thing? This idea of shifting compliance left, or instant compliance, is a really good use case, and there are a lot of AI products in that category.

LLMs, as I've touched on, are also good at rethinking semi-structured systems of record; think, for example, CRM. We think in all of these categories there are going to be AI-native products that replace the incumbents.

There are also some cool second-order effects of AI. Profound is one interesting company in this space. Traditionally, a huge industry got spun up around search engine optimization: how do I make sure I show up correctly on Google? Well, a lot of people aren't googling things anymore; they're going to ChatGPT. So the new question a given brand or marketer needs to answer is: do I show up on ChatGPT, and how do I show up on ChatGPT? Profound is a company building something like search engine optimization for ChatGPT-style products. And this is one example of a broader trend: as AI so substantially changes the way we work, there are going to be a lot of new needs in that world.

Finally, one last thing I'll touch on: there's some really cool work in synthetic data. LLMs are very good at impersonating people. So there are a lot of cases where you might have traditionally done a lot of user interviews or surveys, where instead you can now just ask an LLM: hey, pretend you're this type of user; what would you say? What's interesting is that I've now seen a number of cases, even with very large brands, where these forms of synthetic surveys exactly mirror a real survey. And so there's going to be a lot more work in these kinds of market-research categories.

And so that's just a quick overview of the big use cases where we see a lot of activity in the foundation model space. From here, I want to talk a little bit more about building foundation model based products and some of the trends that we see there.

To start, if I were to describe the overarching arc of products in the space over the last couple of years, it really went from model, to retrieval augmented generation, to agents. So let's go through that in a little more depth. In the early days, you had these very simple products that basically just did a single large language model call to do something like generate a little text or summarize something. Notion AI is a good example of that: you could query the AI in the Notion product and ask it to summarize maybe a paragraph. Useful, but very simple. What you then saw were a bunch of products that started to combine a model with a lot of data. This is often described as retrieval augmented generation: when the user enters a query, you first search and find a lot of pertinent data, you give that data to the LLM along with the user's query, and then you output an answer. Most of the early copilot or software engineering style products did this. For example, in the code space, you might want to find the relevant pieces of the codebase and then give those to the model alongside the user query. What we've seen more recently is the idea of combining not only model plus data, but adding tools. And this is really how I'd describe an agent: a model determining, given a user's query, what do I need to do? What tools should I use? How should I use them? And also, what data should I retrieve? I would argue that a lot of the newer deep research style products fit this description. And indeed, if you look at the new startups being founded, I would say almost all interesting startups today fit more into this agent category.

So let's talk about agents a little bit more. I think agents are a little ill-defined, but if I were to give one succinct definition, I would say that agents are models using tools in a loop. The idea is actually pretty simple: you give a query to a large language model. That model has access to some environment it can understand and to a lot of tools. Given the query, it comes up with a plan and a step. It might call an API, search the web, or read some files. It then executes that step and analyzes what happened. Did things change? Did I succeed? Did I fail? And it keeps circling over and over until it thinks it has completed the task the user asked it to do. What's really interesting is that even just a year ago, this architectural pattern didn't really work at all, but now it works really, really well. Again, this is one sense of how fast the space is moving. Some of the most common tools you see agents use include searching files, writing code, calling APIs, searching the web, and using a browser, but the number of tools can go far beyond this.

Let me give you an example of what this looks like in reality. A lot of leading agent startups will actually recurse somewhere between 50 and 100 times for a single user query. There's a cool company called Basis, an AI accounting startup. You can think of the form factor of the product pretty simply: any question you might ask an accountant to answer with a spreadsheet, you can instead ask Basis to answer. So maybe I ask it something simple: help me reconcile this month's collections with last month's revenue. To answer this, it's actually chaining 30 to 60 large language model calls, including planning, retrieving data, writing and running code, browsing the internet, manipulating the spreadsheet, and accessing all the other accounting tools you might access. And what's crazy is that this works really, really well. So this gives you a sense of the sophistication that now exists in AI products, where we're building very complicated systems, not just calling a model once.
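The "models using tools in a loop" idea above can be sketched in a few lines. Everything here is a hypothetical stand-in: `call_llm` hard-codes a tiny plan so the loop is runnable, where a real agent would make an actual model call that plans the next step from the query and the history of observations.

```python
def call_llm(query, history):
    # Stand-in planner: search once, then declare the task finished.
    if not history:
        return {"action": "search_web", "args": query}
    return {"action": "finish", "answer": history[-1][1]}

TOOLS = {
    "search_web": lambda q: f"results for {q!r}",
    "read_file": lambda path: f"contents of {path}",
}

def run_agent(query, max_steps=10):
    history = []
    for _ in range(max_steps):            # the loop
        step = call_llm(query, history)   # plan the next action
        if step["action"] == "finish":    # agent believes it is done
            return step["answer"]
        result = TOOLS[step["action"]](step["args"])  # act on the environment
        history.append((step, result))    # observe: did it work?
    return None  # gave up after max_steps
```

The loop body is exactly the plan / act / observe cycle described above, with `max_steps` bounding how many times the agent may recurse.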

As I touched on, generalist agents are still probably not here. A couple of startups have tried to be the everything assistant, the everything agent that can do anything for you; that's a really hard problem. But there are now a number of startups building vertical-specific, constrained agents that work really, really well. For example, I would put Lovable and a lot of the coding products in this category. Agent products in areas like customer support, like Sierra, also work really well, and this is going to continue over the next couple of years.

What's interesting about agents in particular is that if you query users of a lot of the leading agent products, you get very polarized reactions. Some of you may have heard of this product Devin, which is trying to be an autonomous software engineering agent. If you look online, you find a lot of people saying: Devin is horrible, I tried it, it didn't work at all, I had to constantly correct it, I would never use it again. But you also find a lot of people saying: Devin is actually the most productive software engineer at my company and produces more PRs than anyone else. So how do you reconcile this? What I'm starting to observe is that in most agent categories there's a bit of a learning curve. It's actually difficult for a human to understand when to use an agent, how to use an agent, and what expectations to have around reviewing the agent's work, not dissimilar from a first-time manager who's never managed people before. So it's interesting to see how agent products try to educate users and better define that human-agent relationship.

So as you saw with that accounting example, one thing I want to get across is that for most good AI products nowadays, the teams think more about systems than models. I showed you that Basis example: it's really a system. There's a control flow, there's a model, there's a bunch of tools, there might be multiple models, and it's the whole system in totality that creates the user experience. What I observe is that in most good LLM products, you often want to break the problem down into this kind of systems approach.

I'll give you a more specific example. Imagine I have a product that helps people understand certain political questions or discussions. For example, I might ask it: what are the best arguments for and against the claim that social media hurts democracy? Now, you could definitely give this question to a large language model and you'd probably get a pretty good answer. But let's say you really wanted to optimize for this type of question. What you might do is break the problem down. Maybe I start by splitting it up: I want to generate arguments for the claim, and I also want to generate arguments against it. Maybe I have two large language models that are prompted or fine-tuned specifically to be really creative and generate a lot of ideas, so I get some top answers for and some top answers against. Then maybe I have a different set of models that are trained to be critics. They're really negative.

They like to poke holes in things and describe why they won't work. I pass those hypotheses to the critic models, which identify maybe the top answer for and the top answer against. And then maybe I have a final LLM that I fine-tune separately to be more of a judge, analyzing fairness and thinking holistically about these two options, and it ultimately gives you the final answer. What I can tell you for sure is that the second architecture almost certainly generates better results than the first. And this is how a lot of good agent teams and AI startups actually think about solving problems: it's much more of a systems problem.

Here's a quote that illustrates that. We might think of OpenAI as the company that's most all-in on models, but this is a quote from the chief product officer of OpenAI, and what he's saying is that they actually use ensembles of models much more than people might think. They might have 10 different problems and solve them with 20 different model calls, all of which are different specialized fine-tunings. This is invisible to you as a user, but it goes back to my example of o1 Pro earlier: it's much more of a system happening under the hood, with many different models being called in complicated ways, even if to you as a user it feels like you just queried a model once.

An interesting question, then, is how you design these systems. There's an interesting graph here from a paper called Large Language Monkeys, which basically shows that if you take a low-end model, keep asking it the same question over and over, and then have some kind of voting or judging mechanism to aggregate across those answers, asking that one bad model enough times will beat much better models. This is a very simple idea of a system: I just query one model a hundred or a thousand times and vote on the answer.
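That repeated-sampling-plus-voting system is simple enough to sketch. `sample_model` here is a hypothetical stand-in for one call to a weak model that answers correctly only 40% of the time:

```python
import random
from collections import Counter

def sample_model(question, rng):
    # Stand-in for a single noisy model call: right 40% of the time,
    # otherwise a random wrong answer.
    return "42" if rng.random() < 0.4 else rng.choice(["41", "43", "44"])

def ask_with_voting(question, n_samples=1000, seed=0):
    rng = random.Random(seed)
    answers = [sample_model(question, rng) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]  # most frequent answer wins
```

Even though a single call is wrong 60% of the time, the correct answer is still the single most common one (about 40% versus roughly 20% for each wrong answer), so the majority vote over a thousand samples is almost always right.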

But even with that simple system, you get huge performance increases. There are a lot of other systems paradigms you might see in the space. You might query the model many times, like I just described. You might fan out answers and take the most common one. You might break the problem down into substeps and fan out, pick the best, fan out, pick the best. There are different ideas, and in some ways I think the design space is still probably underexplored, but in many ways I think this is the future of where a lot of AI systems will go. What's interesting is that frameworks are starting to emerge that make it easier to explore this stuff, because at a certain point it's probably impossible for a human to come up with all the ways you might design an AI system. Take the graph on the right here. It's an illustrative system: maybe for a given query I break the problem down into three steps, and for each step I generate over a thousand answers, have some heuristic for picking the best one, and then go to the next step. A human is likely not going to come up with that, but frameworks can permute many different possibilities of how you combine models and pick the best one for you. DSPy and Ember are two interesting examples of this.
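The "break into steps, fan out, pick the best, move on" system just described can also be sketched. `propose` and `score` are hypothetical stand-ins for an LLM sampler and a heuristic judge; a real system would replace both with model calls:

```python
import random

def propose(step, context, rng):
    # Stand-in for sampling one candidate continuation for this step.
    return context + [f"{step}:v{rng.randint(0, 9)}"]

def score(candidate):
    # Stand-in heuristic judge: higher "version" numbers score better.
    return sum(int(part.split(":v")[1]) for part in candidate)

def fan_out_pipeline(steps, width=1000, seed=0):
    rng = random.Random(seed)
    context = []
    for step in steps:  # solve one step at a time
        candidates = [propose(step, context, rng) for _ in range(width)]
        context = max(candidates, key=score)  # keep only the best candidate
    return context
```

Each step fans out `width` candidates, keeps only the highest-scoring one, and carries it forward as the context for the next step, which is exactly the per-step fan-out pattern from the illustrative graph.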

I touched on context windows earlier and how sometimes they're a little bit misleading. I want to talk a little more about them, because another big question when you're building AI products is: does search or information retrieval still matter in a world where I can just stuff all the data into the context window? I think what's very likely true is that even as context windows continue to increase in size, search and retrieval are not going away, and I'll give you some basic examples of why. From a quality perspective, the graph on the left compares a retrieval-augmented generation (RAG) system in green, which uses search plus AI, with a long-context model where all the data fits in the context window. What you can see is that even in cases where all the data fits in the context window, and even where you're only using about a fifth of the context window (the left side of this graph), the search-plus-AI system is still dramatically higher quality. There's also a cost consideration. On the right you can see a comparison of a RAG system to a long-context model for a certain set of tasks: running the RAG system for a day might cost $78, while running the long-context model instead would cost $1,500. And then there's latency, which is the other metric that matters a lot. Assume you want to analyze about a million tokens of context: you can either search over it or put it all in the model. Doing this with search takes 600 milliseconds; doing it with a long-context model takes over a minute. So on quality, cost, and latency alike, you're still one, if not two, orders of magnitude better off using the more complicated system than just using the context window. While for really simple use cases you can maybe get away with just context, it's very likely that search and information retrieval will remain a critical aspect of most complicated or sophisticated AI products.
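The retrieve-then-generate idea is easy to illustrate: instead of stuffing every document into the context window, score documents against the query and pass only the top few to the model. The word-overlap scoring below is a deliberately crude stand-in; a real system would use embeddings or BM25.

```python
def retrieve(query, documents, k=2):
    # Score each document by how many query words it shares, keep top-k.
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]  # only these chunks go into the prompt

docs = [
    "Collections for March were $120k.",
    "The office plants were watered on Friday.",
    "Revenue recognized in February was $115k.",
]
top = retrieve("reconcile March collections with February revenue", docs)
# The prompt now carries 2 relevant chunks instead of the whole corpus.
```

On a real corpus, the prompt shrinks from everything to only the k most relevant chunks, which is where the quality, cost, and latency advantages above come from.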

I then want to touch a little on product and design. My argument would be that there is still a huge amount of room for AI companies to differentiate mostly on product and design, even if they don't do anything different technically. I'll give you an example. Many of you have likely used these AI note-taking products, and there are a lot of them, like Fireflies, Assembly, etc. If you had asked me two or three years ago whether there was room for a new AI note-taking app, I would have said: no way, absolutely not, I would never invest in that, because honestly it felt like a commodity category to me. And then Granola came out. I'm not an investor in Granola; I just love the product. They fundamentally rethought what it meant to be an AI note-taking app in a world of large language models, as opposed to just applying LLMs to the existing AI note-taking thesis. The result is that I, and almost everyone I talked to, immediately used the product and said: wow, that's a hundred times, a thousand times better than what existed before. And it was really just a product and design innovation. I don't mean to diminish Granola; it was very hard and very complicated to build such an elegant, beautiful product. If anything, what I'm saying is that there's a lot more room for design-oriented founders in the AI space, because the technology is already good enough to reinvent many of these categories. The question is just whether we can rethink the assumptions in those categories.

Bridging off of that, I really think the UX design patterns for foundation model based products are still really early. On the left here I have an image from Cursor. If you haven't used these codegen tools: when you're writing code, you can pick from a dropdown that shows you ten-plus models. I can use o1-mini, I can use a 128k-context mini model, I can use Claude 3.5 Sonnet, etc. In my mind, this is kind of crazy, because the product is asking the user to be an expert in evaluating ten-plus models that are changing all the time, every three weeks, across all these different use cases. That complexity should be solved by the company, not the user. In a lot of ways, this reminds me of older patterns in traditional technology waves: in the early days of mobile, you had to pick your preferred network type; in the early internet, you had to pick your preferred media codec for a video player. We see those patterns as crazy today, but I really think the same idea applies to model picker UIs, and it emphasizes how early we are in the design patterns for building AI based products.

Another really interesting thing to consider for people building AI based products is how to balance building something for users today versus letting the models get better and solve your problems implicitly. I'll give you an example. Over the last couple of years, there were tons of products built around the idea of fine-tuning for image generation. Say you want to take a picture of yourself and give it a certain style or structure. The only way to do that a couple of years ago was to fine-tune a specialized model for that use case. So the way all these products worked is: you sign up, you upload a bunch of your own images, you wait for them to train a model just for you, and after 15 minutes you can use your model. Behind the scenes, that company had to build a lot of infrastructure around fine-tuning per customer, storing a model per customer, and so on. Then the recent image version of ChatGPT emerges, and guess what? You can do in-context learning natively without doing anything: no fine-tuning, no custom model, nothing. In the blink of an eye, all of the product form factor, all of the workflow, and most of the infrastructure these AI image generation products had built became completely obviated. And this is the risk of being a founder right now: you can spend so much time solving a problem today, but all that work can become technical and product debt in just a year or two. The really good founders in this space think endlessly about this question, and often there's no right answer, because you need to solve users' problems today; you can't wait forever. But how you balance this is one of the most interesting questions in building applied AI startups right now.

From there, I wanted to talk a little more about the tool side of things, which I touched on earlier for agents. Some of you may have heard of the Model Context Protocol (MCP). It is really emerging as the open ecosystem standard for tool use. Just like HTTP emerged as the common way we access websites, there was a desire for a standard way to expose services to agents. The way it works is pretty simple. You have MCP clients, which you can think of as something like Claude or ChatGPT, that might want to access tools. And you have MCP servers, which are basically services that expose themselves, or their APIs, as tools that an agent can then use reliably. So in this example, I might be using Claude and have it attached to servers for Gmail, Figma, and Blender, and as I use Claude, the agent can determine when and how to use those tools. MCP was released late last year and is now supported pretty officially by basically every major player in the space. So while things might still change, it's likely this will be the de facto standard way that agents use tools. And the reason this matters is that it's going to make it much easier to build agents in the future, because you won't have to build custom integrations for every single API you might want to use. Here's a really cool example of this.

Someone hooked up Blender, which is a 3D modeling tool, to Claude. All they're doing is talking to Claude in the web UI, and on the right, Claude is controlling and designing a 3D scene for them. Even though that person knows nothing about 3D modeling or how to use Blender, in the end they get a really good result. This is just one cool example of how powerful it is to connect tools and systems to these AI models.

Speaking of tools, one thing that's becoming increasingly clear as I talk to founders is that the interface for tool use matters a lot. This comes from a talk out of Replit. Consider a coding agent that can access a few basic tools: it can edit files, search files, view files, and manage its context a little bit. What you see on the bottom is that really subtle changes in the way a tool is defined massively impact the quality of the agent. For example, if the search tool both summarizes the search results and shows them, you get much higher quality than if all it does is show the results.

Similarly, and this is a little bit non-intuitive: the best file viewer didn't show all the files. It didn't show 30 files, it showed 100 files, a weird in-the-middle value. So it's becoming clear that optimizing tool use matters a lot for improving the quality of agents. What this is actually leading to is that, in spite of the fact that MCP is becoming a standard, I'm still talking to a lot of startups who say: MCP is a nice starting point, but for my agent, I really need to build first-class integrations optimized for my agent for each of the tools. This is a quote from a Series A AI startup that I work with; they said, literally, our agent got over 10x better once we stopped using standard MCP and started building deep custom integrations with different tools. So I think this tension will continue for a while: you can expose a naive MCP server, but it's not going to work as well as building a first-class integration. It's going to be interesting to see how that evolves.
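The "tool interface matters" point can be made concrete with a toy example. Here the same underlying search capability is exposed two ways; all names are hypothetical, and the claim from the talk is that the summarize-and-show variant produced much better agents than the show-only one:

```python
def raw_search(query, files):
    # The underlying capability: naive substring match over file names.
    return [f for f in files if query in f]

def search_show_only(query, files):
    # Variant 1: dump the raw hits into the agent's context.
    return {"results": raw_search(query, files)}

def search_summarize_and_show(query, files):
    # Variant 2: the same hits, plus a short summary the model can
    # reason over without reading every result.
    hits = raw_search(query, files)
    return {
        "summary": f"{len(hits)} file(s) matched {query!r}",
        "results": hits,
    }
```

Both variants wrap identical functionality; only the shape of what the agent sees changes, which is exactly the kind of subtle interface decision that reportedly swings agent quality.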

One other thing I wanted to touch on on the product side is the idea of personality. The paper on the right gives you an interesting hint at this. What these researchers from Stanford basically showed is this: take a base model that has had none of the work we've been doing over the last couple of years to make models better at following instructions, answering questions, and reasoning, just the really primitive base model, and ask it to do creative tasks; compare it to the frontier models with all that post-training and optimization, and the base model is better. Why is that? It's because we spend so much time getting these models to be good at answering questions, following orders, and doing what we say that they lose some of their creative freedom in the process. But there are a lot of use cases where you care a lot more about creative freedom than about someone answering your question correctly. So there are inherent trade-offs in personality, just like we see in humans, and I think how you emphasize different types of personality in different use cases is underexplored. In a lot of creative expression categories like design, you probably care more about creativity and randomness than about answering questions. In areas like education, you probably want a better balance, with the model acting more as an authority and telling you when you're wrong rather than being a sycophant and doing what you say. In areas like therapy, you may want a model that's not so focused on answering questions, but maybe on asking them. I think this is a really underexplored area of product development in the space, and over time it wouldn't surprise me to see more variance in model personalities from these large providers.

Finally, I wanted to briefly touch on the fact that the infrastructure ecosystem for building foundation model based apps has matured massively over the last couple of years: everything from running inference to managing data to doing evals and observability to embeddings tools. A couple of years ago, you basically had nothing; you had to do it all from scratch. Now there's such a foundation that it's much, much easier to build really good products really quickly. This is also part of what's accelerating startup growth in this category.

So that covers the product side. I want to finish by talking more about market structure and market dynamics, and then I'll end with a little bit on what might come next. First, a remarkable statistic: in 2024, 10% of all venture dollars went to foundation model based companies. What you can see on the right is that in 2020, 2021, even 2022, this essentially rounded to zero, and now we're at 10% after just a couple of years. It wouldn't surprise me if in 2025 it's even higher. This gives you a sense of how much excitement, maybe even hubris and over-excitement, exists in the foundation model category.

At the same time, though, I think this is partially justified. These foundation model vendors and startups are not only at billion-plus run rates, they are accelerating their revenue growth at that run rate. On the left is some revenue data from OpenAI showing they'll probably end 2025 at about 12 to 15 billion in revenue. Anthropic more than doubled its revenue, from one billion to over two billion, in a single quarter. So there's a method to the madness, and there's no precedent for this degree of growth at this scale.

What's also interesting in the model category is that some of the model players are pushing really hard to be application companies, really consumer product companies, rather than API companies. Here you have a graph showing the relative percentage of revenue from chatbot subscriptions versus API revenue. What you can see is that OpenAI is now nearly 70 to 80% chatbot subscriptions, whereas Anthropic is still predominantly API revenue. I think this reflects what I was discussing much earlier in the presentation: the model layer continues to commoditize, and open source continues to be a serious threat.

And even though the revenue growth of these players is crazy, I think if you stay just a model provider, you have very little revenue durability and stickiness; it's very easy to switch a model API just like that. But if you build a consumer brand and a consumer subscription, that looks really different. Indeed, what you see is that all the leading model companies are really pushing to be app layer companies. For example, both Anthropic and OpenAI have hired former consumer product people to lead product; actually, both of their product leads used to work at Instagram. You can see this in the acquisitions these players are making: OpenAI recently announced it was acquiring the AI coding company Windsurf, because coding is a really good application category in AI. There are even rumors that some of these players are trying to develop a novel social media platform built on AI, and there was the recent news that OpenAI acquired a consumer hardware device company. So it's very likely this trend continues, and that the only way to succeed at large scale in the model layer is to bundle or vertically integrate with the application layer.

Another reason for this is that the big cloud vendors are finally catching up. Two years ago, Google was a disaster, in spite of the fact that transformers had actually come out of Google, but they increasingly seem unstoppable. This is a graph of two metrics across all the models out there: the y-axis is quality, the x-axis is cost, and this cost-quality curve is really the key thing that matters in large language models. What you can see is that the entire Pareto frontier is now owned by Google. That is, for any trade-off between cost and quality, Google has the best model. Now, things change quickly, so this could change, but I think it gives you a sense of the concern I would have if I were running a big model company that wasn't Google, because Google can offer this as a loss leader, Google has crazy economies of scale, and Google can treat this as an on-ramp to the rest of their cloud infrastructure. So I think this is also reflective of why you see the model companies pushing so hard into the application layer.

Another really interesting question in my mind is that there's starting to be all this investment into foundation model companies that are not purely digital (image, text, video). For example, I have a bunch of robotics data here, and these companies are raising a lot of money at valuations similar to the LLM and image companies. The question is: will they be able to defy gravity like we saw in images and text in terms of their revenue growth, or will they ultimately be limited by the fact that in these physical domains there are operational and hardware concerns that are very difficult to get around? I think this is really the billion dollar question for the people investing hundreds of millions of dollars in these physical foundation model based startups.

If you look at the application layer, you see a similar dichotomy: yes, the valuations are really, really high, but there's also unprecedented revenue growth. On the right here I have a graph from Redpoint showing that from 2023 to 2024, the average revenue multiple of AI startups was dramatically higher than that of non-AI startups, but their average growth rate was also much higher. On the left I have a number of examples of companies with unprecedented revenue growth in the application category; Bolt, for example, went from 0 to 20 million in just 60 days. So there remains a method to the madness, but the question in a lot of these cases is: how durable is the revenue, how defensible is it, how sticky is it? Another view on this: if you look at even just the most well-known AI native applications with revenue figures published in the public domain, AI native applications are collectively at over a billion dollar run rate. And this is a vastly conservative estimate; there are many startups I know have very high revenue that are not in this list. So the revenue is real here, make no doubt about it.

The other interesting thing is that, especially a year or two ago, there was a big question of whether large language models and AI are more of a sustaining innovation that helps incumbents. In theory, it's not that hard to add AI model calls to existing products, so maybe this means big companies like Salesforce, Adobe, etc. will just win. What advantage does the startup have? I think what's now empirically true is that the incumbents do not, in fact, have the advantage in this category. And I think that's mostly because building really good AI products, like what we saw with the Granola example, is so much more than just slapping a model call on; you really have to reinvent the workflow. Even in categories where the big company has had a ton of resources and gone all in on AI, for example GitHub Copilot in coding and Adobe Firefly in creative expression, you have seen the startups absolutely dominate, and I would expect this to continue to be true. I really think it's a startup's game, for the most part, in the AI world right now.

And touching on this question of revenue growth, I think there's a huge risk of the novelty effect in AI startups right now. This is a graph of the revenue curve of a consumer startup, Lensa, one of those early AI image style transfer startups. What you saw is that they very quickly got to a huge amount of revenue and then just as quickly dropped off. We see a lot of startups with this effect, mostly because a lot of people are curious about AI and want to try it, but that doesn't mean they're going to stick around. So as an investor or as an employee, you've got to be really careful about what's defensible and will be maintained over time versus what's not.

And so overall, if I were to summarize, you know, what's happening here is real, but the market also feels really bubbly.

Um this is just one example that I found kind of humorous, interesting. Um there

was a French foundation model startup called uh H, and they raised a $220 million seed basically just on an idea.

And within three months of founding the company, three of the key co-founders left, right? And so there's just a lot

of interesting behavior in the market right now that I think makes it simultaneously super exciting and super bubbly. And it's really hard to

understand how to navigate it as a result. And I think at the end of the

day, the people who win for sure are the GPU ecosystem and companies like Nvidia, because no matter what happens, we're still going to need exponential growth in compute hardware.

So I want to finish with just a little bit of discussion on what might come next and playing things out a couple of years, right? Um the first is that it's

very clear that operating as an AI-native company is going to look fundamentally different. Um so here I

have an extract from a recent memo that the CEO of Shopify sent to everyone at his company. And basically what it said

is that using AI is a fundamental expectation, and if you don't use it, you know, you're going to get fired. You

have no way that you can contribute here if you don't know how to use AI tools.

And I really think that companies that don't do this are going to fall massively behind, because when you infuse large language models and these research tools into everything you do as a

company, the degree to which you can be more efficient is really unbelievable.

Um and you can see this in startups today, right? There's a number of these

startups that are 10 people, 20 people, 30 people hitting $50-plus million ARR who are not only building AI products but infusing AI into the way that they

work, right? And so Gamma is an example

of a company that's just a couple of years old. They've raised less than $15

million and they haven't used most of it. They're only 30 people and yet

they're over $50 million ARR, right? And

so this is what incumbents are competing with. And so if you don't as a company

learn how to use AI quickly, you're probably going to get screwed. Um

related to this, it's really interesting to see how much the composition of teams is changing. So I have two

representative quotes here, right? I was

talking to a VP of product at a kind of pre-IPO startup recently, and what he was saying is that I really don't see a difference between designers and product managers in our company anymore, right?

Because AI has made a lot of the functional design skills less useful.

And so really it's just a question of who has taste and who understands the customer, right? Um, similarly, I was

talking to a CMO of a public tech startup recently and she was saying that AI has completely changed how she thinks about hiring. If you think about the

about hiring. If you think about the traditional marketing org, you probably have someone who knows how to edit videos in a tool like Premiere. You

probably have someone who knows how to make motion graphics in a tool like After Effects, etc. Well, AI kind of obviates the need for a lot of these functional skills because these generative models can do a lot of it for

you. And so, the result is you probably

just want to hire generalists now who know how to use AI. And so she had completely changed how she thought about hiring and told her team you can't hire specialists anymore. And so this again

gives you a sense of how dramatic the shift is going to be in companies if they really leverage or make use of modern AI tools.

Related to this, I think that learning to manage fleets of AI workers is going to be a new skill that's really not that different from managing people. Um I was talking to a CTO of one of the top codegen startups recently and he was

saying, "I haven't written a new line of code myself in three months. I just

manage agents." Now, he's maybe an extreme example because he's trying to dogfood his own product, but I really see this start to take shape, where a lot of knowledge workers are thinking more about how do I use AI rather than how do

I do the work myself? And I think it's interesting to start to see this design pattern emerging of the agent inbox. I

think a lot of the future is going to be in a tool like a project management tool or an email tool. You're actually

interacting maybe more with AI agents that are communicating with you than you are with your co-workers or with other humans.
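To make that agent-inbox pattern a bit more concrete, here's a minimal sketch (all class names, agent names, and tasks here are invented for illustration): agents submit proposed work items into a queue, and a human triages the pending items and approves or rejects them, much like triaging email.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class AgentTask:
    """A unit of work an AI agent proposes for human review."""
    agent: str
    summary: str
    status: Status = Status.PENDING


class AgentInbox:
    """A queue where a human reviews work submitted by a fleet of agents."""

    def __init__(self):
        self.tasks: list[AgentTask] = []

    def submit(self, agent: str, summary: str) -> AgentTask:
        # An agent drops a proposed piece of work into the inbox.
        task = AgentTask(agent, summary)
        self.tasks.append(task)
        return task

    def pending(self) -> list[AgentTask]:
        # What the human still needs to look at.
        return [t for t in self.tasks if t.status is Status.PENDING]

    def approve(self, task: AgentTask) -> None:
        task.status = Status.APPROVED


inbox = AgentInbox()
t1 = inbox.submit("code-agent", "Refactor the billing module")
t2 = inbox.submit("docs-agent", "Draft the API changelog")
inbox.approve(t1)
print(len(inbox.pending()))  # prints 1: the docs task still awaits review
```

In a real product the tasks would carry diffs, drafts, or links to artifacts, and approval would trigger the downstream action; the point is that the human's job shifts from doing the work to reviewing the queue.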

Um, another interesting shift that's happening is that a lot of products as a result of the rise of agents and AI are going to actually start to be designed more for AI as the consumer rather than

the human. So, I'll give you two

examples of this. Um, there's a thing called a Cursor rules file, which is basically a way of defining certain instructions that a coding agent can understand when it's answering

questions. What you're starting to see

questions. What you're starting to see is that a lot of the biggest developer tools companies like Cloudflare in this example are actually spending just as much time writing cursor rules files for their APIs as they are traditional

documentation that a human might read, because if AIs are more responsible for coding, it actually matters more to make sure your APIs are accessible to the agents rather than to the humans. Right?
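To give a rough sense of what this looks like: Cursor reads plain-text project rules (historically from a `.cursorrules` file in the repo root), and a hypothetical rules file an API provider might ship could look something like this. The endpoints, limits, and variable names below are invented for illustration.

```text
# .cursorrules (hypothetical example for an API provider)
# Plain-text guidance the coding agent reads alongside the codebase.
- Authenticate every request with a Bearer token in the Authorization header.
- Use the /v2 endpoints; /v1 is deprecated and will be removed.
- On HTTP 429, back off exponentially before retrying.
- Never hardcode API keys; read them from the API_KEY environment variable.
```

The rules aren't executed; they're just context the agent is given every time it writes code against your API, which is why they're starting to get as much attention as human-facing docs.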

Um, similarly, on the right I have a quote from a database company called Neon, where the CEO is basically saying that now over 80% of their database instances are created by agents, not

humans, right? And so I really think there's going to be this huge shift in the world, especially at lower levels of infrastructure, where everything you're doing as a product is serving AI, not

serving humans. And so where will value be destroyed? It's hard to perfectly answer this question, but here are a couple of rough lines of thinking. First is, I think there's a

huge shift from outsourced labor to in-house labor, right? If you think about a lot of corporations, traditionally I might pay external consultants or services firms to do, you know, various functional things that maybe I don't

have the expertise to do myself, and so I need to pay someone else. Obviously, working with third-party service contractors is not a good experience in many cases, though. As AI democratizes access to a lot of functional skills,

those value chains are going to shift, and it's going to be more about software empowering teams rather than services doing the work with human labor. Um, second,

I touched on this a little bit with the CMO quote, but I generally think there's going to be a dramatic shift from specialists to generalists. And so tools and companies and services that were very oriented towards

servicing specialists may be hit, and I think there's going to be a lot of new opportunities to build new types of tools that serve these more generalist user personas that are going to emerge with AI. Third is that I think a lot of

middle management is going to get eroded. You know, so many of these jobs

that are just about mediating communication, taking information from here and moving it over here. Honestly,

a lot of the traditional corporate America jobs are frankly irrelevant in a world where LLMs can automate almost all forms of communication and knowledge transfer. And so, I think we're going to

move to a world of much flatter organizations with people who are all doers, doing the work with AI. And so

tools and services and companies that are reliant on this super-layered middle-management hierarchy, I think, will become a lot less relevant. One example

of this is that I think tools for project managers will not really be useful anymore because project managers as a job will likely go extinct. Right?

I touched on this a little bit, but I think incumbents in these categories that are directly in the line of fire of AI, like companies in the creative tools space or companies in the CRM space, are in real trouble. I think if they don't make

large acquisitions in these spaces, it'll probably be impossible for them to adapt. And then finally, I think the

companies that don't go through that organizational pain, like we saw with Shopify, of kind of adapt-or-die will end up dying anyway, right? Because they

will be outcompeted by much leaner companies that can move at a much higher rate. And so I'll close with kind of the

interesting question, right? Is AGI

close? Um, one of the funniest observations for me is that it's the smartest AI researchers in the world who actually think it's close. And so there's this classic Dunning-Kruger curve, where if you talk to people who know nothing about AI, you know, they're like, "Oh my

god, AGI is about to be here." And you talk to the person who knows a little bit, it's like, uh, these are just statistical models. You know, how smart

can they be? Like they're useful, but they're not going to change the world.

But then you talk to the people who've actually trained these systems, built these systems, seen how fast they're improving, and they also think AGI is coming in three years. And so I don't know if I know the answer, but I find this kind of fact or observation really

interesting. And so with that, hopefully

you've gotten an overview of basically everything that's happening in the AI

taking the time. Um, if you have more questions or you want to follow up, please email me. My email is here. And if you

want to see the full presentation with a lot more detail, that's just in the link where you would have seen this. And

so, thank you so much.
