How to Build a Self-Improving Company with AI
By Y Combinator
Summary
Topics Covered
- AI Makes the Traditional Hierarchy Obsolete
- Build a Self-Improving Company Loop
- Burn Tokens, Not Headcount
- Record Everything to Make Your Company Legible
- Software Is Ephemeral, Context Is King
Full Transcript
This is based a little bit on a talk Diana gave. There's a video that went up over the weekend, which is super cool. Jack Dorsey was tweeting some stuff two or three weeks ago that I thought was super cool, and I've kind of stolen a bunch of those ideas and shoved them in here. This talk is pretty conceptual and high level, about how to think about building companies.
So the Roman legions were designed to project power over two continents or something, from Rome at the center out to these people on Hadrian's Wall up near Scotland. The idea was nested hierarchies with consistent spans of control: you had named individuals, each with a span of control, to pass orders down and send information back up the hierarchy. And if you think about most companies today, they are organized like a Roman legion, where human beings are the conduit for information flowing up and down. Jack Dorsey's tweet, which I thought was great, was about this underlying assumption that hierarchically organized companies are the way we should be organizing our economic units of value. And I think AI basically breaks that.
If you talked to people a year ago about how AI was useful, they talked about productivity: copilots, making engineers 20% more productive, adding copilots to workflows, shipping more software. But I think that is actually a broken way of thinking about AI. Pete had a great blog post on this: we're basically just taking the old way of working and adding a more powerful engine onto it. Instead of that, I think you can reimagine what a company is and how it acts. As Garry was saying, he, I genuinely believe, can produce more code than an entire engineering team.
The thing that's really stuck with me is this idea of extracting the domain knowledge from your company and defining it as context, or a set of skills, or whatever you want to call it. There's domain knowledge, business knowledge, some know-how that's inside people's heads and in Slack messages and in emails and in Notion. All of this information together defines how your company works. And if you can make that legible, you can suddenly move from this hierarchical organization to a sort of intelligent, AI-powered organization with AI-native software. AI is not something you bolt onto the side of a company. It's not a tool you give to your engineers to make them more productive. I think you can reimagine what a company is as a set of recursive, self-improving AI loops. I think this is really, really important, because when it gets there, the company starts to self-improve even when you're sleeping.
So, let me give you an example. Diana talks about this as well: this AI loop. You start with a sensor layer, which is a fancy word, but really it might be emails from your customers.
It might be support tickets, code changes, people canceling their subscriptions, product telemetry. It's sensor data to get information from the outside world. Then a policy layer, or decision layer: rules about what the AI can do, what it has to ask a human's permission for, what it must log. A tool layer, which is kind of Garry's skills and code. It's basically deterministic APIs, things like "query my database" or "look at my calendar": a set of tools that the AI can call. A quality gate, which might be eval or heuristic checks, safety filters, human review for high-risk stuff. And then a learning mechanism: your system interacts with the real world, picks up where it doesn't work, and loops back into the top again.
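
The five layers just described can be sketched as a single loop. This is a minimal illustration under stated assumptions, not how any YC system is actually built: every name here (`Event`, `policy`, `TOOLS`, `quality_gate`, `run_loop`) is hypothetical, the tools are stubs, and the "learning" step only collects failures for a later pass.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    """One item from the sensor layer, e.g. a support ticket or a cancellation."""
    kind: str
    payload: str


@dataclass
class Decision:
    tool: str                  # which tool the policy selected
    needs_human: bool = False  # policy can demand human sign-off


# Tool layer: deterministic functions the AI may call (hypothetical stubs).
TOOLS: dict[str, Callable[[str], str]] = {
    "reply_to_ticket": lambda text: f"Drafted reply to: {text}",
    "flag_churn": lambda text: f"Logged churn reason: {text}",
}


def policy(event: Event) -> Decision:
    """Policy layer: what may run unattended and what needs a human."""
    if event.kind == "cancellation":
        return Decision(tool="flag_churn")
    if event.kind == "refund_request":
        return Decision(tool="reply_to_ticket", needs_human=True)
    return Decision(tool="reply_to_ticket")


def quality_gate(output: str) -> bool:
    """Quality gate: a trivial stand-in for evals and safety filters."""
    return bool(output) and len(output) < 500


@dataclass
class LoopState:
    audit_log: list[str] = field(default_factory=list)
    human_queue: list[Event] = field(default_factory=list)
    failures: list[Event] = field(default_factory=list)  # input to the learning step


def run_loop(events: list[Event], state: LoopState) -> LoopState:
    """Run each sensed event through policy, tools, and the quality gate."""
    for event in events:
        decision = policy(event)
        state.audit_log.append(f"{event.kind} -> {decision.tool}")  # always log
        if decision.needs_human:
            state.human_queue.append(event)
            continue
        output = TOOLS[decision.tool](event.payload)
        if not quality_gate(output):
            state.failures.append(event)  # learning: revisit these later
    return state
```

Calling `run_loop([Event("ticket", "login broken")], LoopState())` logs one decision and leaves both queues empty. In a real system the `failures` list is what would feed back into the top of the loop.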
And if you can run every single step of that without human intervention, or with minimal human intervention, your system gets better and better while you're sleeping. I can give you actual examples of this that are live right now. We started with an agent that you can ask questions, and it has deterministic tools to query our database. Pretty simple, like: when did I last have office hours with this company? Then it got a little bit smarter: the company I'm doing office hours with right now needs introductions to anyone in petrochemicals or something, and it could query the database in different ways and use RAG and all sorts of stuff to come up with five relevant founders for you to meet. But again, this is a sidekick, right? This is an agent. This is last year's version of how AI is making me better as a group partner. It's making me 20 or 30% more effective.
The aha moment for me came when we put a monitoring agent on top of that, which looked at every single query every single YC employee was doing and saw when it worked and when it did not. And when it did not work, it asks: why not? What would have made this query work? Do we need different deterministic tools? Do we need to update the skills file? Do we need a different database view? Do we need a new index? And this literally happens overnight now: write the code, put in a merge request to the YC codebase, have an agent review it, merge it, and deploy it. So when a human comes the next day to ask the same query, it will now succeed. For me, that was like the holy [ __ ] [ __ ], right?
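
A rough sketch of what the monitoring step might look like: scan the query log, group the failures, and rank candidate fixes. Everything here is an assumption made for illustration. The `QueryRecord` shape, the failure labels, and the `FIX_FOR_REASON` mapping are hypothetical, and in a real system the diagnosis would presumably come from a model reading the failed query rather than from a pre-assigned label.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QueryRecord:
    user: str
    question: str
    succeeded: bool
    failure_reason: Optional[str] = None  # hypothetical label, e.g. "missing_view"


# Hypothetical mapping from a failure label to the kind of fix to propose,
# mirroring the questions in the talk (tools, skills file, view, index).
FIX_FOR_REASON = {
    "missing_tool": "add a deterministic tool",
    "stale_skill": "update the skills file",
    "missing_view": "add a database view",
    "slow_query": "add an index",
}


def propose_fixes(log: list[QueryRecord], min_count: int = 2) -> list[tuple[str, int]]:
    """Return (proposed fix, failure count) pairs, most frequent first."""
    reasons = Counter(
        r.failure_reason for r in log if not r.succeeded and r.failure_reason
    )
    return [
        (FIX_FOR_REASON.get(reason, f"investigate: {reason}"), count)
        for reason, count in reasons.most_common()
        if count >= min_count
    ]
```

The output of `propose_fixes` is only a ranked to-do list. Writing the code, opening the merge request, and reviewing it are the later steps the talk describes, and they are not modeled here.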
That's not just AI making you 20 or 30% more valuable. It is the AI going through this loop to figure out how to self-improve. And I think basically, if you can identify parts of your company that work like this and have the human in a kind of monitoring or supervisory capacity, you can just throw tokens at this problem and your company will get better. Other examples: if you have product analytics, have an agent go through them to figure out which part of your sales funnel is presenting the most friction, research best practices, put in place an A/B test, run it for a week, pick the best version, and deploy it. Then do that again and again for your product.
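
The "pick the best version" step can be made concrete with a small sketch. The talk does not say how the winner is chosen, so the pooled two-proportion z-test below is a swapped-in, textbook technique, and `Variant`, `pick_winner`, and the 1.645 threshold (one-sided, roughly 5%) are assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    visitors: int
    conversions: int

    @property
    def rate(self) -> float:
        """Conversion rate, or 0.0 when there is no traffic yet."""
        return self.conversions / self.visitors if self.visitors else 0.0


def pick_winner(control: Variant, challenger: Variant, z_threshold: float = 1.645) -> Variant:
    """Keep the control unless the challenger is significantly better."""
    if control.visitors == 0 or challenger.visitors == 0:
        return control  # not enough data to decide
    total = control.visitors + challenger.visitors
    pooled = (control.conversions + challenger.conversions) / total
    std_err = math.sqrt(
        pooled * (1 - pooled) * (1 / control.visitors + 1 / challenger.visitors)
    )
    if std_err == 0:
        return control  # identical all-or-nothing outcomes, nothing to compare
    z = (challenger.rate - control.rate) / std_err
    return challenger if z > z_threshold else control
```

Defaulting to the control when the evidence is weak is a deliberate choice here: a loop that runs unattended every week should not ship changes on noise.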
Just have a self-optimizing product loop. Or you do it with customer service queries. You have customer suggestions coming in and in and in. You triage them with an agent that acts like your chief product officer and your chief technology officer, making judgment calls: okay, this is a suggestion we just don't want to do, so we'll discard it; but this one is in line with our roadmap and we can do it overnight, so let's write the code, deploy it, and ship it to the customer without a human being involved. So I think if you can think about each part of your company as a self-improving, recursive AI loop, it becomes a very, very different company from this hierarchically organized Roman legion.
So if you want to do this, what are the implications? One is: burn tokens, not headcount. We are seeing companies get to Demo Day with about 5x more revenue per employee than they did 18 months ago. And I think that's going to continue to Series A and Series B. So I think you're going to be constrained on token usage, not on headcount, really, really soon. The blunt measure now is just measuring everyone's token usage, which is obviously dumb and gameable at the extreme, but directionally I think it's correct. We're in the phase of figuring out what is possible right now, so everyone should be experimenting to the max to figure out what we can even do with this crazy new intelligence we have. As soon as you turn it into a leaderboard and people get promoted or fired based on it, obviously it gets gamed, obviously that's dumb. But directionally, figuring out who in your organization is token-maxing and who is not is a good way to think about which employees you should be spending your time with.
I think middle management is done. I just don't think you need middle management for this coordination problem. I think AI should be doing it. For me, there are two roles. Jack Dorsey has three. I actually don't like the third one, so I deleted it. But there are two roles that really, really matter for me. I think everyone just has to be an IC now: a builder, an operator. And crucially, you need directly responsible individuals. To get anything done, I think you need a named human: not a committee, not a group of people, just a single person. I think you can build companies based on ICs effectively. I think middle management is just over.
So, building this self-improving company: that's a dream. And by the way, I think people are at the bleeding edge of this right now. I'd be interested to see where you all are, but it feels like people are exploring the boundaries here. I'm not sure anyone has a truly self-improving company in every function. I might be wrong. You might prove me wrong.
What would I do? First of all, and this is really, really important, I would make the entire organization legible to AI. What does that mean? It means you've got to record everything.
Simplistically, all of our partner emails: now, if you email a YC partner, that email is in the YC database. Every Slack message, every DM, every office hour, which we've been recording for the last three or four months. Every single thing that happens: if it is recorded, it happened to the AI. If it did not get recorded, it did not happen to your intelligence. You know what I mean?
And so, I was talking with some founders over here just now, and we were having really good conversations about their companies, but in every conversation I had, I was like, "Fuck, I need to be recording this conversation." Because some guy wanted an introduction to, I can't even remember who the introduction was to now. Who was that? I was talking to someone and I promised you an introduction. I said yes. And I said, "Email me afterwards, because I'm going to forget this. I'm going to talk to 20 people." So it needs to be on my phone, or a clip, or smart glasses, or we deck out every room with microphones. But basically, everything needs to be recorded so that it can be legible to the AI.
And then, as Garry talked about, diarization: you cannot pump 100,000 hours' worth of recordings into a context window. So you have to diarize it. You have to basically aggregate it down, synthesize it into the important parts, and then give the AI breadcrumbs.
Okay, so here's an example. Who's read the user manual? The YC user manual. Hopefully everyone in this room has at least opened the user manual at one point in time, right? It's fine.
It was written 5 to 10 years ago, most of it. It's kind of out of date. So Harj thought last weekend: since we've now got about 2,000 hours of recorded office hours from the last 3 months, why don't we regenerate the user manual? You give it a set of instructions. You basically diarize it down, categorize it into certain areas like fundraising, hiring, co-founder disputes, whatever, and then say, "Write me a new user manual." By the end of the weekend, he had a 150-page user manual, which is dramatically better than the existing one. And now we can also update it every single month. So our user manual becomes self-improving. Every new piece of advice we give is compared with the existing user manual and either incorporated or thrown away. So the user manual becomes this up-to-date living brain of the advice we give to founders.
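
The "compared with the existing user manual and either incorporated or thrown away" step could look something like the sketch below. It is only a stand-in: the talk implies a model makes this call, whereas this example uses plain string similarity from Python's standard `difflib`. The function names, the manual-as-dictionary layout, and the 0.8 threshold are all assumptions.

```python
from __future__ import annotations

from difflib import SequenceMatcher


def is_near_duplicate(new: str, existing: list[str], threshold: float = 0.8) -> bool:
    """True if `new` closely matches any advice already in the section."""
    new_norm = new.lower().strip()
    return any(
        SequenceMatcher(None, new_norm, old.lower().strip()).ratio() >= threshold
        for old in existing
    )


def update_manual(manual: dict[str, list[str]], category: str, advice: str) -> bool:
    """Incorporate `advice` under `category` unless it is a near-duplicate.

    Returns True when the advice was added, False when it was discarded.
    """
    section = manual.setdefault(category, [])
    if is_near_duplicate(advice, section):
        return False
    section.append(advice)
    return True
```

For example, starting from `manual = {"fundraising": [...]}`, each new piece of advice from a recorded office hour would be passed through `update_manual` with its category, so the manual grows only when something new is said.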
And obviously it doesn't stop at a user manual. You then pump it in as context to an AI agent, and suddenly you can ask a super-intelligent AI and get the combined wisdom of 16 YC partners in one, but only if it's legible. So you have to record everything. The second point is kind of the same, right? If it creates an artifact that can self-improve, it's legible. If it doesn't, you throw it away. The third point, then, is that every function can generate, and this used to say dashboards, but it's not just dashboards: it's on-demand software. Codex 55 is now good enough. You can one-shot most simple internal software dashboards to a pretty high level of quality. I tried it over the weekend on a bunch of our stuff. It's just unreal.
So all of your internal operations teams should be sitting on this layer of intelligence and understanding, and then creating their own dashboards and their own workflows. And I would see those as entirely disposable. I would very preciously store all the data. As Garry said, he puts all of his emails in Markdown. Never throw anything away, but treat the software as ephemeral. You can generate it, you can regenerate it. The valuable part is the comprehension inside people's heads: this is how the function works.
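
One way to picture "store the data and instructions, treat the software as ephemeral" is a wrapper that keeps the instructions as the durable input and regenerates the artifact whenever the instructions or the model change. This is a sketch under assumptions: `EphemeralSoftware`, `artifact_key`, and the `generate` callback are hypothetical names, and the generator stands in for whatever code-generation model is used.

```python
from __future__ import annotations

import hashlib
from typing import Callable, Optional


def artifact_key(instructions: str, model_version: str) -> str:
    """Fingerprint of the durable inputs: the instructions plus the model used."""
    return hashlib.sha256(f"{model_version}\n{instructions}".encode()).hexdigest()


class EphemeralSoftware:
    """Holds one generated artifact and rebuilds it when its inputs change."""

    def __init__(self, generate: Callable[[str, str], str]) -> None:
        self._generate = generate          # hypothetical code-generation call
        self._key: Optional[str] = None    # fingerprint of the current artifact
        self._artifact: Optional[str] = None

    def get(self, instructions: str, model_version: str) -> str:
        key = artifact_key(instructions, model_version)
        if key != self._key or self._artifact is None:
            # Inputs changed (or first use): throw the old software away.
            self._artifact = self._generate(instructions, model_version)
            self._key = key
        return self._artifact
```

Nothing about the generated artifact is persisted beyond the current fingerprint. What would be stored carefully is the instruction text and the underlying data, which is the part the talk calls valuable.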
This is how we run a YC event, say. Whatever software you need to actually run the event, you can generate for the event. You can throw it away. The models get smarter in a month or two: throw the software away, give the model your original set of instructions, and regenerate the software. So I think the business context and skills are the valuable part. I think the software on top of it is ephemeral.
So what are humans for in this world? I think basically we're talking about a company brain, and I know a bunch of people in this room are building this. The bit in the middle, all of your data, all of your emails, your DMs, the skills, the know-how: that is the company brain. And I think the humans sit around the edge of this, interfacing with the real world. It's where this intelligence makes contact with reality. Human beings reach into places the models can't go yet. That might be a conference. It might be, I'm trying to think of examples. I would say a phone call, but I think the AI can reach into phone calls pretty easily now. I think it's novel situations, ethical considerations, high-stakes moments. It's where the founder comes to us and is thinking about breaking up with their co-founder, right? It's those real high-stakes, high-emotion moments where you really want a human being. I think that's where the human fits. For all of you, sales conversations: I think that's a human being in the room for the next 20 years.
So the humans live, I think, around the edge. And I'm over time and cool vision should bullhorn me. I will leave you with this one question: if you were building your company today, would you start it in this shape? Most of you are small enough to build it right, so I don't think you have any excuse. And I know there are a few of you who are in the process of ripping up and rebuilding your company. So with that I will stop, and we'll hand over to Pete. Thank you for listening.