OpenAI's Codex Lead: Why Coding as We Know It is Over
By 20VC with Harry Stebbings
Summary
Topics Covered
- AI Automation Explodes Engineer Demand
- Product Managers Become Optional
- Human Prompts Bottleneck AGI
- Build Tools for Individuals First
- Single Super Agent Wins Market
Full Transcript
You still need software engineers today.
You still need designers. I'm a PM. Do
you need PMs? You know, you can have some fun jokes about that. I don't think you need them.
>> Today, joining us in the hot seat, we have Alexander Embiricos, product lead for Codex at OpenAI. This is an incredible discussion. Time to get the notebook out. For me, the most exciting future with AI is one where everyone just feels like a superhuman, like empowered by AI. And for that, we need tools that everyone feels fluent with.
>> Your job is the success of Codex.
>> Actually, our job is the distribution of intelligence, and this is really unintuitive, but we put all this effort into training these models and then we serve these models to our competitors.
>> This is so difficult for me as a venture capitalist to understand. Elon said that coding is one of the first professions to be largely automated. Do you agree?
>> For sure, I would agree that coding is one of the first domains where LLMs are really good. But what does it mean for coding to be automated? It's kind of a heavy statement, right?
>> Ready to go.
Alex, I'm so excited for this, dude. I
told you I've been at a PE conference and all I could think was, "Thank God I've got Alex next because this is going to be a great one." So, thank you so much for joining me, man.
>> So excited to be here. Thank you.
>> Now, this is a weird way to start, but roll with it. You'll understand my British intricacies. I'm fascinated by people's motivations. Are you motivated more by the fear of losing or by the thrill and excitement of winning?
>> I'm a maximalist. I'm definitely much more motivated by the idea of winning than the fear of losing. But I'll admit something to you. When I was running a startup before joining OpenAI, one of my darkest moments, and there were many dark moments while I was running the startup, was recognizing that I had spent the past few months trying to avoid losing. All of a sudden I was like, oh my god, that is why I'm so unhappy, and that's probably why the startup isn't going well. So every now and then I have to catch myself and flip back into this idea of winning. But really, what motivates me even more than that is that I just love building things, and building things for people. And man, I am so excited for this year, because many amazing things that don't exist yet are going to be built and given to a lot of people.
>> I'm diving right in. Elon said that coding is one of the first professions to be largely automated. Do you agree, given your position and what you see day-to-day?
>> I think for sure I would agree that coding is one of the first domains where LLMs are really good. But what does it mean for coding to be automated? It's kind of a heavy statement, right? For example, now that we no longer write assembly, when that change happened and we moved to higher-level languages, did we say coding was automated? Not really, right? We were just able to write much more code, and as a result there was actually much more demand for code, and many more software engineers were required. But yeah, part of what they used to do is automated, in the same way that, do you know the origin of the word computer?
>> No.
>> Um, I might pronounce the location wrong, but I think it was at Bletchley Park. There were all these machines for decoding German Enigma. And there were humans who would punch out punch cards, put them into the machine, and do a bunch of tabulated math. I'm probably butchering this, but basically there was an intensely manual part of the work. Even the first spreadsheet software was loosely based on this idea that you would have an office full of desks arranged in a grid, with people doing tabulations and then passing their sheets to the next person. And so all those specific tasks have become automated, but every time that's happened there's been an explosion in demand for the output, and so you actually need many more people to do that kind of work, even if the specific task has changed.
>> So you think we will have more engineers in five years, not fewer?
>> Yeah. And, you know, sometimes we change what terms mean, right? The term computer now refers to something else, but now we have the term software engineer. So I definitely think we'll have many more builders. Something interesting that I'm observing now is this compression of the talent stack. Like, you still need software engineers today. You still need designers. I'm a PM. Do you need PMs? You can have some fun jokes about that. I don't think you need them. But maybe when you say engineer, you might be thinking of someone who's much more full-stack than has been true before. Even if you go back a few years, there were many more places where you had the backend engineer and the front-end engineer, right? Whereas now, at least if I think about the Codex team, that's much less the case and things are much more full-stack. And so I think this talent stack will compress, but we'll still have people building.
>> Why do you think we don't need PMs in this world? You dangled the carrot.
>> Yeah. It's my fun joke. Well, first of all, I think it's incredibly hard to define what a PM, a product manager, is. I kind of think of the role as explicitly undefined, and your goal is just to adapt to whatever the team or business needs. Often, if you have a bunch of people, like here, trying to build as quickly as possible, then what a product manager can do is spend time taking a few steps back, trying to look around corners, and figuring out what to do. You know, collaborate with the folks in go-to-market, and maybe be the team's greatest cheerleader and quality-raiser. But all of those things I just described, which are maybe my current role, could be done by a really strong eng lead or a designer who thinks a lot about product. And so I think it's often useful to have product managers, but you probably don't want many of them until the team is really large.
>> I was stalking the [ __ ] out of you for the last few days, which was a very fun expedition into your writing, your tweets, your prior interviews, and you said that human typing speed and validation work is the key bottleneck to AGI, not model compute or architecture. You kind of left it there, and I was like, help me understand why human typing speed and validation work is the key bottleneck, and what you really meant by that.
>> For sure. Okay, that's a fun one. I think there are multiple bottlenecks, but that's maybe the most clickbait one. So, if you don't mind, we'll do this slightly Socratically. How many times would you say you use AI today?
>> 30 plus times a day.
>> Okay, cool. How many times do you think, assuming it was zero energy expenditure from you, how many times do you think AI could help you per day?
>> I mean, in in everything, I think we'll have inference running 24 hours a day across every single thing.
>> Exactly. And I hear things now from engineers at OpenAI and also outside who are telling me, you know, I constantly have Codex running. I never close my laptop, and if it's not running while I'm in a meeting, I'm wasting my time. I need to make sure Codex always has work that it's doing. And that's super cool and super exciting, but that's a lot of work, right, to manage these agents and make sure they're always working. Going back to the 30 times per day thing: when we look at how often Codex users are using Codex, it's kind of in this tens-of-times range. I think AI should be helping us tens of thousands of times per day. Compute budget permitting, we'll get there over time. But the problem is, at least if I think of myself, I work on this stuff. I know I should be using AI for everything, but I'm too lazy to type out that many prompts, and I am too uncreative to figure out all the ways that AI can help me. And so I end up at kind of a similar number as you. I still am at the point where when I use AI to do something cool, like prep for this conversation with you, I'm kind of proud of myself. I'm like, "Oh, cool. I managed to use AI in this new way." And that's fine for people like you and me who are really interested in this topic, right? But I don't think most people should need to put so much effort into figuring out how to use this tool in order to benefit from, you know, AGI. It should just be effortless for them. And so I think the world we want to get to is one where, to use AI, you don't really need to figure out the right way to prompt. It's just super easy for you, and you don't even need to recognize that AI could help you. It just knows you, is connected to your context, and chimes in helpfully.
>> That's where I think Claude has done well in terms of the packaging they've done, like Claude for legal, Claude for Excel, where you can implement it and have a DCF model better than one could do before. Do you think it is your job, then, to productize the prompts and the human actions to remove that bottleneck?
>> Yeah, totally. So I think that it is our job to make sure that we have the models with amazing capabilities, and then eventually to get to a world where this is highly productized, so you just have this magic text box or audio input or whatever, or you can just add AI to your group chat and it just starts to help. But I think there's quite an interesting in-between stage, and I think that is actually where the most value lies right now. So here's what I mean. You could try to productize a specific feature of AI for a specific market, and I know that many companies are doing this, but I think it's a little bit hard to know what exactly will work, what is the right form factor. You know, someone was on your podcast earlier, and they said something that I thought was quite interesting about how you cannot adopt AI at enterprise without FDEs.
>> Yeah, it was Matt Fitzpatrick from Invisible Technologies.
>> Yeah. So, even though I am literally hiring FDEs, and if you're an FDE, please apply for a job with me, I actually disagree with that entirely. What I think we need to do is build tools for people. You can use FDEs, as Fitzpatrick said on the podcast, to automate workflows, right? But then you're limited by what you, from your top-down perspective, can do, and what you, from your FDE staffing, can staff to be built, right? But for me, the most exciting future with AI is one where everyone just feels like a superhuman, just empowered by AI. And for that, we need tools that are for people, for individual users, and that everyone feels fluent with. I think the phase that's most interesting, the one we're at now, is building for the kind of people who are interested in figuring out how to use AI. So what we need to ship, and I think this was the genius of Claude Code when it first shipped: what they really got right was they had this tool that was super easy to use in whatever context you wanted, just in your terminal, and people started experimenting with where to use it. And so I think as we think about AI being used outside of coding work, one of the most important things we can do is not overly narrow it, like, okay, this is AI capability but only specifically for finance, only specifically for this workflow, but actually build a much more open-ended tool that someone can just use for any given task, creatively. Yeah.
>> But does that not put the onus or the effort back on the user, back to the point of your bottleneck of human action and lack of activity? If you don't define the task, you put the responsibility for defining the task on them, which humans lack the ability or inclination to do.
>> Yeah, and that's why I think it's the bottleneck. So basically, here are the three phases in my mind. First, let's have agents work really well for software engineering and coding, because LLMs happen to be good at that. Next, let's realize that for an agent to be useful more generally, it being able to use a computer is super valuable, and also that all agents are actually coding agents, because coding is just the best way for an agent to use a computer. So let's take that same super flexible idea but make it available to anyone who's excited to explore and tinker. We're already seeing people start to do this with the Codex app: the Codex app is built for builders, but we're seeing builders use it for all sorts of non-coding tasks. Then finally, once we see what's working, let's build that productization that you were talking about, where you have highly specific features that just work immediately out of the box for people. And I think we're going to speedrun this entire one-two-three journey in the next months.
>> My challenge with what you said about FDEs and implementation within enterprise is that data security, sensitivity, permissioning, and access provisioning are really freaking hard, and people are much less intelligent and confident than we give them credit for. I think especially in large enterprises, sorry. And I think you actually need an FDE to go in and custom-fit a lot of the different horizontal solutions to make it work. Am I wrong?
>> I think you're right, if you're trying to go all the way from zero to one and you have, and I don't mean grand negatively here, a grand vision for some ultimate workflow automation system. Then, yeah, you're going to have to clear through all of these security hurdles, all these compliance hurdles that are really real, right? Build connections to all these data systems, systems of record and action. So you're going to need an FDE to do that. What I've seen is that when we do these things top-down, we end up massively underleveraging the potential of AI in helping that company. Maybe you do that in parallel, right? But if you can just give AI to the people actually doing the work, they can start to get a mental model for how AI can help them, and then they can start pulling AI into their workflows at the same time. Here's an analogy: imagine you work in a customer support role, and AI is being brought into your role and starting to automate meaningful chunks of your work, but you've never heard of ChatGPT, nor are you allowed to use it. In that scenario, you have no intuition for what this thing is. Whereas in a world where you've actually been using ChatGPT for work at the same time as parts of your work are getting automated by an LLM, you have much more intuition for how this works. And, you know, I would argue you feel much more empowered about the idea that it's being accelerated. You have some degree of control to steer where these automations are built, as opposed to it being this complete ex machina kind of thing, which is quite disempowering. So, bringing this back: I think there is a way to do this, because the data control issues you mentioned are real, right? But at the end of the day, every tool, every feature, every workflow is for a human who is somewhere, an employee somewhere, and that employee is accessing that tooling via their browser or via their file system. And so, at the end of the day, everything comes to an interface that an agent running locally on your computer can work with. And, you know, I think it's quite unusual: at OpenAI we're building a browser, Atlas, right? You might wonder why, and there are many reasons why, but I think one of the key reasons is that by building a browser, and by controlling it tightly, end to end, we can build safe agentic browsing for enterprise. That is a way to access things agentically that are otherwise not yet built out by FDEs.
>> There are so many questions that I have to ask you. I want to go back before I lose the thread. You mentioned engineers not closing their laptops because they don't actually want to lose productivity and time building with Codex. You partnered with Cerebras, and Cerebras is obviously the fastest provider of inference out there. An amazing win for both, I think, bluntly. How important is speed for developers when using Codex, and in the future of AI code?
>> I mean, the simple answer is it's super important.
>> And so is it like an inference monopoly, like you have it now and competitors don't?
>> This is just my opinion, but I don't think we're going to end up in this kind of monopolistic world. I think there's so much competitive pressure that there'll be multiple answers to this. But I will say that we have news coming out about that partnership soon, and I'm very excited for these kinds of things to ship. It's going to be awesome. But even so, with GPT-5.3-Codex, that model is significantly more efficient than prior models. And the feedback we've heard is that people actually feel like this is now a very competitively fast model compared to before. So there's a lot you can do just in terms of the model. There are also things you can do in improving how you do inference. We recently rolled out a change where in the API those models are served about 40% faster, and in Codex they're served about 25% faster. So I think speed matters a lot, and we're approaching it from all angles: the hardware, how you do inference, and the model level.
>> You mentioned earlier about putting it in the hands of users, and we talked about inference there. One of my dear friends is Jason Lemkin from SaaStr, and he says that actually inference is the new sales and marketing. Instead of sales and marketing teams, you're paying for inference so users can onboard quickly, easily, see value, and you will actually see the removal of sales and marketing teams. It's kind of like the next gen of PLG.
>> Hm, I don't know. I struggle with that. I think, fundamentally, in this new world where anyone can build, and it is increasingly easy to build things, what is hard, right? I think having a good relationship with customers, knowing what they need, is as hard as ever, maybe even harder, as there's just more stuff in the market to choose from. You know, the other things that are hard are building the right thing, having a really high-quality thing. But going back to the sales and marketing point, I don't think that goes away, because, like I said, I think that's just gotten harder as any given market gets more competitive with more software out there.
>> Can I ask how much of internal code for you today is produced by Codex? I remember Boris from Claude Code said it was like 100%, or nearly 100%. How much is Codex used internally?
>> So I'll speak for myself and then for the team. I would say most people that I know are basically not opening editors anymore. And this was a step-function change. It's been happening gradually, but I'd say the external market touchpoint for this was GPT-5.2-Codex, where all of a sudden the model was way better at running for longer, handling tasks end to end, managing its context, and following instructions. We saw this inflection point, and that's actually part of why we built the app. So broadly, before GPT-5.2-Codex, the kinds of AI features we were using to write code were tab completion, or maybe you were pair programming with the model. In my mind, you still needed to be at your laptop with your hands on the keyboard-ish. It might go off and do a little bit of work, but you kind of still needed to be there and drive; it's just handling these small things for you. Then at the time of GPT-5.2-Codex in December, we kind of switched to: actually, I'm just going to fully delegate this task. I'm going to do a plan with it, make sure we like the spec for what it's going to do, and then I'm just going to let it cook. And this is quite a different way of working. It's changing literally as we speak. So part of why we built this Codex app that we released last week is because we wanted to build a form factor, a user experience, where it felt very ergonomic to be delegating instead of pairing with an agent, and delegating to multiple agents at once. So even at OpenAI this is changing massively. I don't have a percentage stat for you, but I would say the vast majority of code is written by AI, and I would say that now probably most people are not even opening IDEs. Maybe if they are opening IDEs, it's because you want to own the interface, right? So you'll help flesh out the interface between two modules and then AI fills it out, or maybe you want to collaborate on a plan but then have AI fill it out. The code itself is not being written by humans anymore.
>> Will we have IDEs as a part of the stack in 24 months' time?
>> Okay, so the formal definition, right: integrated development environment? I mean, that phrase is so squishy that literally anything could be an IDE, right? So I don't think that's very useful. If that's the definition, then yes, you could even argue the Codex app is an IDE. I don't think it is. For me, an IDE is a really powerful editor. We explicitly didn't build editing into the Codex app because we wanted it to be really clear how you're meant to use it. So it has a lot of affordances for managing multiple agents, for delegating, for reviewing changes. It has really prominent skills, which are an open standard that's really useful for doing non-coding work, stuff like triaging tasks or monitoring deploys, but it doesn't have text editing.
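In practice, skills like the ones described here are small folders of markdown instructions (and optionally scripts) that an agent loads on demand. A hypothetical skill for the triage example above might look like this; the path and contents are purely illustrative, not taken from any shipped product:

```markdown
<!-- .agents/skills/triage/SKILL.md (path and contents are illustrative) -->
# Triage incoming bug reports

When asked to triage an issue:

1. Read the issue title and body.
2. Label it `bug`, `feature-request`, or `question`.
3. If it is a bug, check whether reproduction steps are included;
   if not, reply asking the reporter for them.
```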
>> If we assume a large percentage of the code produced is done by Codex, how do you do code reviews, and is AI responsible for internal code reviews?
>> So there are a few things here. First off, the spec for what you want to do, the plan, becomes more important than ever, right? Think architecturally: how should this code work? We recently shipped a very prominent plan mode that works a little differently than others, where you have the agent go off and propose how it's going to do something. It's quite a long plan, and then it asks you questions about whether you agree with how it wants to do it, or whether you want to give input. This is very similar to having a new hire who is new to your codebase and has to present a sort of request for comments to the rest of the team before they start doing the work. So even though that's not formally code review, I would say review of the plan is actually something that's becoming more important, because we're entering more of this delegation phase of working with agents.
So that's an underrated thing. Then, okay, there's actual code review. A problem that I hear a lot of people talking about, especially in the open source world, is a lot of AI slop. People will just be submitting PRs to these open source repos and they're trash; maybe the person submitting the PR hasn't even tested them, and definitely hasn't reviewed the code. I think this is a problem. And so a common practice with Codex is to have Codex review its own PR, its own change. And Codex is actually incredibly good at this. We've explicitly trained the model to be good at code review. That included things like making sure it's really good at creating high-signal feedback, so it'll have few false positives of criticism, which means you can really trust it when it has feedback. And so not only do we encourage people on the team and elsewhere to just ask Codex to review, you can then also set it up to review automatically. Nearly all code at OpenAI is reviewed by Codex automatically whenever you push it to a repo. Actually, one fun thing for people who haven't tried Codex yet, or didn't try it recently: sometimes the way that people see how good our models are is by asking Codex to review a different model's code, and basically they're like, "Oh, shoot. I should probably just be using Codex to write my code in general."
>> You said something really interesting there. You said, for those that maybe haven't tried it yet or are coming back to it. How do you think about retention in this category? I remember Tom Blomfield, who's a YC partner, tweeted months and months ago, and it stuck in my weird brain, about the ease of transition between different providers, whether it was Cursor or Claude Code or Codex, I can't remember which one it was, to be honest. But how sticky are users, and how do you think about retention?
>> We've taken a kind of counterintuitive approach with Codex to
just build it super openly. The Codex core harness is open source, and we're always trying to make it easier for people to switch. For instance, when we first launched Codex last year, we created, and created is even a heavy word, we just established a convention called AGENTS.md. This is basically a file that you can put instructions for the agent in. We didn't call it codex.md; we wanted it to be something that all agents can use, and pretty much every agent except Claude uses AGENTS.md, which is awesome. Then just last week, we helped push for putting skills, which are a standard for giving the agent instructions and scripts, in a neutrally named folder called agents, instead of in codex or something. And again, everyone has jumped on it except the usual suspect. So I think it's really great for developers to have a lot of choice, and we're trying to make it even easier for people to try different things. Now, that said, these coding tasks, where you're asking an agent to write some code, are quite hermetic. What I mean by this, or maybe an analogy in TV would be episodic, right? You can come in, and you've got this open-ended agents file that any agent can read from. You've got these skills that any agent can use. You can ask the agent to write some code, it produces a patch, and that patch goes into git. So both ends of this are pretty neutral, vendor-neutral. So it's very easy to move between for now. As agents start to do work that is not writing code, but more general work, again for software engineers or beyond, for any builder,
they're going to need to start interfacing with other systems, right? So maybe your agent is talking to Sentry, or it's talking to your Google Docs or something. Then I think these agents become much stickier, because actually deciding to connect an agent to that system is a sticky decision. And if you're an enterprise, really trusting that the agent is going to have access to these tools, but that there are really good secure guardrails, sandboxing, and controls over how the agent works with these systems, I think is critically important. And that's not something that you're going to want to do multiple times. So, you know, we've been building Codex knowing that this is coming, and so we have the most conservative sandboxing approach. Sandboxing is kind of like a set of controls, OS-level controls, over what the agent can do.
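Sandboxing like this usually pairs OS-level isolation with harness-level policy. As a toy sketch of the policy half only (the allowlist and function names here are hypothetical, not how Codex actually implements it), a harness might gate which shell commands an agent is permitted to execute:

```python
import shlex

# Hypothetical allowlist of executables the agent may run; a real harness
# would combine a policy like this with OS-level isolation.
SAFE_COMMANDS = {"ls", "cat", "git", "python3"}

def is_allowed(command: str) -> bool:
    """Allow a shell command only if its executable is on the allowlist."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes, etc.
        return False
    return bool(tokens) and tokens[0] in SAFE_COMMANDS

print(is_allowed("git diff"))       # True
print(is_allowed("curl http://x"))  # False: network tool not allowlisted
```

The point of the sketch is only that a conservative default denies anything not explicitly permitted, which is why connecting and configuring an agent for an enterprise's systems is the sticky, do-it-once decision described above.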
>> But I'm a fan of Seven Powers, this brilliant book which talks about seven ways that businesses accrue value and sustainability, and, you know, your stickiness or your retention is one. If we're on the same team with Codex, how do we create retentive patterns, behaviors, programs, to ensure that people stay with Codex and don't flip to Cursor when there's a better model, or Claude Code when there's a better model?
>> Yeah, it's interesting, because on the one hand we obviously think about this; we're running a business. But our mission here is to ensure that we safely deliver the benefits of AGI to all humanity, and so something that's unintuitive to people about the Codex team...
>> Alex, I know, but your job is the success of Codex, I guess.
>> Our job is the distribution of intelligence. So we're obviously building out Codex, and this is really unintuitive to a lot of listeners, but we put all this effort into training these models, and then we serve these models to our competitors, right? And from our perspective...
>> This is so difficult for me as a venture capitalist to understand. You are aware of this.
>> Yeah, I'm totally aware. This is where OpenAI is a really interesting and unusual place to work. But basically, because we're playing such a long game, if the competition gets better, we learn. It's actually helpful for us. And so we're pushing really hard at growing Codex.
>> If they're closed...
>> Yeah.
>> ...and they improve, you don't learn.
>> I don't think so. For example, there are a bunch of recent launches. Even today, I literally just quote-tweeted a launch from Warp this morning, no particular affiliation, and there are a bunch of cool ideas in there about how they framed up the way their agent can work in the cloud at the same time as working locally. For me that's inspiring, and I see these things from various companies; one of the coolest things about the space is that we're all kind of inevitably reaching the same conclusions together and then building things out. And so on the Codex team, I think we have some massive advantages: the massive distribution advantage with ChatGPT, and the massive capability advantage of training our own models to be good in our harness and building our harness to be good with the new models, and no one else has early access to those. So I think we're playing to win, and we have a number of really big advantages, but we're also playing this long game where, again, we serve our models to everyone, and we push for open standards so that everyone can use all the things that we're pushing for as well. Can I
ask you what will be the defining factor of winning? And I I know I'm using
of winning? And I I know I'm using venture language and you're you're brilliant in kind of much more free and open. Uh but what was like the defining
open. Uh but what was like the defining factor of winning? Again, if I push you, is it like GTM, which is like the biggest enterprises in the world do want to work with Open AI? I have many friends in your sales team. The inbound
that you get from the largest brands is incredible. So GTM because of the
incredible. So GTM because of the incredible brand product execution and just codecs being a freaking awesome
product or compute inference speed actual like compute advantage which one is the defining winner >> okay so I think if we're going to talk
about it from an OpenAI perspective, obviously this is way above my pay grade, but I would say it's compute advantage and having the best models, right? And in order to achieve that, we then need to build businesses to generate revenue. Something really interesting we've noticed with having the Codex team, which is a combined team of research and product, is that by building these successful products, we create a lot of pressure to improve the model faster. So that's maybe the company perspective. If we come to the product perspective, I think the single most important thing we can do is build a really good product that people want to use. And like I was saying earlier, I think we really want to build products for individuals, then allow people to become fluent in those products, and then pull in automation. That may be counterintuitive, but I think it will result in way more impact than purely approaching it from the enterprise workflow perspective. So I think that's mostly a question of product execution, and that works for, say, prosumer. When it comes to enterprise, the go-to-market side is
really important. Something that I've learned the hard way is that if we go to an enterprise and we're just like, "Hey, we're here, feel free to use the stuff," that doesn't work. There's actually quite a lot of education that needs to be done, a lot of configuration we need to support, and education of the broader team. So that motion looks much more like coming in, pitching, meeting the head of developer experience or whatever, understanding how they want their team to operate, and then giving them tools to propagate that way of operating to the rest of the team.
>> You said the word revenue there, which is one metric to measure a business against. When you think about your metric of success, which you sit down with Sam or Brad or whoever it is and say, "Hey, this is what we're optimizing for," what is the metric that you use as the defining north star for your progression?
>> It's actually not revenue that's primary. The primary is active users.
>> How do you measure active users? Like daily active users?
>> We measure weekly active users, and it's just, you know, did this person actually do a turn in our product? Did they send a prompt?
>> Is weekly active a frequent enough metric, do you think? Sounds nice, but if this is actually replacing the IDE, is daily active not better?
>> I think daily active will be better soon. Yeah, we just happen to use weekly active; it's a standard here, and as we were getting started, it made sense. But I actually agree with the criticism there. We should probably just be on daily.
I think we need to get to a world where, for any given task that you have, your first instinct is to ask an agent to help. It's kind of like Google search: anything I need to find, I just go to this text box and it can navigate me to the right location. Then you have ChatGPT: for any information I need, I can go to this text box, type it out, and get information that helps me. And I think the next phase we'll see this year is: for any task I need to do, as opposed to just getting information, I go to this text box or this input and something happens that helps me, even if it's not the full task, even if it's only a small part of it.
>> You said something about chat and the interface there. I'm really fascinated by this, because it's a seemingly incredibly efficient input function for busy humans. But I spoke to Anish Acharya, who's a GP at Andreessen Horowitz, and it came out the other day, and he's like: no, no, this was created by Sam and Elon and it works for very efficient people, but most of the planet wants browser-based discovery interactions, UIs. Do you think that chat will be the enduring UI in the next wave of AI interaction with humanity?
>> The simple answer is yes, but actually I think there are two components here. If we just imagine the future, let's think of some sci-fi movie: what does AI look like? I believe that sci-fi is a really good predictor of what the future should look like, and usually it's pretty simple, because it's a story. And I think simple is usually right.
It's going to be some entity that I can talk to however I want, about whatever I want. I shouldn't have to navigate to a place where I work with my coding AI and then have a different place for my sales AI and have to say, "Hey, I'm now talking to the sales thing." I'm just going to talk to a thing, and it's going to help. So I think what we're going to have is chat or voice, basically a conversational interface, as the pillar of everything: something you can talk to about anything, and that you can add into any group chat or whatever so it can discover how to help you. But then if you're a power user and you're very good at a specific thing, you probably don't want to be disintermediated by having to talk to another person. It'd be like having an executive assistant you can only work with by talking to them; that's super annoying, right? At some point, you want to get to the show notes and look at them yourself and edit the thing yourself. So I think we'll pair chat with functional, graphical interfaces that are bespoke to what someone needs. In my case, I will probably chat to do my podcast prep, but when it comes to actually looking at product and code, I probably want the Codex app that I can go into and get deep in. Whereas if we're talking about a marketer, maybe that marketer will chat to ask questions about the product. They're not going to download the Codex app just to ask questions about the product, but maybe they'll have a super-custom GUI for ad analytics or something that they go into.
>> Totally get that. And it kind of wrongly assumes, on my behalf, a consumer interaction at some point in that journey. I want to ask you: how do you think about agent-to-agent experiences, and designing experiences for agents? We spoke, for example, about going to large enterprises and how you can be helpful. I'm using the most boring thing ever, expense approval: you could have agent submission of expenses on my behalf for my trip to San Francisco, and then the agent on the flip side doing approvals for that from OpenAI's compliance department. Agent to agent. How do you think about that and that paradigm shift?
>> My quickest answer to this is that we've noticed, as we build Codex, that the best
interfaces for Codex to do work also tend to be the best interfaces for humans. So when people ask, "How can I make my codebase more efficient for the agent to work with?", the answer is often: well, have you looked at it yourself, and is it easy for a human to work with? A very specific example would be running tests in a codebase. Naively, most test runners, as you first set them up, emit the output of every test. As a human that's really annoying, because you have to go in and find the one that failed, and you've got to read hundreds of thousands of lines. Turns out that's terrible for AI as well. But if you filter it down to only emit the failed tests, it's better for humans and also better for agents. So the agent-to-agent interaction points will probably be very similar to what they'd be with a human in the loop, and that's nice, because it means you can atomically replace individual systems.
>> I mentioned our show on LinkedIn, and a wonderful investor from a different company... it's like
Harry Potter, you know, Voldemort, he who shall not be named. I don't want Sam to kill me, but this investor from another company was like, you've got to ask him: how do you think about a coding data moat? And does Anthropic have all the data now?
>> I definitely don't think they have a significant advantage in terms of data on coding. From what we've seen, and I would defer to my research team on this, we feel like we have plenty enough data to build really good coding models. I actually think the more interesting place for getting data now is as we get into knowledge work tasks. That's data that's not really available most places on the internet. So you start to have really interesting brainstorms about how to help a model be good at it. Maybe you have to pay people to simulate doing tasks so that you can learn those trajectories for the model. Maybe you should acquire startups that are no longer in business but have a lot of data, say their Slack or something. Yeah, I think that kind of knowledge-work task distribution is much harder than coding.
>> That's so interesting, what you said about the data that doesn't exist, so to speak. How do you think about your interactions with the data providers, your Mercors, your Turings, your Invisibles of the world? Will you spend 10x there, or will you go: we're spending too much on data, we should do it ourselves and do data acquisition?
>> Yeah. I mean, the way we think about these things is just: how do we move as quickly as possible? Becoming able to set these things up in-house is very expensive in time, and we're a small team. So what I have observed so far is that if we need to run a data campaign at scale, we're usually going to enlist help from one of these companies.
>> On the consumer side for Codex, we spoke about enterprises and going into them, and how to engage in terms of developer experience and developer relations. Do you compete with a Lovable or a Replit on a low-end consumer basis in a year or two's time? Or is that a business where you're like, you know what, Codex is not for every person to create an about-me page, or for a small business to create their own site? How do you think about consumer in that way?
>> Yeah, I would say that right now it doesn't feel like we're competing super directly. But I don't know if you saw our Super Bowl ad, the tagline of which is just "You can just build things." With the app, we noticed that many, many people who are less technical are starting to build things, and the kinds of things they're building are much more hello-worldy. So I think we will see some overlap in use cases, where you have people just pulling up Codex because they have it as part of their ChatGPT. Actually, a big announcement last week was that we're now offering some Codex to people even on free ChatGPT plans, or on the ChatGPT Go plans. This is massive just in terms of bringing availability to everyone. And so I think we're definitely going to see people with a free ChatGPT plan coming in and just building simple things, where they otherwise might have gone to a specialized tool.
>> What would you most like to do differently, but for whatever reason you can't?
>> This is an interesting one. I feel like it's been a very good few weeks for us, so I'm pretty jazzed by everything that's happening. But maybe the feeling that I had the most...
>> Yeah, that's really interesting. You said it's been a very good few weeks for us, and I feel that. Does the team feel changing winds of momentum, both in positive and negative cycles?
>> Absolutely. We are very attuned to it. If you look at the history of Codex, the first thing we launched last year was this amazing idea that people were super excited about: hey, we're going to give the agent its own computer in the cloud, and you can have as many of them as you want working for you in parallel on tasks. Super great idea. To be honest, it didn't work as well as what we shipped later; it was not the best. And then since August, with GPT-5, we started pushing really hard on interactive coding, which is where most of the competition in the market is. And we went on an absolute tear. I feel like the public metric we had was that since August we grew by about 20x, and then even late in the year we doubled from December to now; I forget the exact number there. But that was competing neck and neck. The shift that we felt last week is: we had felt like we had the most intelligent model, and that was cemented with GPT-5.3-Codex. We had feedback around our model being slower, maybe less fun to work with, and less good at communicating with you while it was working. We addressed that feedback.
And that's true even compared to the other competitor model that launched 20 minutes before us and was, maybe this is spicy, SOTA for 20 minutes. SOTA means state of the art. And then, we'd always been getting a lot of feedback on the quality of the user experience in Codex. Our most popular surface was the IDE extension, and our CLI, which is a command-line interface, was less polished. But with the app, the feedback from the market has been resounding that this is a really high-quality experience. It's simple, unintuitively simple, and people are just loving using it; even our biggest critics are converted. So yeah, and then we had the Super Bowl ad, and then we went free.
Going back to your question of what I most want to do differently: the first is that I actually want to get back to cloud. When we pivoted our strategy last year from focusing on the cloud agent to working interactively, the thinking was very simple, and it's kind of like what I was telling you about FTEs, actually. If you go too far ahead to workflow automation before your end user is fluent with the tooling and can get it to work simply, then there's this disconnect, and you just have this pipe-dream idea that's not effective except for the most power users. But once you have this base where people are using your tool every day, like you said, and they're configuring it, and every time they use it it gets better, then the step up to letting it run independently in the cloud is a much smaller step, right? So I think it's time for us to get back to building out the cloud product and making it super tightly integrated with the local product; it already is somewhat integrated. And the other thing I want to do differently is to start thinking more about the bottlenecks. Codegen, writing code, has become basically trivial now, but the hard part is what you were talking about with code review: how do we know the code quality is good, how do we know we're doing the right things? Those bottlenecks, I think, are still underappreciated and underinvested in. I think we want to get to a world where you can have an agent that is unbottlenecked, that you trust to own an entire microservice or internal tool or whatever, and that can do the full iterative loop, including feedback from users, without having to go through human review. And that is a really hard problem to solve, both from an intelligence perspective, but also from a safety perspective and a controls perspective.
>> How much weight should we place on benchmarks and evals?
>> I think, and this is probably an annoying answer for you, some weight. They do tell you something; in my mind, they give you a good measure of intelligence, so you can put weight on them for intelligence. And especially before evals are saturated, when you see meaningful progress on those benchmarks, it's very helpful. But then I think you have to pair that with what it feels like to use the model, and that's a vibes thing. Whenever I talk to anyone, even internally, or to customers of our models, I'm always surprised by how vibes-based the evaluation of how it feels to work with a model is.
>> How vibes-based life is. People want to work with people they like; that's the lesson that I give to kids.
>> Yeah, relationships matter.
>> Can I ask you: I think that Cursor will lose half of their revenue this year. I think they'll go from a billion to 500 million, as a bold statement. Agree or disagree?
>> Can I just say no comment?
>> Uh, yeah, you totally can.
>> I don't know. I think it's really hard to say. The more serious answer here is that I think they've built a really successful business. We see them a lot when we're in enterprise.
>> Is that Cursor, or is it just Claude Code? Because I don't know anyone that has neither.
>> I see Cursor a lot more than Claude Code, and it makes sense to me. My narrative for this is that you have to meet people where they're at, and for most people, they're used to using an IDE. They've been used to tab completion even before there was AI; tab completion existed pre-AI, and AI just made it better. So what I think is coolest about Cursor, from my perspective, is that it meets developers exactly where they are, and it's a simple switch: you used to be using VS Code or something, you switch to Cursor, and almost nothing about your workflow is broken. Everything works, just certain aspects got better. And obviously I still use VS Code; there are reasons you might like it more, and they're improving rapidly as well. But I think that pitch from Cursor lands well with a lot of people. And so the bet on Cursor, I think, is that they can continue meeting people where they are and then ladder into these more advanced agentic features. That relationship with a customer is valuable, and I don't think it goes away.
>> Do you think it was the right strategic decision to start building their own models?
>> It's hard to say, but I feel like there is a bit of a gap in the market right now for that kind of model. And again, my thesis, and I'm not super close to working with Cursor or anything, but my thesis for how they win is that they meet everyone where they are and make it really easy to step up into using more advanced agentic workflows, right? So maybe they noticed that the models that, for example, we were putting out, or some of the competition were putting out, were kind of slow relative to what their customers wanted. My first magic moment in Cursor was when I hit Command-K the first time; that's a feature that lets you select some code and just edit it in line, and I was like, this is incredible. So if they noticed that a lot of their customers want to be able to pair with the AI, and then maybe after pairing with the AI for a while they start doing more delegation and move it to the cloud, then there's a gap for that fast model that they trained. So I think that makes sense in that context.
>> In terms of market composition, as an investor I have to think through how I think
about the eventual state of this given market, kind of a terminal stage. How do you think about that? Is it like Uber and Lyft, where the majority of the market will be on Codex or Claude Code? Or is it like AWS, Azure, Google Cloud, and a 33/33/33 split?
>> I think this might end up with fewer providers capturing a lot of the value in the long run, and here's why. Maybe this is a bit spicy, but I think we're in this temporary phase where we have agents that are really good at coding. If you look back, last year maybe more people thought we would have agents that are good at other domains too, but that didn't happen. So we only have PMF for coding agents in the industry overall, I would say, plus some very narrow other use cases like customer support, etc. But I think that's probably temporary, and over time I think we're going to end up with agents that can do anything for you. This is kind of what I was saying earlier: there's a super assistant that you talk to about anything, and then there's a specific UI that you can go look at if you happen to be deep in a specific function. In that world, I don't think you want twelve agents at the company, where your employees have to go figure out the right one to talk to, because then they won't achieve fluency, and if they don't achieve fluency, they also won't pull automation into their roles. But if you have this one thing that you can talk to about anything, so your onboarding is just "go talk to this thing about anything you need," then people will develop muscle memory to go to it. It'll become the center of gravity of work, and people will pull in automation. So I think that future makes much more sense, and I think, as the people building ChatGPT, we're really well set up to deliver it. This is kind of a stretch, but an analogy here: I used to work at Dropbox, and for a while, this was before Slack was big, we wondered whether people should comment on documents in Dropbox or go talk about the documents in Slack. It was obviously more optimal for people to put comments on the right timestamp in the video in Dropbox, or comment on the document in Dropbox, right? However, what we saw is that Slack is just such a center of gravity of people talking to each other. Nobody wants to comment on the document; I just want to Slack you. And so we saw this really big pull toward things happening in Slack, even if it was less efficient. I think we're going to see something similar at work: if there is a single agent you can use for nearly anything, there will just be this giant pull, and everyone will talk about how they use that one agent for things. Teams will share best practices with each other; there'll be hackathons around how to use that one thing best. And you'll end up with just a handful of these.
>> You said something about agents not really proliferating in terms of usage other than coding, and maybe this being the time, and customer support is one of the examples. My question to you is: I'm an investor today, looking for companies which will accrue value over time and provide incredible products to customers. There is a belief that the durability of revenue of large SaaS companies today is zero, and that SaaS is dead, because the model providers, you, Anthropic, others, are going to come for our lunch, so to speak. What would you advise me?
>> Things are built for humans; otherwise, what's the point, right? Even SaaS tools are built for humans. So for me, the question is: does this SaaS company own a relationship with a human on the other end of things? If it does, then I suspect it's not going away. Or does the SaaS company own some really important system of record? Then it's probably not going away. Maybe both of those two things, the interaction with the human and the system of record, are more important than ever, actually. On the other hand, is the SaaS company kind of a glue layer that doesn't own either of those two things? I'm not the expert here, but I'm more nervous about that kind of company.
>> So then, if we take that stance, Salesforce and ServiceNow, you know, they're down 20, 30, 40%. They shouldn't be.
>> I don't think they should be. I don't know, what do you think? I would love to hear your take on this.
>> I think it's massively exaggerated. There are some companies that legitimately should be down; respectfully, I think Dropbox is in a very difficult position. But for your Monday.coms of the world, for the majority of SMBs and consumers who use it, which is the large majority of their market: could they vibe-code a to-do list? Yes. Would it be cost-efficient to do so? Not really, actually, by the time you customize it and perfect it. And to be honest, a to-do list is generally pretty bland in terms of what you need to do: add task, complete task, show historical tasks, assign to new members. It's not very difficult. So actually, I think you just keep it. I think it's massively overblown, and I think that's the classic knee-jerk reaction from markets.
>> I completely agree. I mean, if anything, now that it's so much easier to build...
>> But I do think, sorry, I do think you're going to come for customer support, and I wouldn't want to be in that category.
>> I think this maybe changes what kind of founder you invest in, right? I think there was this maybe temporary phase, one that I liked personally as a product builder, where you would invest in the person who could just build good product, and you could kind of ignore whether they had a good thesis around a customer or go-to-market or distribution or anything like that, because it was so hard to build good product. And I think that was an anomaly. If we look at where we are now, maybe that kind of founder is not the founder you should invest in, because it's now relatively easier to build good product, and you need to go back to investing in the founder who has thought through distribution and who has good domain expertise about what to build for a specific customer.
>> So again, if you were on my team as an investor, how would you think about interesting areas for us to invest in, companies that will accrue value and not be threatened by model providers? Because again, you're going into health, you're going into code, obviously Codex is very clear, you're going into customer support. [ __ ] Where are you not going? Where's Claude Code not going?
>> I'm tempted to just say I don't know. I think it's a hard time to be an investor. The market is so dynamic, it's hard to say.
>> It's a really tough time to be investing today.
>> My answer is kind of twofold, actually. Number one, I look for things with physical infrastructure. I don't think you're going into energy supply. And then two is the fintech and banking integrations, gnarly financial products. I don't think OpenAI is going to go into building 500 relationships with banks in Southeast Asia.
>> Yeah, I tend to agree. It again comes back to: are you going into a gnarly, complicated market where customer relationships and knowledge of the market are everything? That still seems great.
>> How bad is the war for talent? From the UK, we look at SF and I say to companies, it's better to build in Europe because it's impossible to acquire talent and it's impossible to retain it. Am I wrong?
>> I think that the war for talent is incredibly fierce right now. Obviously at OpenAI we have an incredibly strong brand, so we're able to attract a lot of talent. But even so, we put a ton of effort into closing candidates that we're really excited about. Even we feel it; it's not like you just get whoever you want for free.
>> Can I ask at the entry price that you get stock at, is it still attractive for the best talent?
>> I haven't had anyone tell me anything to the contrary.
>> To what extent do you think about finding the perfect fit versus finding someone who's good enough?
>> So, you know, earlier I made my joke about PMs kind of being optional.
>> Yeah.
>> I think that's not actually true. You still need product people, but I do think that they have to be the perfect fit. If you have someone who's not the perfect fit, they might just do more harm than good. So it kind of means that we're way more selective than I might have been in other roles.
>> I'm a CS student, okay? I'm at Stanford. I'm at Imperial. I'm at Cambridge. I'm wherever. ETH. Great institution. What would you advise me, knowing all that you know now, that would help me navigate the next five years of my career? I want to be valuable to the AI ecosystem as an engineer entering the workforce in the next year.
>> Basically, there's actually never been a better time to be an engineer, because you have incredible tooling available to you to get an incredible amount done, and your ability to ramp into a complex codebase that you might be hired into has never been faster, because you can ask AI a ton of questions about the codebase and ask it to plan out changes that would otherwise take you days to research. So first off, I would say you should be very optimistic about your abilities once you're at the job. Then the question is: how do you get the job? Because it's never been easier to build things, the things that become scarcer are agency, taste, and quality. So I would urge you to just build things, demonstrate your agency and your taste around what you build, build things that are of high quality, and then share those things. We get a lot of inbound from folks, both applying for jobs through the careers page and also on social. This is just me, but when someone writes to me with some interesting thoughts and a link to an interesting project, that gets my attention much more than a normal resume does.
>> Final question before we do a quick-fire. What has Claude Code done well that you sit back and learn from?
>> A number of things. So, like I was saying, way back last year they made something that was really easy to use and just worked with all your tools with zero setup, by running locally in your terminal. And when we started investing much more in the Codex CLI and shipped great models for it, like GPT-5, our growth exploded. So I think that idea of meeting people where they're at, giving them something easy to use, and letting them ramp from there and figure out how to use it has been awesome. That's probably the biggest learning we've had from them.
>> What mistake do you think they made that you've also learned from having had the benefit of seeing them make it?
>> I think that they overindexed on their initial success with their command-line interface tool. At the end of the day, it's not the friendliest UI, and it makes it hard to extend beyond pure builders. It also makes it difficult to truly delegate to agents, because effectively, to delegate through that kind of interface, you have to be kind of a power user of your terminal or tmux or something. That's why we built the app. And the market reception around the app, which was kind of a risk when we started, makes me feel really good about that decision, because the Codex app is a much more intuitive, simple interface to get started with. It's less scary, but then it naturally leads you to this idea of taking your hands off the keyboard and delegating to the agent.
>> You mentioned Dropbox earlier. The alumni from Dropbox are incredible. Really amazing to see the talent that's come out of Dropbox. What's your single biggest lesson from Dropbox that has shaped your thinking now at OpenAI?
>> Oh, I don't need to think about that one. That's kind of the thing I was telling you about earlier, right? When you're building tooling for people, for end users, you have to think about that tooling as a system of engagement. If people don't want to use your tool, if it doesn't naturally feel like the easiest way to get something done, then people just won't use it. I learned that from watching how Slack just absolutely took off. And I think about that a lot now when we're building these agents. If we build our agent purely as workflow automation, then it's always going to be like pulling teeth to get that thing started, right? You're going to need to hire Accenture or someone to come in, they're going to need to deploy FDEs, it's going to be tough. But if you can build a system that people just love using, even if they only use it for partial tasks, over time they'll get better and better at using it, it'll get connected to the tools you want, and then you can start layering in automation. Obviously these aren't mutually exclusive.
>> How on earth do you reinvigorate growth at Dropbox today?
>> At least from when I was at Dropbox, the thing we were uniquely good at was desktop software. And desktop software, it's funny, it was never not back, but anyway, it's so back. Basically, if you're solving for productivity and knowledge work, yes, there are systems of record everywhere that you need to connect with, but everything at the end of the day happens on the user's computer, either in their browser or just locally in apps on their computer. So I do think the fastest way we're going to see productivity gains from agents at work is by first meeting users on their computer, working with the stuff that they have available to them, without having deployed FDEs to set anything up. And then over time, you'll connect in these various systems. So if I were Dropbox, I'd be thinking about how to leverage our unique domain expertise in building really good desktop software, this sort of collaborative layer on top of your computer. How do we leverage that to enable productivity agents? It's a bit broad, but I think that's the angle you go for.
>> No, I love it and I really appreciate the response. Final one before we do a quick-fire, I promise. I've been brought up in a world where margin matters. Software margins are wonderful, and that's what makes software a brilliant category to invest in. We're seeing margin profiles that are very different in inference-heavy players in particular. To what extent should I put that out of mind and appreciate that costs will come down, the cost of tokens will come down, and actually it's about usage and customer love, and margins will come? Or no, margins are actually freaking important, keep that focus?
>> I think both. Costs are going to come down significantly. And I also think that if this is the year of agents being deployed broadly at work, then this is also the year where they're going to have to be connected to all these various systems, and I think that's going to be very sticky. So I view this year as a race, and I think you want to win that race and should be okay taking some hit to margin in the meantime.
>> Dude, quick-fire round. I say a short statement, you give me your immediate thoughts. Does that sound okay?
>> Yeah.
>> What have you changed your mind on most in the last 12 months?
>> When I joined OpenAI, and this was a little longer than 12 months ago, I thought that we would all just be hanging out with our computers screen sharing, but that within a year from there we'd have this agent that we're just talking to. That was completely wrong. I think the rate of progress in multimodal models, meaning models that work with video and audio, was slower than I expected. Instead, what happened was that we saw that agents that work with your computer through code are the way. And so for me, that's been a complete rethink in terms of how we bring the benefits of AI to people generally. It's not through video and audio primarily.
>> Which lesser-known competitor do you respect most, and why?
>> The first one that came to mind was Amp.
>> I think they're building... yeah, Amp.
>> Okay. It's out of the folks at Sourcegraph. Their product has a great reputation for punching way above its weight. But the other thing that I really respect is that they helped initiate this whole standardization around AGENTS.md and agent skills, which is what I was saying earlier about making it easier for users to manage all these different agents that they're trying. We obviously put out AGENTS.md, but they put out AGENT.md, and basically Quinn started this all by putting out a tweet that said, "Hey, if you guys buy the domain agents.md, we'll standardize to your spelling." As small as that was, it initiated this whole standardization that I think has been awesome in the community.
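[For context on the convention discussed here: AGENTS.md is a plain markdown file kept at a repository root that gives coding agents project-specific instructions, now read by multiple agent tools. A minimal illustrative sketch; the contents below are hypothetical, not from the episode:

```markdown
# AGENTS.md

## Project overview
A TypeScript monorepo; packages live under `packages/`.

## Setup and checks
- Install dependencies with `pnpm install`.
- Run `pnpm test` and `pnpm lint` before proposing changes.

## Conventions
- Use named exports; avoid default exports.
- Never edit generated files under `packages/*/dist`.
```

An agent reads this file before acting, so it plays roughly the role for agents that a CONTRIBUTING guide plays for human collaborators.]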
>> Do you think the response to Anthropic's ads was the right response?
>> I mean, there were so many different responses. The one that I heard, obviously, I think was right. One company is being pretty negative about the future, and the other company, us, OpenAI, is being really positive and just telling people they can build things and to dream. I thought that response was brilliant.
>> I mean Sam wrote an essay.
>> Do you think it was a good response?
>> I think so. One of the cool things that I love about OpenAI is that people are very unapologetically and authentically themselves. And so for me, that was just a very authentic response, and I like that we do that.
>> What's the hardest product decision you've had to make since being at Codex?
>> Well, I can tell you the most painful product decision we had to make.
>> Great.
>> For a while, Codex Cloud was effectively unlimited. Not free, you needed to pay for ChatGPT, but then you had unlimited usage. Every day that we left it that way, we knew it would be harder to wind back it being unlimited, but we were just so focused on competing on our other things that had more PMF that we kind of punted that decision. And when we wound back that unlimited use to some more reasonable limit, there was a lot of blowback from users. It was a very small minority of users who thought everything should be kind of pseudo-free forever, but that blowback affected us everywhere, because the social chatter doesn't really distinguish between these things. So the lesson I learned the hard way there is that you can't make things unlimited for too long.
>> Dude, it's like pricing grandfathering. Pricing is just such a hard thing. What do we do today in engineering or product that in five years' time you'll look back on and go, "Oh my god, can you believe that we did that?"
>> Well, one is just editing code by hand. Another one, and this is maybe spicier, might even be actually managing the deployment and monitoring of systems by hand. I basically think that big companies will probably take a long time to deploy this, but many startups might actually start building on a completely new stack that's fully AI managed. To be clear, that stack doesn't exist yet, but it would be a fully AI-managed stack built to give you really strong deterministic guardrails over what the agent can do, and control to roll back deploys and everything like that. And so we'll get to a world where the way you start a company is you start by getting an agent and just asking it to build things, then you get more agents in there, and then maybe eventually you add your co-founders to this service that you use to work with agents. You end up where maybe your main communication tool is actually your agent communication tool, and you're not handholding this very painful CI and deploy process, you're just having agents do things.
>> Weird question, but I'm intrigued. Are you the one providing agent guardrails? What I mean by that is, agents can go anywhere within an enterprise. Are you responsible for providing those guardrails, or is there a third-party provider who is saying, "Hey, Alex, you can't go into that, that's human resources," or, "You can't go into that, that's marketing"? How do you think about guardrail provisioning, and is that the role of the agent provider or a third-party provider?
>> I think we'll probably see both. We are putting a lot of effort into agent guardrails. Like I said, we're basically the only company that cares about OS-level sandboxing for coding agents, for instance; there's none that exists on Windows, and we're the ones building that. And we're doing it in open source, so hopefully other people can use it. We think about that a lot. ChatGPT supports connectors, so you can talk to your Google Docs or something, and we put a lot of effort into guardrails around what the agent can do with your Google Docs. Those are just two examples, but we think a lot about this. Probably, though, the way that we do it will not be sufficient; there'll be third parties who provide very bespoke things for very bespoke company needs, and there'll probably be a mix of both.
>> Final one for you, my friend. What are you most excited about when you look forward 10 years?
>> This is probably going to happen in much less than 10 years. But my mission, sort of personally, when I joined the company, was that I just felt like even with the models we had a year and a half ago, there was so much capability overhang, so much ability for these things to be useful, but we hadn't built the right products around that. And so people like me were getting more benefit than people like my grandma. What I'm most excited for is to get to a form factor for AI that means it's helping everyone, regardless of whether they're in tech, and especially if they're not in tech, or especially if they're older. The concrete vision I have is that at some point we'll add an agent to our family WhatsApp or something, and it'll just start being useful to the family without anyone having to think harder about it than that. There are many other ways that could happen, but I think concretely that's the most obvious thing we could do with my grandma.
>> Dude, I so appreciate you. I so appreciate you putting up with my wandering questions and my very episodic mind. You've been fantastic, man.
>> Thanks so much. I appreciate you putting up with my wandering answers. So, all good here.