Andrej Karpathy: From Vibe Coding to Agentic Engineering
By Sequoia Capital
Summary
Topics Covered
- December Was the Stark Transition to Vibe Coding
- My Menu App Was Completely Unnecessary
- AI Can Hack Code But Can't Tell You to Drive
- We Are Summoning Ghosts, Not Building Animals
- You Can Outsource Thinking But Not Understanding
Full Transcript
We're so excited for our very first special guest. He has helped build modern AI, then explain modern AI, and then occasionally rename modern AI. He actually helped co-found OpenAI right inside of this office, was the one who got Autopilot working at Tesla back in the day, and he has a rare gift of making the most complex technical shifts feel both accessible and inevitable.
You all know him for having coined the term vibe coding last year, but just in the last few months he said something even more startling: that he's never felt more behind as a programmer. That's where we're starting today. Thank you, Andrej, for joining us.
Yeah. Hello. Excited to be here and to kick us off.
Okay. So, just a couple of months ago, you said that you've never felt more behind as a programmer. That's startling to hear from you of all people. Can you help us unpack that? Was that feeling exhilarating or unsettling?
Yeah, a mixture of both for sure. Well, first of all, I guess like many of you, I've been using agentic tools, Claude Code and adjacent things, for a while, maybe over the last year as they came out. They were very good at chunks of code; sometimes they would mess up and you'd have to edit them, and it was kind of helpful. And then I would say December was this clear point where, for me, I was on a break, so I had a bit more time. I think many other people were similar. I just started to notice that with the latest models the chunks just came out fine, and then I kept asking for more and it just came out fine, and I can't remember the last time I corrected it. I just trusted the system more and more, and then I was vibe coding. So I do think it was a very stark transition. I tried to stress this on Twitter, or X, because I think a lot of people experienced AI last year as a ChatGPT-adjacent thing, but you really had to look again as of December, because things have changed fundamentally, especially on this agentic, coherent workflow that really started to actually work. So it was just that realization that had me go down the whole rabbit hole of infinity side projects. My side projects folder is extremely full of random things, and I've been vibe coding all the time. That happened in December, I would say, and I've been looking at the repercussions of it since.
You've talked a lot about this idea of LLMs as a new computer, that it isn't just better software, it's a whole new computing paradigm. Software 1.0 was explicit rules, software 2.0 was learned weights, software 3.0 is this. If that's actually true, what does a team build differently the day they actually believe it?

So, yeah, exactly. In software 1.0, I'm writing code. In software 2.0, I'm programming by creating data sets and training neural networks, so the programming is arranging data sets, and maybe objectives and neural network architectures. And then what happened is that if you train one of these GPT models, these LLMs, on a sufficiently large set of tasks, implicitly, because by training on the internet you have to multitask everything that's in the data set, they actually become kind of like a programmable computer in a certain sense. So software 3.0 is about your programming turning into prompting: what's in the context window is your lever over the interpreter that is the LLM, which interprets your context and performs computation in digital information space.

There are a few examples that really drove this home for me, and maybe they'll be instructive. When OpenClaw came out, you would expect that installing it would normally be a bash script, a shell script: run the shell script to install OpenClaw. But in order to target the many different platforms and types of computers you might run OpenClaw on, these shell scripts usually balloon and become extremely complex. And you're still stuck in a software 1.0 universe of wanting to write the code. The OpenClaw installation is actually a copy-paste of a bunch of text that you're supposed to give to your agent. It's a little skill: copy-paste this, give it to your agent, and it will install OpenClaw. The reason this is a lot more powerful is that you're now working in the software 3.0 paradigm, where you don't have to precisely spell out all the individual details of the setup. The agent has its own intelligence that it packages up: it follows the instructions, looks at your environment and your computer, performs intelligent actions to make things work, and debugs things in the loop. It's just so much more powerful. So that's a very different way of thinking about it: what is the piece of text to copy-paste to your agent? That's the programming paradigm.

One more example that comes to mind, even more extreme, is when I was building MenuGen. MenuGen is this idea where you come to a restaurant and they give you a menu, and there are usually no pictures, so I don't know what any of these things are; usually 30%, 50% of the items I have no idea about. So I wanted to take a photo of the restaurant menu and get pictures of what those things might look like in a generic sense. I vibe coded this app that lets you upload a photo; it runs on Vercel, and it basically re-renders the menu: it OCRs all the different titles, uses an image generator to get pictures of them, and shows them to you. And then I saw the software 3.0 version of this, which blew my mind, which is literally: take your photo, give it to Gemini, and say, use Nano Banana to overlay the items onto the menu. And Nano Banana returned an image that is exactly the picture of the menu that I took, but with the different items actually rendered into the pixels of the menu. This blew my mind, because it means all of MenuGen is spurious. It's working in the old paradigm; that app shouldn't exist. The software 3.0 paradigm is a lot more raw: your neural network is doing more and more of the work, your prompt or context is just the image, the output is an image, and there's no need for any app in between.

So I think people have to reframe: don't work in the existing paradigm of what already existed and treat this as just a speedup of what exists. New things are actually available now. And going back to your programming question, that framing is itself an example of the old mindset, because it's not just about programming, or programming becoming faster. This is more general information processing that is automatable now, not just code. Previous code worked over structured data, right? But for example, with my LLM knowledge bases project, you get LLMs to create wikis for your organization, or for you personally, and so on. This is not even a program. It's not something that could exist before, because there was no code that would create a knowledge base from a bunch of facts. But now you can take these documents, recompile them in a different way, reorder them, and create something new and interesting as a reframing of the data. These are new things that weren't possible. So this is something I keep trying to come back to: not only what existed that is faster now, but the new opportunities, the things that weren't possible before. I almost think that's more exciting.
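The software 3.0 version of MenuGen that Karpathy describes can be sketched in a few lines. This is a minimal sketch, not his actual code: the model id, the prompt wording, and the dict-shaped request parts are assumptions, and only the request-building is library-free; the actual call assumes Google's `google-genai` SDK and an API key.

```python
import os

# Assumed prompt wording; in software 3.0 terms this string, plus the photo,
# is essentially the entire "program".
MENU_PROMPT = (
    "Overlay a generic illustrative picture of each dish onto this menu, "
    "keeping the original layout and text readable."
)

def build_contents(photo: bytes, mime: str = "image/jpeg") -> list:
    """Compose the multimodal request: raw menu photo plus one instruction."""
    return [{"inline_data": {"mime_type": mime, "data": photo}}, MENU_PROMPT]

def illustrate_menu(photo_path: str, out_path: str) -> None:
    # Imported lazily so the pure request-building above has no dependency.
    from google import genai  # assumes the google-genai SDK is installed
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    with open(photo_path, "rb") as f:
        contents = build_contents(f.read())
    resp = client.models.generate_content(
        model="gemini-2.5-flash-image",  # "Nano Banana"; assumed model id
        contents=contents,
    )
    # The edited menu comes back as inline image bytes in the response parts.
    for part in resp.candidates[0].content.parts:
        if part.inline_data is not None:
            with open(out_path, "wb") as out:
                out.write(part.inline_data.data)
            return
    raise RuntimeError("model returned no image")
```

Note there is no OCR step, no image-generation pipeline, and no app in between: the model does all of that inside one call.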
I love the MenuGen progression and the dichotomy you laid out, and I'm sure many folks here followed your own progression of programming from last October to early January, February this year. If you extrapolate that further, what is the 2026 equivalent of building websites in the '90s, building mobile apps in the 2010s, building SaaS in the last cloud era? What will look completely obvious in hindsight that is still mostly unbuilt today?
Well, going with the MenuGen example, a lot of this code shouldn't exist, and it's just the neural network doing most of the work. I do think the extrapolation looks very weird, because you could basically imagine completely neural computers in a certain sense: imagine a device that takes raw video or audio into what's basically a neural net, and uses diffusion to render a UI that is unique for that moment. In the early days of computing, people were actually a little bit confused as to whether computers would look like calculators or would look like neural nets; in the '50s and '60s it was not really obvious which way it would go. Of course we went down the calculator path and ended up building classical computing, and neural nets are currently running virtualized on existing computers. But you could imagine a lot of this flipping: the neural net becomes the host process and the CPUs become the co-processor. We saw the diagram of how the compute of neural networks is going to take over and become the dominant spend of flops. So you could imagine something really weird and foreign, where neural nets are doing most of the heavy lifting, using tool use as this historical appendage for certain kinds of deterministic tasks, but what's really running the show is the neural nets. You can imagine something extremely foreign as the extrapolation, but I think we're probably going to get there piece by piece. That progression is TBD, I would say.
I'd like to talk a little bit about this concept of verifiability: the fact that AI will automate faster and more easily the domains where the output can be verified. If that framework is right, what work is about to move much faster than people realize, and what professions do people think are safe but are actually highly verifiable?

Yes. So I spent some time writing about verifiability. Basically, traditional computers can easily automate what you can specify in code, and this latest round of LLMs can easily automate what you can verify, in a certain sense. The way this works is that when the frontier labs train these LLMs, they use giant reinforcement learning environments. The models are given verification rewards, and because of the way they are trained, they end up progressing into these jagged entities that really peak in capability in verifiable domains, like math and code and adjacent areas, and stagnate and are a little rough around the edges when things are not in that space. The reason I wrote about verifiability is that I'm trying to understand why these things are so jagged. Some of it has to do with how the labs train the models, but I think some of it also has to do with the focus of the labs and what they happen to put into the data distribution, because some things are significantly more valuable in the economy and end up getting more environments built for them, because the labs want to work in those settings. I think code is a good example of that.
There are probably lots of verifiable environments they could think about that happen not to make it into the mix, because the capability just isn't that useful to have. But to me the big mystery is this. The favorite example for a while was how many letters are in "strawberry", which the models would famously get wrong; it's an example of jaggedness. The models have patched that now, I think, but the new one is: I want to go to a car wash to wash my car, and it's 50 meters away; should I drive or should I walk? And state-of-the-art models today will tell you to walk, because it's so close. How is it possible that state-of-the-art Opus 4.7 will simultaneously refactor a 100,000-line codebase or find zero-day vulnerabilities, and yet tell me to walk to this car wash? This is insane. And to whatever extent these models remain jagged, it's an indication that, number one, maybe something's slightly off, or, number two, you need to actually be in the loop a little bit, treat them as tools, and stay in touch with what they're doing. So, long story short, all of my writing about verifiability is trying to understand why these things are jagged, and whether there's any pattern to it. I think it's some combination of verifiable plus the labs care.

Maybe one more instructive anecdote: from GPT-3.5 to GPT-4, people noticed that chess improved a lot, and I think many people thought, oh well, it's just the progression of capabilities. But actually, and I think this is public information, I saw it on the internet, a huge amount of chess data made it into the pre-training set, and just because it's in the data distribution, the model improved a lot more than it would have by default. Someone at OpenAI decided to add this data, and now you have a capability that peaked a lot more. That's why I keep stressing this dimension: we are slightly at the mercy of whatever the labs are doing, whatever they happen to put into the mix. You have to actually explore this thing they give you that has no manual. It works in certain settings but maybe not in others, and you have to explore it a little. If you're in the circuits that were part of the RL, you fly; if you're in the circuits that are out of the data distribution, you're going to struggle. You have to figure out which circuits you're in for your application, and if you're not in the circuits, then you have to really look at fine-tuning and doing some of your own work, because it's not going to come out of the LLM out of the box.
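Figuring out "which circuits you're in" amounts to running a small eval on your own task distribution before trusting the model. A minimal sketch of that workflow, with a stub standing in for a real model API call:

```python
# Tiny capability probe: score a model on a sample of *your* tasks before
# deciding to trust it, stay in the loop, or fine-tune. `model` is any
# callable prompt -> answer; the stub below stands in for a real API call.

def eval_model(model, cases: list) -> float:
    """Fraction of (prompt, expected) cases the model answers correctly."""
    hits = sum(1 for prompt, expected in cases if model(prompt).strip() == expected)
    return hits / len(cases)

def decide(accuracy: float, threshold: float = 0.9) -> str:
    return "ship it" if accuracy >= threshold else "stay in the loop / fine-tune"

# Stub model: great inside its RL circuits (arithmetic), jagged outside them.
def stub_model(prompt: str) -> str:
    return "4" if prompt == "2+2?" else "walk"

cases = [
    ("2+2?", "4"),
    ("The car wash is 50 m away. Drive or walk the car there?", "drive"),
]
acc = eval_model(stub_model, cases)
print(acc, decide(acc))  # 0.5 stay in the loop / fine-tune
```

The threshold and the decision labels are arbitrary; the point is that the probe is cheap and specific to your application, unlike the lab's own benchmarks.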
I'd love to come back to the concept of jagged intelligence in a little bit. If you're a founder today thinking about building a company, you're trying to solve a problem you think is tractable, a domain that is verifiable, but you look around and you think, "Oh my gosh, the labs have really started getting to escape velocity in the ones that seem most obvious: math, coding, and others." What would your advice be to the founders in the audience?

I think that comes back to the previous question. Verifiability, let me think. Verifiability makes something tractable in the current paradigm, because you can throw a huge amount of RL at it. So maybe one way to see it is that this remains true even if the labs are not focusing on it directly. If you are in a verifiable setting where you can create these RL environments or examples, that sets you up to potentially do your own fine-tuning, and you might benefit from that. That is fundamentally technology that just works: if you have a huge amount of diverse data sets and RL environments, you can use your favorite fine-tuning framework, pull the lever, and get something that actually works pretty well. I don't know what the examples of this might be. Well, I do think there are some very valuable reinforcement learning environments people could think of that are not part of the... yeah, I don't want to give away the answer, but there is one domain that I think is very... okay, sorry, I don't mean to vague-post on stage, but there are some examples of this.
On the flip side, what do you think still feels automatable only from a distance?

I do think that ultimately almost everything can be made verifiable to some extent, some things more easily than others. Even for things like writing, you can imagine having a council of LLM judges and probably getting something reasonable out of that kind of approach. So it's more about what's easy or hard. But ultimately, yeah, I think everything is automatable.
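A "council of LLM judges" reduces to aggregating several noisy scores. A toy sketch, with stub judges standing in for real model calls (a real council would query several distinct models and prompts):

```python
# Toy "council of LLM judges" for a non-verifiable domain like writing:
# each judge maps a draft to a score in [0, 1]; the council aggregates.
# The median is used so one eccentric judge can't drag the verdict around.
from statistics import median

def council_score(draft: str, judges: list) -> float:
    """Median of per-judge scores for the draft."""
    return median(judge(draft) for judge in judges)

# Stub judges with different biases (real ones would be model calls).
judges = [
    lambda d: 0.9 if "thesis" in d else 0.4,  # cares about having an argument
    lambda d: 0.8 if len(d) > 40 else 0.3,    # cares about development/length
    lambda d: 0.7,                            # indifferent middle judge
]

draft = "A clear thesis, developed over several paragraphs with examples."
print(council_score(draft, judges))  # 0.8
```

This is exactly the "some things easier than others" point: the oracle here is itself noisy and learned, unlike the exact pass/fail signal available for code.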
Amazing. Okay. So last year you coined the term vibe coding, and today we're in a world that feels a little bit more serious, more agentic engineering. What do you think is the difference between the two, and what would you actually call what we're in today?

Yeah. So I would say vibe coding is about raising the floor for everyone in terms of what they can do in software. The floor rises, everyone can vibe code anything, and that's amazing, incredible. But agentic engineering is about preserving the quality bar of what existed before in professional software. You're not allowed to introduce vulnerabilities due to vibe coding. You're still responsible for your software just as before, but can you go faster? And the spoiler is you can, but how do you do that properly? I call it agentic engineering because I do think it's an engineering discipline. You have these agents, which are spiky entities: a bit fallible, a little bit stochastic, but extremely powerful. How do you coordinate them to go faster without sacrificing your quality bar? Doing that well and correctly is the realm of agentic engineering. So I see them as different: one is about raising the floor, and the other is about extrapolating. And what I'm seeing is that there is a very high ceiling on agentic engineering capability. People used to talk about the 10x engineer; I think this is magnified a lot more. 10x is not the speedup you gain; it seems to me that people who are very good at this peak a lot more than 10x, from my perspective right now.
I really like that framing. When Sam Altman came to AI Ascent last year, one memorable thing he said was that people of different generations use ChatGPT differently: if you're in your 30s, you use it as a Google search replacement, but if you're in your teens, ChatGPT is your gateway to the internet. What is the parallel in coding today? If we were to watch two people code using OpenClaw, Claude Code, or Codex, one you'd consider mediocre at it and one you'd consider fully AI-native, how would you describe the difference?
I mean, I think it's just trying to get the most out of the tools that are available: utilizing all of their features, investing in your own setup. Just like previously, all engineers are used to getting the most out of the tools they use, whether it's Vim or VS Code, or now Claude Code or Codex and so on. So it's investing in your setup and utilizing a lot of the tools available to you; I think it just looks like that.

Maybe a related thought: a lot of people are hiring for this right now, because they want to hire strong agentic engineers, and what I'm seeing is that most people have still not refactored their hiring process for agentic engineering capability. If you're giving out puzzles to solve, that's still the old paradigm. I would say hiring has to look like: give me a really big project and watch someone implement it. Say, let's write a Twitter clone for agents, and make it really good, make it really secure. Then have some agents simulate activity on this Twitter, and then I'm going to use ten Codex agents to try to break this website you deployed, and they should not be able to break it. Maybe it looks like that, right? So watching people in that setting, building bigger projects and utilizing the tooling, is what I would look at for the most part.
And as agents do more, what human skill do you think becomes more valuable, not less?
Yeah, it's a good question. Right now the answer is that the agents are kind of like these intern entities, so, remarkably, you still have to be in charge of the aesthetics, the judgment, the taste, and a little bit of the oversight. One of my favorite examples of the weirdness of agents is from MenuGen: you sign up with a Google account, but you purchase credits using Stripe, and both of them have email addresses. When you purchase credits, my agent actually tried to assign them by matching the email address from Stripe to the Google email address. There wasn't a persistent user ID; it was trying to match up the email addresses. But you could use a different email address for your Stripe and your Google, and then it would not associate the funds. This is the kind of mistake these agents will still make: why would you use email addresses to cross-correlate funds? They can be arbitrary; you can use different emails. It's such a weird thing to do. So I think people have to be in charge of the spec, the plan. I actually don't even like plan mode. I mean, obviously it's very useful, but I think there's something more general here: you have to work with your agent to design a spec that is very detailed, maybe it's basically the docs, and then get the agents to write the code. You're in charge of the oversight and the top-level categories, but the agents are doing a lot of the work under the hood.

So you're not caring about some of the details. As an example, with arrays or tensors in neural networks, there's a ton of detail across PyTorch and NumPy and pandas and so on, all the different little API details. I've already forgotten about keepdims versus keepdim, or whether it's dim or axis, or reshape or permute or transpose. I don't remember this stuff anymore, right? Because you don't have to. These are the kinds of details handled by the intern, because they have very good recall. But you still have to know, for example, that there's an underlying tensor, that there's a view, and that you can manipulate a view of the same storage, or have different storage, which would be less efficient. You still have to have an understanding of what this stuff is doing and some of the fundamentals, so that you're not copying memory around unnecessarily. The details of the APIs are now handed off. You're in charge of the taste, the engineering, the design: making sure it all makes sense, that you're asking for the right things, that you're saying, okay, these have to be unique user IDs that we're going to tie everything to. You're doing some of the design and development, and the agents are doing the fill-in-the-blanks. That's currently about where we are, and I think that's what everyone is seeing right now.

Do you think there's a chance that this taste and judgment matters less over time, or will the ceiling just keep rising?
Yeah, it's a good question. I mean, I'm hoping that it improves. The reason it doesn't improve right now is probably, again, that it's not part of the RL: there's probably no aesthetics cost or reward, or it's not good enough, or something like that. When I actually look at the code, sometimes I get a little bit of a heart attack, because it's not necessarily super amazing code all the time: it's very bloaty, there's a lot of copy-paste, there are awkward abstractions that are brittle. It works, but it's just really gross. I do hope this can improve in future models. A good example is the microGPT project, where I was trying to simplify LLM training to be as simple as possible. The models hate this; they can't do it. I kept trying to prompt an LLM to simplify more, simplify more, and it just can't. You feel like you're outside of the RL circuits; it's like pulling teeth, not light speed. So I do think people still remain in charge of this. But again, there's nothing fundamental preventing it; it's just that the labs haven't done it yet, almost.
Yeah.
So I'd love to come back to this idea of jagged forms of intelligence. You wrote a little bit about this in a very thought-provoking piece on animals versus ghosts. The idea is that we're not building animals, we are summoning ghosts: jagged forms of intelligence that are shaped by data and reward functions, but not by intrinsic motivation or fun or curiosity or empowerment, things that came about via evolution. Why does that framing matter, and what does it actually change about how you build, deploy, evaluate, or even trust them?

Yeah, so the reason I wrote about this is that I'm trying to wrap my head around what these things are, right? Because if you have a good model of what they are or are not, then you're going to be more competent at using them. I'm not sure it actually has real power; it's a little bit of philosophizing. But it's about coming to terms with the fact that these things are not animal intelligences. If you yell at them, they're not going to work better or worse; it doesn't have any impact. It's all these statistical simulation circuits, where the substrate is pre-training, so statistics, and then there's RL bolted on top. So maybe it's just a mindset for what I'm walking into, what's likely to work or not work, and how to modify it. I don't know that I have, like, here are the five obvious ways to make your system better; it's more about being suspicious of it and figuring it out over time.
That's where it starts. Um okay, so you are so deep in working with agents that don't just chat. They have um real permissions. They have local context.
permissions. They have local context.
they actually take action on your be your behalf. What does the world look
your behalf. What does the world look like when we all start to live in that world?
Yeah, I think a lot of people here are excited about what this agent-native environment looks like, and everything has to be rewritten. Everything is still fundamentally written for humans and has to be moved over. Most of the time when I use different frameworks or libraries or things like that, they still have docs that are fundamentally written for humans. This is my favorite pet peeve: why are people still telling me what to do? I don't want to do anything. What is the thing I should copy-paste to my agent? Every time I'm told, you know, go to this URL or something like that, it's just, ah, you know.

So everyone is, I think, excited about how we decompose the workloads that need to happen into, fundamentally, sensors over the world and actuators over the world. How do we make it agent-native? Basically, describe it to agents first, and then have a lot of automation around data structures that are very legible to the LLMs. So I'm hoping that there's a lot of agent-first infrastructure out there. Famously, when I wrote the blog post about MenuGen (well, I'm not sure how famously), a lot of the trouble was not even writing the code for MenuGen, it was deploying it on Vercel, because I had to work with all these different services, and I had to string them up and go to their settings and their menus and configure my DNS, and it was just so annoying. So that's a good example: I would hope that for MenuGen I could give a prompt to an LLM, "build MenuGen," and then not have to touch anything, and it's deployed in that same way on the internet. I think that would be a good test for whether or not a lot of our infrastructure is becoming more and more agent-native.

And then ultimately, I do think we're going towards a world where there's agent representation for people and for organizations, and I'll have my agent talk to your agent to figure out some of the details of our meetings, or things like that. So I do think that's roughly where things are going, and I think everyone here is excited about that.
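One concrete pattern in this agent-first direction is the emerging `llms.txt` convention: a plain-markdown index served at a site's root that tells an agent exactly what to fetch, instead of a docs page designed for human browsing. The snippet below is a hypothetical sketch of such a file; the project name and URLs are invented for illustration.

```markdown
# ExampleLib

> Hypothetical llms.txt sketch: a markdown index at /llms.txt that points
> an agent at machine-readable docs rather than human-oriented pages.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): install steps and a minimal usage example
- [API reference](https://example.com/docs/api.md): every public function signature

## Optional

- [Changelog](https://example.com/docs/changelog.md): recent breaking changes
```

The point of the format is exactly the "what do I copy-paste to my agent" question: an agent can fetch one predictable URL and follow the links, with no menus or settings pages in the loop.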
I really like the visual analogy of sensors and actuators. I actually hadn't thought of that. That's super
interesting, right?
Okay, I think we have to end on a question about education, because you are probably one of the very best in the world at making complex technical concepts simple, and you are deeply thoughtful about how we design education around them. What still remains worth learning deeply when intelligence gets cheap, as we move into the next era of AI?
Yeah. Uh, there was a tweet that blew my mind recently and I keep thinking about it like every other day. It was
something along the lines of um, you can outsource your thinking but you can't outsource your understanding.
And I think that's really nicely put, because I'm still part of the system: information still has to make it into my brain, and I feel like I'm becoming a bottleneck of just knowing what we're trying to build, why it's worth doing, how I direct my agents, and so on. So I do still think that ultimately something has to direct the thinking and the processing, and that's still fundamentally constrained by understanding. This is one reason I was also very excited about all the LLM knowledge bases, because I feel like that's a way for me to process information, and anytime I see a different projection onto information, I feel like I gain insight. So it's really just a lot of prompts for me to do synthetic data generation over some fixed data. Whenever I read an article, I have my wiki that's being built up from these articles, and I love asking questions about things. I think ultimately these are tools to enhance understanding, and that's still a bit of a bottleneck, because you can't be a good director without it; the LLMs certainly don't excel at understanding, so you are still uniquely in charge of that. So yeah, I think tools to that effect are incredibly interesting and exciting.
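The "different projections onto the same information" idea can be sketched as a fixed set of prompt templates applied to one document, with each rendered prompt then sent to whatever chat-model API you use. Everything below (the projection names, `build_prompts`) is a hypothetical illustration, not a description of any particular wiki tool.

```python
# Sketch: "synthetic data generation over fixed data" — several prompt
# "projections" of one article, each meant for a separate LLM call.
# The PROJECTIONS set and build_prompts() are invented for illustration.

PROJECTIONS = {
    "summary":    "Summarize the key claims of this article in five bullets.",
    "socratic":   "Ask ten questions that test whether I understood this article.",
    "flashcards": "Turn this article into question/answer flashcards.",
    "critique":   "What would a skeptical reviewer push back on here?",
}

def build_prompts(article_text: str) -> dict[str, str]:
    """Return one full prompt per projection over the same fixed text."""
    return {
        name: f"{instruction}\n\n---\n{article_text}"
        for name, instruction in PROJECTIONS.items()
    }

prompts = build_prompts("LLMs are statistical simulators of text ...")
# Each prompt goes to a chat model; the answers accumulate into the
# wiki entry for the article, giving several views of the same source.
```

The design choice is that the source text stays fixed while only the instruction varies, which is what makes each answer a different projection of the same information rather than new information.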
I'm excited to be back here in a couple of years to see if we've been fully automated out of the loop and they actually take care of understanding as well. Thank you so much for joining us, Andrej. We really appreciate it.