Andrej Karpathy Just 10x’d Everyone’s Claude Code
By Nate Herk | AI Automation
Summary
Topics Covered
- One X User Cut Token Usage 95%
- Give AI a High-Level Idea and It Builds
- One Article Spawned 23 Wiki Pages Automatically
- Markdown wikis beat semantic search on cost
- Wiki graphs work until you hit millions of docs
Full Transcript
What you're looking at right here is 36 of my most recent YouTube videos organized into an actual knowledge system that makes sense. And in today's video, I'm going to show you how you can set this up in 5 minutes. It's super easy. You can see here how we have these different nodes and different patterns emerging. And as we zoom in, we can see what each of these little dots represents. So, for example, this is one of my videos, $10,000 agentic workflows. We can see it's got some tags. It's got the video link. It's got the raw file. And it gives an explanation of what this video is about and what the takeaways are. And the coolest part is I can follow the backlinks to get where I want. There's a backlink for the WAT framework. There's a backlink for Claude Code. There's a backlink for all these different tools I mentioned, like Perplexity, Visual Studio Code, Nano Banana, n8n. It also has techniques like the WAT framework or bypass permissions mode or human review checkpoint. So, as this continues to fill up, we can start to see patterns and relationships between every tool or every skill or every MCP server that I might have talked about in a YouTube video. And I can just query it in a really efficient way now that we have this actual system set up. And the crazy part is I said, "Hey, Claude Code, go grab the transcripts from my recent videos and organize everything." I literally didn't have to do any manual relationship building here. It just figured it all out on its own. And then right here, I have a much smaller one, but this is more of my personal brain. So this is stuff going on in my personal life. This is stuff going on with, you know, UpAI or my YouTube channel or my different businesses and my employees and our quarter 2 initiatives and things like that. This is more of my own second brain. So I've got one second brain here and then I've got one basically YouTube knowledge system, and I could combine these or I could keep them separate, and I can just keep building more knowledge systems and plug them all into other AI agents that need this context.
It's just super cool. So Andre Carpathy just released this little post about LLM knowledge bases and explaining what he's been doing with them. And in just a matter of few days, it got a ton of traction on X. So let's do a quick breakdown and then I'm going to show you
guys how you can get this set up in basically 5 minutes. It's way more simple than you may think. Something
I've been finding very useful recently is using LLM to build personal knowledge bases for various topics of research interest. So there's different stages.
interest. So there's different stages.
The first part is data ingest. He puts
in basically source documents. So he
basically takes a PDF and puts it into Cloud Code and then Cloud Code does the rest. He uses Obsidian as the IDE. So
rest. He uses Obsidian as the IDE. So
this is nothing really too game-changing. Obsidian just lets you
game-changing. Obsidian just lets you visually see your markdown files. But
for example, this Obsidian project right here with all this YouTube transcript stuff that actually lives right here.
This is the exact same thing. Here are
the raw YouTube transcripts. And here's
that wiki that I showed you guys with the different um folders for what Cloud Code did with my YouTube transcripts.
And then there's a Q&A phase where you basically can ask questions about YouTube or about the research, and it can look through the entire wiki in a much more efficient way and give you answers that are super intelligent. He said here, "I thought that I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all documents, and it reads all the important related data fairly easily at this small scale." So right now he's doing about 100 articles and about half a million words. So there's a few other things that we'll cover later, but the TL;DR is you give raw data to Claude Code. It compares it, it organizes it, and then it puts it into the right spots with relationships, and then you can query it about anything. And it can help you identify where there's gaps in that node or in that relationship, and it can go do research and fill in the gaps. All right. So why is this a big deal? Because normal AI chats are ephemeral, meaning the knowledge disappears after the conversation. But this method, using Karpathy's LLM wiki, makes knowledge compound like interest in a bank. People on X are calling it a game changer because it finally makes AI feel like a tireless colleague who actually remembers everything and stays organized. It's also super simple. It will take you five minutes to set up. I'll show you guys. You don't need a fancy vector database, embeddings, or complex infrastructure. It's literally just a folder with markdown files.
That's it. You literally just have a vault up top. So in this example, it's called my wiki. You've got a raw folder where you put all of the stuff, and then you've got a wiki folder, which is what the LLM builds from your raw files. So in here you have all the wiki pages, which it will create, but then you also have an index and you have a log. So for example, in my YouTube transcripts vault, here is the index. You can see that I have all these different tools, which I could obviously click on, and it would take me right to that page. After that, I have all the different techniques: agent teams, sub-agents, permission modes, the WAT framework. And then we've got different concepts: MCP servers, RAG, vibe coding. We've got all these different sources, which are, you know, the YouTube videos. And then when I have people or when I have comparisons, they will be put in here in the index. And then we also have a log, which is the operation history. So in this case, in the YouTube project, the log isn't huge because I only ran one huge batch of the initial 36 YouTube videos. But now every time I have a new one, I say, "Hey, can you go ahead and ingest the new YouTube video into the wiki," and then we'll see every single time we update this. And then, of course, you need your CLAUDE.md to explain how the project works and how to search through things and how to, you know, update things. It's also a big deal from a cost perspective, token efficiency, and long-term value. One X
user turned 383 scattered files and over 100 meeting transcripts into a compact wiki and dropped token usage by 95% when querying with Claude. And obviously, token management and efficiency is a huge conversation right now and always will be. The other thing that's really cool about this is there's not really a GitHub repo you go copy, and there's not a complicated setup. You literally just say, hey Claude Code, read this idea from Andrej Karpathy and implement it. And people on X are now saying this is how 2026 AI agentic software and products will be made: you just give it a high-level idea and it goes and builds it out. And Karpathy even said, "Hey, you know, I left this prompt vague so that you guys can customize it." And I'll show you the ways in my two different vaults right now that it changed things a little bit based on the context and understanding of what the project is actually for.
Okay, so this was the original tweet I just showed you guys, and then he followed up and said, "Hey, this one went viral. So here is the idea in a gist format." So if you open this up, this is basically just another explanation of the core idea: how this works, why the architecture, the indexing, all this kind of stuff. And by the way, this is the part where he says, "Hey, this is left vague so that you can hack it and customize it to your own project." So we're going to come right back to this in a sec, but the first prerequisite (it's not strictly necessary, but I like to have a nice little front end to see the relationships) is we're going to go to Obsidian and download it. So, if you just go to obsidian.md, you can see this is a completely free tool, and you're going to go ahead and download it for your operating system, then open up the wizard and open up the app. So, when you open up the app, it'll look like this. And what we're going to do here is create a new vault. So, down here, you can see I have Herk Brain and I have YouTube transcripts. I'll just make it a little bigger. I'm going to go to manage vaults. I'm going to create a new one. And now, we just have to give this a name. So, I'm just going to call this one demo vault, and you're going to choose a location where you want to put this. So, I'm just throwing this on my desktop, and I'm going to go ahead and create this vault. Then, what you're going to do is go to wherever you like to run Claude Code. So, in this case, I'm doing it in VS Code. And I open up that folder. So, demo vault. We get an .obsidian folder and then we get a welcome.md.
So, I'm going to open up Claude. I'm going to do it in my terminal. I'm going to run Claude. And lately, I've been liking using my terminal for Claude better. I still do it inside of VS Code, but I use the terminal because I like to see the status line and I have, you know, a little bit more functionality. So, anyways, now that we have Claude Code open, here's what we're going to do. We're going to go back over to the LLM wiki gist that we got from Andrej Karpathy. We're going to copy all of this, go back into Claude Code, and just paste it in there. So, that is the prompt from Karpathy that's going to build out everything we need. And then before we send that off, we're dropping this in, which you guys can screenshot and then just throw into yours. I'm saying: you are now my LLM wiki agent. Implement this exact idea file as my complete second brain. Guide me step by step. Create the CLAUDE.md schema, and so on. So this is just telling it what it needs to do with this idea that we just got from Karpathy. So anyways, on the right we have Claude Code running, and on the left we have our Obsidian vault, and you can see it just created those two folders. So it created the raw and it created the wiki, as you can see.
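There's no magic in what gets created here; it's just folders and markdown files on disk. As a rough sketch of the skeleton described above (the folder names match the video's layout, but the function and the CLAUDE.md wording are illustrative, not Karpathy's exact setup):

```python
from pathlib import Path

def scaffold_vault(root: str) -> Path:
    """Create a bare LLM-wiki skeleton: raw/ for sources, wiki/ for generated pages."""
    vault = Path(root)
    (vault / "raw").mkdir(parents=True, exist_ok=True)
    (vault / "wiki").mkdir(parents=True, exist_ok=True)
    # The index and log start empty; the LLM maintains them as sources are ingested.
    (vault / "wiki" / "index.md").touch()
    (vault / "wiki" / "log.md").touch()
    # CLAUDE.md tells the agent how the vault is organized and how to update it.
    (vault / "CLAUDE.md").write_text(
        "# LLM wiki\n\nSources go in raw/. Organized pages, index.md, and log.md "
        "live in wiki/. Keep the index current after every ingest.\n",
        encoding="utf-8",
    )
    return vault

scaffold_vault("demo-vault")
```

In practice you don't even need this script, since pasting the gist into Claude Code makes it build the skeleton itself; the point is just that the whole "infrastructure" is a handful of plain files.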
Now, by default, it threw in four folders: analysis, concepts, entities, and sources. Once we start to populate stuff, we can talk to it to see if that's actually the way we want to do it or not. Because interestingly, in my personal second brain, the wiki is literally just markdown files. There's no structure to it. And in some cases, that's good. Karpathy actually said, "Sometimes I like to keep it really simple and really flat," which means no subfolders and not a bunch of over-organizing. But then you guys did see that in my YouTube transcript one, there were different subfolders, and I think that in that case it actually makes more sense. But you can see that it went ahead and created a CLAUDE.md. It created an index and a log and then a few different folders in our wiki. But now it's saying, "Hey, let's go ahead and try it out. Drop your first source into the raw folder and tell me to ingest it." Okay, so I'm at this website called AI 2027. If you guys haven't read this before, it's kind of an interesting read. So go check it out.
And now let's say I want to get this into my vault. What I could do is just copy the whole page, right? But it might come through a little weird. Or we can just get an Obsidian extension which lets us take articles right from the web and put them right into our vault. Super easy. So search for this extension called Obsidian Web Clipper. You would go ahead and add this to Chrome. So then, when you're in the article that you want, you basically just click on your extensions, you open up Obsidian Web Clipper, and then you can just chuck it into your vault. And then right here, you're going to want to set this to raw, because this is the actual folder that it's going to put it in. So you can go ahead and click add to Obsidian, open Obsidian, and now you can see in my raw section we have this AI 2027 source with the title and the source. It's not super populated yet, because the LLM in Claude Code is going to do that. So here is our file. I'm going to open up Claude Code and say: awesome, I just threw an article called AI 2027 into the raw. Can you please go ahead and ingest that? It might ask you some questions. It might also be helpful, before you start ingesting stuff, to say: hey, by the way, this project is specifically for my second brain. So, personal things, business things, whatever. Or: this is just a research project. This is where I'm going to chuck you all the articles and all the things that I want to learn about and all the things that I know. So, there's different ways that you can set up the project, as you saw with mine: one for YouTube, one for just a personal second brain. So, now what it's doing is reading through this article, and then it's going to figure out where it should chuck everything into the wiki. It's not just going to create one MD file for this. It might create five or it might create 10. And there's going to be relationships between each of the different sections that it creates. So, it's kind of doing its own method of chunking. Now, one thing I want to call out real quick is that with this extension, if you go here and you open up the options for it, you can see that you can actually change where the clips are dropped by default, which is in the location section. By default, they'll be going to a folder called clippings, but just go ahead and change that to raw.
Okay. So, here it came back with all these questions, right? It said, "Here are my key takeaways from this article," blah blah blah. And now it'll ask you: what do you want to emphasize from this article? What's your focus? How granular do you want to be? What's your plan? So, I'm just going to say I want this to be extremely thorough. This is my passion, looking at where AI is going to go. And this whole project, by the way, that you're setting up in this vault is basically just going to be my place to dump in research about AI. So, help me keep all that organized so that I can query it and keep my thoughts related. So, that's just a quick example of what it might look like for you to give it some more context to continuously build your project. So, I'm going to switch over here to the graph view, because I think it'll be interesting to see as it starts to go through and create those different wiki files. It's going to create all those relationships, and we'll be able to watch it in real time. All right, so it's creating all of the wiki pages now, and you can see that it said it's going to make about 25, because there's so much stuff going on in the original AI 2027 article. Okay, so our first one just popped in here, and there's a second one that just came through, and now you can start to see where you have hubs or where you just have little individual nodes. So this is a major hub. Someone named Eli, someone named Thomas, Daniel, and you can see all the different relationships here with things like AI governance, with things like OpenBrain, superhuman coder.
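Those hubs aren't anything exotic: Obsidian's graph view is essentially counting `[[wikilink]]` references between markdown files. As a sketch of what that computation looks like (assuming the standard `[[Page]]`, `[[Page|alias]]`, and `[[Page#section]]` link syntax; the function name is illustrative):

```python
import re
from collections import Counter
from pathlib import Path

# Capture the page name from [[Page]], [[Page|alias]], or [[Page#section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def link_counts(wiki_dir: str) -> Counter:
    """Count how often each page is linked to across all markdown files."""
    counts = Counter()
    for page in Path(wiki_dir).rglob("*.md"):
        for target in WIKILINK.findall(page.read_text(encoding="utf-8")):
            counts[target.strip()] += 1
    return counts

# Pages with the most inbound links are the hubs you see in graph view:
# top_hubs = link_counts("wiki").most_common(5)
```

This is also why the approach stays tool-agnostic: any script or agent that can read markdown can recover the same relationship graph.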
Okay, so that ingest took about 10 minutes. So sometimes you have to be a little patient with, you know, it reading through everything and organizing everything, but it does a lot of heavy lifting, of course. When I uploaded the 36 YouTube transcripts in batch, it took about 14 minutes. So it kind of just depends, but it created 23 wiki pages. We have the source. We have six people, five organizations, and one AI systems page, different concepts, so technical alignment and geopolitical, and then an analysis. And then it asks some questions about it so that it can help make the relationships and make the structure even better. Now let's open this one up a little deeper and see what it actually did in here. So this is the source with all the main relationships. So as we start to add other articles, we will see other big nodes, and maybe in some cases we'll have relationships between things like compute scaling and different articles that we upload as well. So if I click into the main source, we can see the tags that it got. We can see the authors, and we can click around. So here's a link to OpenAI. Okay, what's OpenAI? Here are its references in AI 2027. Here are some other connections with OpenAI, like the model spec. Okay, we're in model spec. Let's take a look. We can see other things about the model spec. And we could also go to how the LLM psychology model works. So this is just super cool, all the relationships that we get. And once again, all of this stuff that we're looking at was derived from one article and automatically organized and related. So the question now is: what do we do from here? Do we query it inside of this environment?
Do we query it from somewhere else? And that's completely up to the way that you want to use this. So for example, with my YouTube project, I'm probably just going to keep this here, and whenever I want to ask questions about YouTube, or if I want to turn this into like a website, I can just do that from here.
Or if I need to, I can point a different project at this folder. Since everything's here, it can crawl through the wiki, it can read the index, and it knows how this stuff works because you can give it the CLAUDE.md so it understands the project as well. So for example, in this one, which is just my second brain, where I drop in my meeting recordings, my ClickUp channels, summaries, and things like that, this is something that I want to use in my executive assistant. So what I did in my executive assistant here, called Herk 2: if I go to its CLAUDE.md, you can see that we have a wiki path. So whenever you need to read things about me and my business that you don't already have, you go to my Herk Brain vault. You go to that directory and then you read through the wiki. You can read the hot cache, which I'll explain in just a sec. You can read the index. You can read the domain subindex, and then you can also just search through everything here. And I said: don't read from the wiki unless you actually need it. Here are some things that you might do that you don't need to go read the wiki for.
And all of this is my business knowledge. Now, if you guys remember, if you watched my video on setting up an executive assistant, I used to do this with context files inside of that project. And when I changed over to this method, I actually saw a reduction in the tokens that I was calling in this project. So, about the hot cache: I didn't actually have this in my YouTube one. So, if I go to YouTube, you can see there's no hot cache. But if I go to the Herk Brain wiki, you can see there's a hot.md right here. And this is basically just a cache of like 500 words or 500 characters that it saves, which is like: what is the most recent thing that Nate just gave me or that we talked about? In the context of my executive assistant, this is really helpful. You know, it might save me from having to crawl different wiki pages. But in something like the YouTube transcript project, I don't really need a hot cache. So, another thing that I alluded to but didn't really cover was the idea of linting. Karpathy says that he runs some LLM health checks over the wiki to find inconsistent data, impute missing data with web searches, find interesting connections for new article candidates, things like that. So it basically helps you run a lint, you know, every day, every week, whenever you want, which helps make sure that everything stays scalable and structured in the right way. And it might even come back and say, "Hey, I don't fully understand this. Can you give me some more info, or can you grab some more articles that might help me out here?" So now the final question about this that I wanted to cover is: does this kill semantic search RAG? And the answer is no, but kind of yes. And it all depends on the goal of the project and how much context you have. So here's a really quick chart that I had Claude Code make. I was in my Herk Brain vault, where I dumped in a bunch of information about Karpathy's LLM knowledge base idea, and I just said, "Hey, can you please explain the Karpathy knowledge base as simply as possible, keep it super concise, and compare it to typical semantic search RAG?" So, it framed Karpathy's idea as: instead of a database, you just give the LLM well-organized markdown files, and it compares that here to actual semantic search RAG. So, actually, I might as well just read it off from here. It finds things by reading indexes and following links rather than using similarity search. So we're getting a deeper understanding of relationships, because they're links, rather than just saying, "Hey, these chunks seem similar." As far as infrastructure, it is literally just markdown. So like I said, you don't even need Obsidian. You just need these markdown files. Whereas with semantic search, you need an embedding model, a vector database, and a chunking pipeline. The cost over here is basically free; your only cost is going to be tokens. Whereas over there, you might have ongoing compute and storage. And for maintenance, you just run a lint. You clean up things. You add more articles. You give it more context, rather than having to re-embed when things change. But right now, the weakness of course with the LLM knowledge wiki is that it doesn't scale huge across enterprises, right? Because it's just a bunch of files, and that is where the cost will probably get more and more expensive compared to something like standard semantic search or a knowledge graph or LightRAG or whatever other tool is out there. So here you can see: if you have hundreds of pages with good indexes, you're fine with a wiki graph. But if you're getting up to millions of documents, then you're going to want to do more of a traditional RAG pipeline, at least for now, with how the current models are and everything we know right now in April 2026. So that is going to do it for today. I hope you guys learned something new or enjoyed
the video. And if you did, please give it a like. It helps me out a ton. Now, after this video, if you're interested in learning how you can create your own sort of executive assistant and then plug it into this Obsidian vault, then definitely check out this video up here, where I go over how I built my executive assistant and the way that you should be thinking about it. So hopefully I'll see you guys over there, but if not, I'll see you in the next one.