Karpathy's Skill Just Fixed Claude Code's Biggest Problem
By Eric Tech
Summary
Topics Covered
- Four Karpathy principles stop LLM coding failures
- Embed constraints into the model's soul
- Pair every constraint with a specific skill path
Full Transcript
In this video, we're going to go over the Andrej Karpathy skills repository, which has over 100,000 stars on GitHub. This skill is derived from a post that Andrej Karpathy wrote. If you don't know who Andrej Karpathy is, he was previously the director of AI at Tesla and a member of the founding team at OpenAI. That post currently has over 7 million views on X. Essentially, what this skill does is address the problems that Andrej Karpathy describes on X.
The first problem is that the model makes wrong assumptions. It doesn't ask clarifying questions. When you give it a problem, it just acts on it. The second problem is that the model often overcomplicates things: something that could be written in 100 lines often gets written in 1,000. And the third is that large language models often make changes they're not supposed to, without a clear understanding of the full side effects. So in this video, we're going to take a look at what the skill is, how it works, and how it differs from the other skills we've used in the past. With that being said, if you're interested, let's get into the video.
Now, before we continue, I recently launched our Skool community, where I help you master AI agents, automations, and so much more.
And that's all coming from someone who used to work as a senior AI software engineer at companies like Amazon and Microsoft. In this community, you're going to get over 100 videos and materials, like templates and workflows that I personally built and sold over 100 times. On top of that, you're also going to get access to our weekly live calls. Just to give you an idea, this week we're running a Claude Code masterclass where we're going to dive into how to improve Claude Code's accuracy and use it to build applications. Plus, you're also going to get full community support, where you get a chance to ask questions and get direct answers back. So, if you're ready to level up, make sure you jump right in, and I'll see you in the community.
All right. So, to get started, let's take a look at what this skill is trying to offer. Right here you can see this skill offers four principles that we can use in our project, which are basically written inside our CLAUDE.md file. The first principle is think before coding. Large language models, like I said, make wrong assumptions and don't clarify things. Having this principle forces the AI to think before it executes. The second is simplicity. Things that can be written in 100 lines of code should never be written in 500 or 1,000 lines, right?
It should never overcomplicate things. The success criterion here is that if a senior engineer would say this is overcomplicated, then we should definitely simplify. That's exactly what this principle is trying to solve. The third one is surgical changes. Let's say we're making changes: large language models should never touch code that's not related to the instruction we provide. That's what this principle enforces. And the last one is goal-driven execution. We need to have a clear goal for exactly what the large language model should achieve before it starts executing. It's similar to the first one, but here we need clear success criteria for the expected behavior of the end result. Those are the four principles the skill introduces, and they're all compacted into a single CLAUDE.md file that we can add to a project to help keep the large language model from hallucinating when writing code.
Now, to put this into practice, here you can see it tells you exactly how to install it. The first option is to install it through Claude Code plugins: we simply add the skill to our marketplace and install it. The second option, if you want to install it at the project level rather than globally, is to run this curl command and add the file to your new project. But if you have an existing project, you can run this other command, and it's going to write the rules into the CLAUDE.md file of your existing project.
In my case, I do have an existing project called bookzero.ai, where I help businesses manage receipts and transactions using AI. I want to install this on that project and see how it works. To install it, all I have to do is copy the command for existing projects. I'm going to come over to my project, open a new terminal, and paste the command there. Essentially, what this command does is modify my CLAUDE.md file by adding those four rules to it. So I'm going to press Enter, and you can see that it modifies my CLAUDE.md file. Now if I open the Git diff, you can see the changes that have been applied.
Most of you probably already have a CLAUDE.md file in your existing project, and installing this might introduce contradictions or conflicts with your existing rules. So what I highly recommend is that once you've pasted those four principles, you tell Claude Code to review your CLAUDE.md file, check whether anything conflicts with or doesn't follow what you already have there, and merge the conflicts. You can see here what it recommends. After pasting from the original Karpathy skills repository, here are some problems it found. For example, we have a duplicate H1 heading, because the CLAUDE.md file already has one. And here you can see there's some meta framing that tells a human reader what the doc does, but the CLAUDE.md file doesn't really need that. It needs clear instructions on exactly what to do.
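The duplicate-heading cleanup described here can also be sketched as a small script. This is a minimal illustration, not what Claude Code or the skills repository actually runs: the function name `merge_guidelines` and the sample heading text are hypothetical, and the sketch only handles heading levels and repeated pastes, not conflicting rules. It also does not skip headings that appear inside fenced code blocks.

```python
import re


def merge_guidelines(existing: str, pasted: str) -> str:
    """Append `pasted` to `existing`, keeping a single H1 in the result.

    Hypothetical helper: if the existing file already has an H1, every
    heading in the pasted block is demoted by one level (H1 -> H2, ...).
    """
    has_h1 = any(re.match(r"# \S", line) for line in existing.splitlines())
    merged = []
    for line in pasted.strip().splitlines():
        # Demote markdown headings so the pasted title does not add a second H1.
        if has_h1 and re.match(r"#{1,5} \S", line):
            line = "#" + line
        merged.append(line)
    block = "\n".join(merged)
    if block in existing:
        return existing  # already merged, so running it twice changes nothing
    return existing.rstrip("\n") + "\n\n" + block + "\n"


# Illustrative content only; the real file text will differ.
existing = "# bookzero.ai\n\nUse pnpm.\n"
pasted = "# Guidelines\n\n## 1. Think Before Coding\nAsk when unsure.\n"
print(merge_guidelines(existing, pasted))
```

Checking for contradictory rules still needs a reader, human or model, which is why the review step is handed to Claude Code in the video.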
We've also removed some redundant material that's not really relevant to our CLAUDE.md file. You can see Claude Code removed that and made it more concise. So again: same rules, less to read, so that our CLAUDE.md file stays shorter. Now if I close this, you can see exactly what it modified. We have our behavior guardrails as an H2 heading, and under it we have think before coding and simplicity first, as well as surgical changes and goal-driven execution. All four principles are still here, and there are no conflicts between what we have above and what we have here. So that's exactly how this works.
All right. Now that you know how to install it, let's take a look at how it differs from the skills we've covered on this channel, like gstack, Superpowers, GSD, and all those spec-driven development frameworks we've introduced. The clear difference is this: GSD, Superpowers, and gstack are all skills. They only run when we trigger them, right? A CLAUDE.md file is different in that we enforce those rules inside the CLAUDE.md file.
It's kind of like a personality embedded into the model's brain. Every time it does something, we don't have to mention the rules. It knows them because they're embedded into its personality, embedded into its soul, so it's going to follow these four constraints every time. When we ask it to do anything, like helping us write a blog post, generate images, or write code, it's going to do it exactly the way we described in our CLAUDE.md file. That's the clear difference between the two. Now, other than their types, let's talk about the functionality, right?
Because these four principles actually cover a lot of what we've discussed previously on this channel for skills like gstack, Superpowers, and GSD. So how are they different in terms of functionality? Here I asked the AI exactly what the difference is. It pointed out that rule number two and rule number three are unique compared to Superpowers and gstack, because neither of them says anything about not adding extra stuff or staying in your lane. They don't state these things as constraints. They both teach how to work carefully, but not how much to do. Our guardrails in the CLAUDE.md file fill that gap by spelling these things out. Rule number one and rule number four, think before coding and goal-driven execution, are similar to what we have in Superpowers and gstack. That's exactly what spec-driven development does: create the plan before executing. What Superpowers and gstack do differently is that they're not just a bunch of text inside a CLAUDE.md file. They're a framework where we first write our spec, then create a to-do list from the spec, and then create actions from the to-do list. Superpowers and gstack provide a framework for the large language model to follow, but it isn't built into the model's brain. Still, most of the ideas are very similar, right?
The same rules and the same concepts are very similar between the two. That's why my recommendation, my workflow, is to combine the two: not just showing the model the constraints, but also giving it the path to exactly which skill to trigger. For example, for each of the principles Karpathy mentions, like think before coding, don't assume, and surface the trade-offs before writing code, we give the model the path to exactly which skill to trigger. Just a CLAUDE.md file is not going to cut it. We're going to direct the model to trigger the Superpowers skills. For example, the Superpowers brainstorming skill for adding new features. If it's a bug, an error, or a test failure, we have it trigger the Superpowers systematic debugging skill. If it's a multi-step task, touching something like three files, we have it trigger the writing plans skill or the GSD planning phase skill, basically to execute it really fast: create a to-do list and execute it, right?
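The routing just described lives in CLAUDE.md as plain instructions that the model reads. It is not code that gets executed. Writing the decision rule out as a function can still make it easier to see. The sketch below is only an illustration under assumptions: the `Task` fields, the skill identifiers, the three-file threshold as a hard cutoff, and the order in which the checks run are all hypothetical choices, not something the plugins define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    kind: str           # e.g. "feature", "bug", "error", "test_failure"
    files_touched: int  # rough estimate of how many files the change spans


def pick_skill(task: Task) -> str:
    """Return a hypothetical skill identifier for a task.

    The names below are placeholders; use whatever names your installed
    plugins actually expose.
    """
    # Bugs, errors, and failing tests go to debugging first (assumed precedence).
    if task.kind in {"bug", "error", "test_failure"}:
        return "superpowers:systematic-debugging"
    # Multi-step work (about three files or more) gets a written plan.
    if task.files_touched >= 3:
        return "superpowers:writing-plans"
    # Small new features start with brainstorming.
    if task.kind == "feature":
        return "superpowers:brainstorming"
    return "none"


print(pick_skill(Task(kind="feature", files_touched=1)))
print(pick_skill(Task(kind="test_failure", files_touched=4)))
```

In the CLAUDE.md file itself, each branch becomes one short sentence next to the principle it supports.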
And for simplicity first, there are actually a bunch of skills for simplicity, like making the code simpler: the minimum code that solves the problem, nothing speculative. So you can see, before committing, polish. Whenever we're about to commit, let's trigger the simplify skill and simplify everything. And honestly, don't even stop there. You can also include gstack skills, like the autoplan skill, where you have different rules to add for think before coding. The possibilities here are endless, right?
You can actually add a lot of things here. For example, for surgical changes, you can add something like using different worktrees to split the work into separate environments. And for goal-driven execution, making sure we set a clear goal and success criteria: if we're implementing a feature or a bug fix, let's trigger the test-driven development skill. If we're executing a written plan, let's use the executing plans skill, right? There are actually a lot of skills that do that.
For example, if we want to do a security review, there's a security review skill. There are also data audits, and there's also QA. So we give the model paths to the skills we have, making sure it's not only following the constraints but also calling the right skills to act on them, right? So you can see the possibilities are endless: you can pick the skills you want and add them to your CLAUDE.md file, and Claude knows exactly which skill to trigger based on your preference. Okay, so that's pretty much it for this video. Honestly, I don't want my CLAUDE.md file to be too long, so I wanted to keep it short and just keep the skills that I really like inside my CLAUDE.md file.
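One way to keep that section short is to generate it from a small mapping and fail loudly when it grows past a budget. The following is a rough sketch under assumptions: the function name `render_guardrails`, the bullet wording, the skill names in the example, and the default budget of 40 lines are all hypothetical. Only the "Behavior guardrails" H2 heading and the four principle names come from the video.

```python
def render_guardrails(routes: dict[str, list[str]], max_lines: int = 40) -> str:
    """Render a compact CLAUDE.md section that pairs each principle with skills.

    `routes` maps a principle name to the skills to trigger for it.
    Raises ValueError if the section would exceed `max_lines` lines.
    """
    lines = ["## Behavior guardrails"]
    for principle, skills in routes.items():
        # One bullet per principle keeps the section easy to scan.
        lines.append(f"- {principle}: trigger " + ", ".join(skills))
    if len(lines) > max_lines:
        raise ValueError(f"section has {len(lines)} lines, budget is {max_lines}")
    return "\n".join(lines) + "\n"


# Placeholder skill names; substitute the ones you actually have installed.
routes = {
    "Think before coding": ["brainstorming", "writing-plans"],
    "Simplicity first": ["simplify"],
    "Surgical changes": ["git-worktrees"],
    "Goal-driven execution": ["test-driven-development"],
}
print(render_guardrails(routes))
```

The budget check is the design choice that matters here: adding another skill becomes a deliberate trade against the length of the file.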
All right, so you can see that's the end of our skills list. And of course, if you're interested in learning more about spec-driven development models, be sure to check out the playlist in the description below, where I show you all the spec-driven development models I talk about on this channel and how you can make your large language model highly accurate when performing tasks on your project. So, if you're interested in that, make sure to check out the link in the description below. With that being said, if you found value in this video, please make sure to like it and consider subscribing for more content like this. I'll see you in the next video.