Claude Skills Explained in 23 Minutes
By Shaw Talebi
Summary
Topics Covered
- Clearer Instructions Always Yield Better LLM Results
- Skills Enable Progressive Context Disclosure
- Skills vs MCP: Claude-Specific Efficiency
- Subagents Isolate Context for Focused Tasks
Full Transcript
Hey everyone, I'm Shaw. In this video, I'll explain Claude's new skills feature. I'll start by discussing what this is and how it works, highlight how it fits together with existing concepts like MCP and subagents, and then finally walk through a concrete example of using skills with Claude Code.

So, what are Claude skills? These are just reusable instructions Claude can access when needed. So we have Claude here, whether you're using it on the web, on your desktop, or through Claude Code, and skills are just bundled special instructions that Claude can automatically access whenever they're relevant to the conversation. You're probably wondering, okay, why does this matter? Why should we care about skills? It all boils down to a single fact about large language models: the clearer your instructions are, the better your results are going to be.
What this typically looks like in practice is one of two ways. One is that every time you want to get Claude or any other LLM to do something, you manually write out the instructions or go back and forth with the LLM until it has a clear idea of what you're trying to get it to do. Another, slightly better way is that you have these instructions already written out and saved in a folder, a Notion template, a Google Doc, or whatever it is, and you simply grab whatever instruction you need at that moment, copy it, and paste it into whatever large language model you're using. While these work well enough most of the time, it can definitely be a lot of work to write instructions from scratch, or even to go find, copy, and paste instructions into your favorite LLM every time you want it to do something special.

This is where Claude skills are helpful. They make the process much more streamlined because you write these instructions once. For example, you have some instructions that tell Claude how you want it to explain technical concepts to you. Or you have a whole workflow you want Claude to go through to validate SaaS ideas. Or you have a set of design principles and tools that you want Claude to use when evaluating your front-end designs. You can just write out these instructions once and save them as a skill. And the really cool part is that you don't have to search across a Notion document or whatever to find these instructions when you want to give them to Claude. Claude is smart enough to look at all the skills it has access to and pick the one that is most relevant to the use case.
So, we talked about what skills are and why we should care about them. Now let's talk about how they work. The implementation of skills is actually super straightforward. Fundamentally, it's just a folder with a file in it. The structure looks like this: you'll have a folder named whatever you want your skill to be called. It could be AI tutor, SaaS idea validator, front-end design audit, whatever it is. And within that folder, you'll have a file called skill.md.

The skill file has two main components. The first is its metadata, which consists of the skill's name and a short description. The second is the body. This looks like a typical prompt in markdown format, but it's just those specialized instructions you want Claude to be able to access. One cool thing about skills is that having these two components, the metadata that gives a high-level description of what the skill does and the body that holds the instructions themselves, enables a clever way of giving Claude these skills. Instead of dumping the metadata and the body into the context window at the start of our conversation, Claude has its system prompt, and the way it gets access to our skills is that just the skill metadata is added to the context window. This is helpful because the metadata is much shorter than the body: there's a character limit of, I think, 64 characters for the name and 1,024 characters for the description, so the metadata stays pretty short, while the body can be something like 5,000 tokens. Then, if the skill is relevant to the conversation, or if the user asks Claude to use a specific skill, Claude reads in the body of the skill file.
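To make this concrete, here is a minimal sketch of what a skill.md could look like, using the SaaS idea validator example. The frontmatter fields follow the name-plus-description structure just described; the specific content is a hypothetical illustration, not an official template.

```markdown
---
name: saas-idea-validator
description: Use when the user wants to validate or stress-test a SaaS business idea. Covers market sizing, competition, and monetization checks.
---

# SaaS Idea Validator

When the user shares a SaaS idea, work through these steps:

1. Restate the idea in one sentence to confirm understanding.
2. Identify the target customer and the problem being solved.
3. List two or three existing competitors and how this idea differs.
4. Flag the riskiest assumption and suggest a cheap way to test it.
```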
What this unlocks is that you no longer have to be so careful about which skills you give to Claude and when, for fear of overrunning its context window with irrelevant things. Going back to those three examples from before, the AI tutor, the SaaS idea validator, and the front-end design audit: these are completely different skills that will be used in completely different contexts. If I just dump all the specialized instructions into the context window at the outset of every conversation, I'm wasting a lot of tokens and a lot of Claude's attention on things that aren't relevant to whatever I'm working on in that instance. But with skills, because the metadata is so lightweight, we can have hundreds, if not thousands, of skills with a relatively small impact on the context window. And then, if a particular skill is relevant, Claude can call it in as needed.
So while this is the simplest skill you can implement, just a single skill.md file with metadata and a body, we can actually do more with the skills feature. We can have multiple files in the skill folder. These additional files are more instructions that Claude can read as needed. The way this context is managed is: let's say Claude decides that a particular skill is relevant, so it reads the skill body. Within the skill body, there's a reference to a markdown file called how-to-x.md. Claude can go one step further and read the contents of how-to-x.md if it feels that's relevant to whatever it's working on. To make this more concrete: if this is the SaaS idea validator and we have specialized instructions for validating B2C ideas, Claude can call in those specialized instructions when the user has a business idea that is direct-to-consumer.

But wait, there's more. We're not just limited to adding additional files to the skill folder; we can also add additional folders. Say we have a lot of specialized instructions, how-tos for not just X but also for Y and Z. We can organize them in a folder. So we can have additional folders with some logical organization of the information we want Claude to be able to access whenever it needs it, as in the sketch below. This works in exactly the same way: when Claude calls one of our skills, it reads the body, and within the body we reference the how-tos folder. Claude can then read those files one at a time, or not at all, depending on the context and the conversation.
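As a sketch, a skill folder organized this way might look like the following; the file names are hypothetical placeholders following the pattern just described.

```
saas-idea-validator/
├── skill.md               # metadata + core instructions (always the entry point)
└── how-tos/
    ├── validate-b2c.md    # loaded only for direct-to-consumer ideas
    ├── validate-b2b.md    # loaded only for business-to-business ideas
    └── pricing-models.md  # loaded only when pricing comes up
```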
But we're not just limited to giving Claude additional instructions; we can also give Claude specialized tools. This is because Claude lives in a virtual environment where it has a bash shell with Python and Node.js installed. That means Claude can run terminal commands, Python scripts, and JavaScript scripts. So if we have a folder called scripts, and in that folder we have a Python file, Claude can call that Python script to perform a specific function. This works in the exact same way as giving Claude additional instructions. We have our skill body, which Claude reads any time the skill is relevant, and within the skill body we reference the specialized tool. Then, because Claude has access to the bash shell, it can just run a command like python scripts/foo.py to execute that function.
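To make the pattern tangible, here's a minimal sketch of the kind of script a skill might ship. This particular tool, a WCAG contrast checker that a front-end design audit skill could reference, is my own hypothetical example, not one from the video; the point is simply a self-contained script that the skill body tells Claude how to invoke.

```python
# scripts/check_contrast.py -- hypothetical tool a design-audit skill might ship.
# The skill body would tell Claude to run, e.g.:
#   python scripts/check_contrast.py "#1a1a2e" "#e0e0e0"
import sys

def relative_luminance(hex_color: str) -> float:
    """WCAG 2.x relative luminance of an sRGB color like '#rrggbb'."""
    hex_color = hex_color.lstrip("#")
    srgb = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
              for c in srgb]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: str, bg: str) -> float:
    """Contrast ratio between two colors, from 1:1 up to 21:1."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

if __name__ == "__main__":
    ratio = contrast_ratio(sys.argv[1], sys.argv[2])
    print(f"Contrast ratio: {ratio:.2f}:1 (WCAG AA normal text needs >= 4.5:1)")
```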
So skills give us a very simple way to give Claude specialized instructions and tools for particular workflows or tasks. And they don't do this by blindly dumping all the skills and tools into the context window at the start of the conversation; it happens one step at a time. This is a key feature of skills and is what Anthropic calls progressive disclosure. That's just a fancy way of saying that skills give Claude just enough context for the next step. Instead of dumping everything into the context window up front, content is only injected when it's needed. There are actually three levels to this, which we've already touched on. The first level is the metadata living in the skill.md file. It enters the context window right when you boot up Claude Code, or Claude on the desktop, or wherever you're using Claude, and it's limited to about 100 tokens. The next level is the body of the skill.md file. It enters the context window when Claude invokes that particular skill, and it can be up to 5,000 tokens. But that isn't the only way to give Claude specialized instructions; there's a third level, all the files and folders within the skill directory, which Claude can access as needed. For this content, there's practically no limit to how much you can put in the skill folder. So ultimately, skills give us better context management without sacrificing capabilities.
Before getting to the concrete example with Claude Code, I want to address some confusion people have been having between skills and MCP, and talk through their differences and when it makes sense to use each. This confusion is warranted, because skills and MCP are not mutually exclusive and actually have a lot of functional overlap. Skills, as we've seen, provide a way to give an LLM tools and instructions, and MCP does the same thing: it also gives us a way to provide resources, prompts, and tools to large language models. But one of the key differences is that skills only work with Claude, while MCP is a universal open standard: MCP lets any LLM talk to any application, while skills work only with Claude.

Another key difference is progressive disclosure. As discussed in the previous slide, with skills Claude only gets the context it needs for the next step, while with MCP the server dumps all the tools and all the tool metadata into the context window at the outset. Just for context, Notion's MCP server exposes a wide range of tools, and each tool carries a lot of metadata that is basically self-documenting and self-describing so that the LLM knows when it's appropriate to use a specific tool. All of that gets dumped into the context window whether we're using Notion or not, and it's on the order of 20,000 tokens. If instead we create a skill for Notion, at the outset we only dump the name and description of our Notion skill, on the order of 100 tokens. That's two orders of magnitude of context saved because of how skills work.
So even though skills and MCP do a lot of the same things, at this point they have slightly different main use cases. The main use case for skills is teaching Claude how to do things with its available tools. Claude comes out of the box with a lot of great tools, and maybe you have a few MCP servers you like that also provide great tools; a great use case for skills is teaching Claude how to better utilize those tools. On the other hand, the main use case for MCP is giving Claude access to complex tools and integrations. While we could build a Notion skill ourselves, that would probably be a lot of work, because we'd need to dive into Notion's API, understand how it works, and build custom tooling around it. In cases like that, where building the tool set or the integration with your favorite software is complicated, just using an off-the-shelf MCP server is still the way to go.
Another thing worth mentioning is how skills and MCP fit together with the subagents feature in Claude Code. Claude Code is a popular coding agent developed by Anthropic, and it has this feature of subagents, which are specialized agents for specific workflows with their own context window. Having their own context window is really the key feature of subagents, as we'll discuss in a bit. Compare that to skills, which are specialized instructions and tools for specific workflows, and MCP, which provides specialized tool sets for specific workflows: again, there's a lot of functional overlap between these things. So I want to walk through how all of them fit together in a typical interaction with Claude Code.

We'll start with our main Claude Code agent. It has default tools, it has its system prompt, and it has the UI we interact with, and all of this lives in a context window: the tool calls, the system prompt, our messages, and Claude's responses. At startup, the metadata for all the skills we have preloaded into Claude Code is injected into the context window, and if we have any MCP servers, their tools are injected too. So the main agent is basically a superset of everything in our session.

But sometimes we want Claude to do something without bloating the main context window with tokens for a specific task. For example, if we're building some kind of web application and want to research a specific library, say FastHTML, we don't necessarily need to do that with the main Claude Code agent. What we can do instead is call a subagent that has its own context window, also sees all the skills preloaded into our Claude Code, but has a specialized MCP server. Let's say this is the Context7 MCP server, which lets Claude fetch up-to-date documentation. This subagent can go off, research the FastHTML library, understand which libraries are compatible with it, and put together a whole specification sheet for the tech stack we should use for our web application. All of that lives in a separate context window, and the subagent just returns the result to our main coding agent.

So the main value of subagents is better context management: we don't have to run everything in the same context window; we can spin off a new one with the subagent. The way skills fit in is that the subagent has access to all the skills preloaded into our Claude Code account. And finally, we can give it specific MCP servers, so we're not overloading it with every MCP server in our Claude Code account.
So now I want to walk through a concrete example of using Claude skills. Here I'm going to create an AI tutor that explains technical concepts in plain English. The way this will work is we'll have Claude Code, and then we'll have the specialized skill: our skill.md file, a specialized file that describes how to do research, and finally a custom tool that lets it pull transcripts from specific YouTube videos. All the files and code for this are freely available on GitHub, at this link here and in the description below.

Starting with the skill.md file: we have our folder called AI tutor and our skill.md file, which looks something like this. We have the metadata up top and then the instructions. The name is AI tutor, and the short description is: use when user asks to explain, break down, or help understand technical concepts, AI/ML, or other technical topics; makes complex ideas accessible through plain English and narrative structure. Then we have the body, which is just a typical prompt, written with the usual prompt engineering tricks, in markdown format.
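Piecing together what's read out on screen, the top of that skill.md plausibly looks something like the sketch below; the exact file lives in the linked GitHub repo.

```markdown
---
name: ai-tutor
description: Use when user asks to explain, break down, or help understand technical concepts, AI/ML, or other technical topics. Makes complex ideas accessible through plain English and narrative structure.
---

# AI Tutor

Before responding, think hard: explore multiple narratives, evaluate the
target audience, choose the best structure, and plan your examples.

If a concept is unfamiliar or requires research, load
research-methodology.md for detailed guidance.
```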
The next thing to look at is the special instructions for doing research. I have a new file called research-methodology.md, and the way Claude knows this file exists is that the skill.md file has this snippet: "If a concept is unfamiliar or requires research, load research-methodology.md for detailed guidance." So when that situation comes up, Claude can open up research-methodology.md, which is just a regular markdown file, with no metadata. It has its title, then "use this guide when you encounter concepts outside your reliable knowledge or when explaining cutting-edge developments," then a "when to research" section, and then research guidelines.
Finally, we'll add the special tool that lets Claude grab the transcript of specific YouTube videos it comes across in its research. Here, instead of referencing the tool in the skill body, it's actually referenced in research-methodology.md, which shows Claude how to use it: Claude just runs a command in the terminal, which executes a Python script that takes in a video's URL or ID and grabs the transcript for it.
So that's everything. Now let's look at a demo. I've got Cursor opened up with all the contents of the skill folder. The skill folder actually lives in your home directory, so we can take a look at it here: if I do pwd, we'll see that for me the .claude folder is in my home directory, at /Users/shaw/.claude/skills, followed by the name of our skill. So this is the AI tutor folder, and this is the skill file. We have the metadata here and the full body, so the instructions on how to explain technical ideas in simple terms, which basically summarize all my communication tips. You can see this is 130 lines or so. We also have the research methodology, which is another 200 lines or so. This is why it's helpful to keep the research methodology separate: if it lived in the main skill file, the instructions would be very long, and not every question requires research. A question might just be about a well-known thing, like a neural network, how to do matrix multiplication, or how gradient descent works. These are well-known technical concepts, not things Claude needs to research, because they're already in its pre-training; it already understands them. But for new ideas, it's important that Claude knows how to do research and when to do it.
And finally, we have our scripts folder, where we have a Python script that extracts YouTube transcripts. It has some helper functions in it, but ultimately the way it works is that, from the command line, Claude can run uv run on the Python script and pass in the video ID, the video link, or whatever it might be. If you want to read through the code or the instructions, they're all available on GitHub. One final piece: I'm running this locally, and to get the YouTube transcripts I'm using a Python library called youtube-transcript-api. For installing dependencies, I'm using uv. If you're not familiar with uv, I talked about it in a previous video; it's just a really lightweight and fast Python package and project manager, and I use it for all of my projects. This ensures that Claude doesn't run into any dependency problems when trying to use this get-YouTube-transcript tool.
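Here's a sketch of what such a script could look like; the real one is in the linked repo. The inline metadata block at the top is uv's standard mechanism (PEP 723) for declaring a standalone script's dependencies, and the use of the pre-1.0 youtube-transcript-api interface is an assumption on my part.

```python
# /// script
# dependencies = ["youtube-transcript-api<1.0"]
# ///
"""Fetch a YouTube video's transcript.

Hypothetical sketch of the tool described in the video.
Run with: uv run get_transcript.py <video-id-or-url>
"""
import sys

# Pre-1.0 API; newer releases moved to an instance-based
# YouTubeTranscriptApi().fetch(video_id) interface.
from youtube_transcript_api import YouTubeTranscriptApi

def extract_video_id(arg: str) -> str:
    """Accept either a bare video ID or a full watch URL."""
    if "v=" in arg:
        return arg.split("v=")[1].split("&")[0]
    return arg.rstrip("/").rsplit("/", 1)[-1]

if __name__ == "__main__":
    video_id = extract_video_id(sys.argv[1])
    segments = YouTubeTranscriptApi.get_transcript(video_id)
    print(" ".join(segment["text"] for segment in segments))
```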
To see this in action, we'll open up the terminal, clear it out, and type claude, and I'll just say: explain reinforcement learning in simple terms. Let's see if Claude actually uses the skill. Okay, so Claude knew to use the skill on its own; we didn't have to specifically tell it to. So I'll say yes, proceed. Now it's thinking, and the reason it's thinking is that that's part of the instructions. If we go back to the skill.md file, one of the big things about explaining things is that it's important to explore different possible explanations before giving your explanation. I don't want Claude to just jump in and start explaining, so that's why I wrote: before responding, think hard, explore multiple narratives, evaluate the target audience, choose the best structure, plan your examples, take time to think through these options.

So let's see what it came up with. "Reinforcement learning: learning by trial and error." Okay, I like that. "Reinforcement learning is teaching a program to make good decisions by letting it try things, see what happens, and learn from the results." Then "how it's different." This is good; it's using the narrative structure I gave it, which is: status quo, how things are today, the baseline; then the problem with the baseline. Traditional programming: you write explicit rules. That's the status quo. What's the problem with that? Some tasks are impossible to capture in rules. Okay, that's a big problem. And reinforcement learning is the solution: instead of programming rules, you define a goal and let the program figure out how to achieve it through experience. Perfect. Then it gives a concrete example, which is good, and then why it matters.
But I wanted it to do research, so let's say: explain GRPO. I don't know whether it knows this one, but I'm going to explicitly say: do research. Okay, so it read research-methodology.md, which is what I wanted, and now it's going to use web search. Note that I didn't have to implement a web search tool, because that comes out of the box with Claude Code, and with Claude on the desktop or wherever you're using it. By the way, the way I came up with the skill, meaning all the skill files, the research file, and the Python script, is that for the most part I just had Claude write it for me. I was using Claude in the browser, telling it what I wanted, and I went back and forth, giving it maybe six pieces of feedback until it got to a pretty good spot.

Okay, let's see: "GRPO is a smarter way to train language models by having them compete against their own alternative responses rather than relying on absolute quality scores." Status quo: traditional reinforcement learning from human feedback. It's interesting that it chose reinforcement learning from human feedback as the starting point; I guess it did that because that's probably how DeepSeek framed it in the paper introducing the concept.
But this is good. Next: find a YouTube video by Shaw Talebi on this and explain it; it might be called "How to Train LLMs to Think." Let's make sure the YouTube transcript fetcher works. So it's going to do a web search: Shaw Talebi, how to train LLMs to think, YouTube, GRPO. Hmm, now it's going off the rails a little, so let me see if I can help it out. The idea of the transcript tool is that it can digest transcripts when it's doing a deep dive into a topic, but here I just want to test whether it actually works. Okay, there we go. It worked: it ran the script, and that is the transcript. And now, I guess, it's going to explain the video.

Okay, so that's basically it. Again, all the instructions and the code for this are freely available on GitHub, linked in the description. If you have any questions about skills or anything else I talked about in this video, please let me know in the comments section below. And as always, thank you so much for your time, and thanks for watching.