
Continual Learning in Claude Code

By Developers Digest

Summary

Topics Covered

  • Agents Never Learn Without Skills
  • Progressive Disclosure Saves Tokens
  • Document Failures to Skip Errors
  • Skills Enable Continual Learning
  • Skills Compound as Team Memory

Full Transcript

In this video, I'm going to be going over continual learning within Claude Code. One of the big problems today is that if you write an AI agent, generally speaking, the process goes something like this: you write a system prompt, you add rules, constraints, tests, find edge cases, and you repeat this process until something actually works. The issue is that every insight you gain while building has to be manually encoded when you add it to your system prompt, whether you're doing that with AI tools or writing things by hand. One of the problems with this is that the agent never actually learns on its own.

What is the solution to this? Now, if you've used Claude Code before, you've probably heard of skills. A lot of people were excited about these for a number of different reasons: they're efficient with context, they're composable, they're portable, and they're discoverable. You can just put them on GitHub, and someone can download the markdown file, potentially some scripts, and it's really easy to have them be invoked. But one of the big unlocks with skills that I don't see enough people talking about is that Claude can read and write to them. What this means is that the model can actually improve them with every session. So you can set up a slash command to run a retrospective at the end of your coding session, where it goes through whatever happened and updates the particular skills that you were using. Additionally, you could encode this within something like your CLAUDE.md and have it happen automatically.

In terms of skills, if you're not familiar, you'll have a directory, and within that directory you'll have all of your different skills. In each skill's directory, the one thing that you do need is a SKILL.md. Within that skill, you can include other things for it to progressively disclose, like scripts, references, or other helpful assets that you want to leverage.

In terms of how you actually set these up within Claude Code, there are a number of places where you can put them. You can put them at the user level, so you can access them whenever you want. Additionally, you can have them at the project level, or within a plugin that you can share with others for easy installation.

Now, in terms of the format for skills, they're super straightforward. You give them a name and a description. The description is really important, because it's what sits in the context of the orchestrator model, the main thread, so it knows when to invoke the skill. So make sure you have a good description. Additionally, within skills you can grant particular tools, and you can reference other helpful files from within the skill. You can put all of that in the skill directory without loading all of its context into the SKILL.md file. It can go and reach for those things progressively.

The cool thing with this is that you're only spending a handful of tokens on the name and description within the main context window, and everything below that is only loaded once the skill is actually triggered. In terms of how skills work, the mechanism behind this is called progressive disclosure: Claude loads up the skill names and descriptions, matches a request against those descriptions, and asks for confirmation before loading the full skill.

Now, I'm not going to go through and show you how to set this up. It's really straightforward to set up slash commands; you can do a little bit of Googling, or you can just ask Claude Code itself to set them up. In terms of setting up a learning loop, it's quite straightforward too. I won't walk through the triggers and slash commands step by step, but effectively you query your skill registry before starting new work. It can surface the relevant past experiments, show known failures, and provide the working configuration. Then, at the end of the session, you run a retrospective. Once all of that is in context, you run an update process where Claude reads through the entire conversation and extracts what worked as well as what failed. You could set it up so it actually opens a PR if the skill lives in a registry, or you can have it write directly to the SKILL.md file, or to any of the other files within the skill directory.

Now, another helpful thing is that you can document failures. One of the things I did when I was setting up a project, Open Lovable, was spend an awful lot of time on the system prompt. It was basically a lot of "do this, don't do that," with me going through the cycle I mentioned earlier: writing a system prompt, trying different things, finding edge cases, and repeating. One thing about failures that I don't think a lot of people are capturing is that you can use them to inform which things to skip, because when you start a new session, the model won't have the context of all the things it does badly. It isn't necessarily intuitive to encode failures and keep them somewhere; that's something we don't typically want to do in software. But because large language models are non-deterministic, having some examples of where they can go off the rails is very helpful. And this goes both ways: examples of failures as well as successes can help improve these skills over time.
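One way to capture this, as a sketch (the section names and the entries here are hypothetical, not a fixed format), is a dedicated section inside the skill file that the retrospective appends to:

```markdown
## Known failures (skip these)
- Regenerating the whole config file for small edits: it clobbers user comments. Patch individual keys instead.
- Retrying a failed build with the same flags: if the strict mode run failed once, it fails again; drop the flag and re-run.

## Known successes
- Running the linter before the test suite catches most failures in seconds instead of minutes.
```

Because this section loads with the skill, a fresh session can skip dead ends that earlier sessions already hit.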

Now, I want to pull up a tweet from Robert Nishihara, the CEO of Anyscale, an inference provider. Here's what he said when skills came out; I'm just going to read through it: "The thing that excites me about Anthropic's agent skills announcement is that it provides a step towards continual learning. Rather than continuously updating model weights, agents interacting with the world can continuously add new skills. Compute spent on reasoning can serve dual purposes for generating new skills. Right now, the work that goes into reasoning is largely discarded after a task is performed. I imagine vast amounts of knowledge and skills will be stored outside of a model's weights. It seems natural to distill some of the knowledge into the model's weights over time, but that part seems less fundamental to me." Some of the key points he makes: there are many nice things about storing knowledge outside the model. It's interpretable; you can just read through the skills and correct mistakes. The skills and knowledge are plain text, so they're easy to update, and they should be highly data efficient in the same way that in-context learning is data efficient.

I think the key takeaway is that you can just read through a skill and edit it. If something in the natural language isn't what you want, you can see that and update it. That's much, much easier than retraining or post-training a model, where you don't necessarily know what's happening under the hood. With skills, you know exactly what the agent is doing, because it's written in English. One of the key insights here is that the knowledge stored outside the model's weights, in skills, is something we can read, edit, and share. And not to mention, every session's reasoning can compound into future skills. With continual learning you effectively create a flywheel, where the skill keeps getting better over time, learning from mistakes and adapting as things change: as the environment updates, as it needs to leverage different libraries, or whatever the skill is actually using.
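A minimal version of that retrospective write-back step can be sketched in shell (the skill path and the `## Learnings` section are assumptions for illustration, not a fixed Claude Code convention):

```shell
# Append a dated learning to a skill file so the next session starts with it.
SKILL=".claude/skills/deploy/SKILL.md"   # hypothetical project-level skill
mkdir -p "$(dirname "$SKILL")"
[ -f "$SKILL" ] || printf '## Learnings\n' > "$SKILL"
printf -- '- [%s] failure: npm ci breaks on Node 18; pin Node 20\n' \
    "$(date +%F)" >> "$SKILL"
```

In practice you would have Claude itself run this kind of update as part of a retrospective slash command, rather than maintaining it by hand.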

Now, in terms of getting started with skills, if you haven't tried them, there are a number of really great examples in Anthropic's skills repo, and I'll put a link in the description of the video. Skills are effectively just another way to leverage different tools, pass in different context with progressive disclosure, and, like I mentioned, enable continual learning. In terms of how you can use them, you can start with personal things: if there are tasks in your day-to-day job, or things you do personally, you can create custom skills very easily. Just write out natural language, equip the skill with particular tools, and have it learn over time to do whatever you want it to do. The other benefit is that you can have skills at the project level. What's nice about that is they live in the repo, and where that helps in a team setting is that when you share the repo and someone else is using a system that can leverage skills, they inherit all of the skills that are specific to the project.
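Sharing works because skills are just files in the repo. A sketch, assuming the project-level `.claude/skills/` location (the code-review skill itself is hypothetical):

```shell
# Create a project-level skill and commit it so teammates inherit it.
mkdir -p .claude/skills/code-review
cat > .claude/skills/code-review/SKILL.md <<'EOF'
---
name: code-review
description: Reviews diffs against this repo's conventions.
---
Check naming, test coverage, and error handling per docs/style.md.
EOF
# git add .claude/skills && git commit -m "Add code-review skill"
```

Anyone who clones the repo and runs a system that picks up project-level skills then gets this one automatically.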

Now, additionally, you can share skills via a plugin or a registry, if you want to set this up at the plugin level. What's neat with plugins is that they can bundle different MCP servers, skills, as well as hooks; you can effectively ship a whole config of different tools. Within the skills repo from Anthropic, there are a number of great skills you can build on top of. Just to give you some ideas: they have a front-end design skill, and they also have a web app testing skill. To see how these can be leveraged, if you're working on a web application and that skill is installed, you can say "test my application," and it will use all of those tools, things like Playwright or the Chrome MCP, and so on, to test your application.

Now, another way to leverage these learnings and continual learning, and this is a little outside of Claude Code per se, is to take the learnings and improve your system prompt, like I mentioned with the Open Lovable example of spending time writing a system prompt, or for any agentic system generally. As you capture failures and successes, you could even set up a system that opens a PR against your system prompt, or against your skills if you keep them in Git. There are a ton of really cool things you can do with this.

All in all, skills aren't just instructions. They're persistent team memory that compounds with every session. I don't see a ton of people doing really interesting stuff with this quite yet, but hopefully this video inspires some ideas for how you can leverage skills and continual learning within Claude Code as well as other agentic systems. Otherwise, that's it for this video. If you found it useful, please like, comment, share, and subscribe. Until the next one.
