Turn Claude Code into Your Full Engineering Team with Subagents
By Cole Medin
Summary
Topics Covered
- Context Windows Kill Coding Agents
- Harness Transforms Agent into Full Engineer
- Linear Replaces Local Files as Truth
- Custom Workflows Beat Off-the-Shelf Harnesses
Full Transcript
Last month, I covered agent harnesses and why they're the next evolution for AI agents, especially for agentic coding. The idea here is simple: if we give too large of a request to our coding agent, even with a lot of context engineering, the agent is going to completely fall on its face. It's all about context management. Agents don't do well once you start to fill their context window; it's the most precious resource when we're engineering with them. And that's what brings us to the idea of an agent harness. It's really a wrapper of persistence and progress tracking that we build over our coding agent, so that we're able to string together multiple sessions with state management and a git workflow. It can get pretty elaborate, but it lets us extend how much we're able to send into the system at once. This really is the future of AI coding: if we're going to push the boundaries of what's possible with our coding agents, it's going to be with a harness as a wrapper. But there is a big problem here, because when we build a harness like this (this is Anthropic's, which we'll talk about in this video), we're trying to push the boundaries of our coding agent, essentially turning it into a full-on engineer.
But engineers do a lot more than just coding. They also communicate in a platform like Slack, giving us updates on their progress. They manage tasks in something like Linear or Jira. They maintain the GitHub repository. We need all of these things in the tool belt for our agent for it to be a true AI engineer. This diagram you're looking at is actually what I've built to show you right now. I've been experimenting with some ideas here: how can we take an agent harness and build a tool belt into it so that it can really be a full engineer? I'll show you how this works, and how you can extend it for yourself. Stick around to the end of the video as well, because I'll talk about how this really is the future of agentic coding, plus some big things that I'm working on personally. And of course, this entire harness is available as a GitHub repository, which I'll link in the description. I encourage you to try it out and even extend it yourself; I made it super easy to tweak all the different subagents that we'll talk about in a little bit and to connect to the different services. In
the readme here, there's a really quick setup guide. I'm also using Arcade; this is the platform that makes it super easy for us to connect to Linear, GitHub, and Slack through MCP, so I'll talk about that a bit more as well. Once you have this all set up, all you have to do to send the context into the harness is create an appspec. You can think of this like a PRD: it's all of the features that you want it to build autonomously in the harness loop. You want to take this appspec and use it as an example, so give it to your coding agent, because there is a specific format that works best for this harness.
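Purely as an illustration of what such a format could look like (this is my own sketch, not the schema Anthropic's harness actually uses), a machine-readable task list might be assembled like this:

```python
import json

# Hypothetical task-list shape for an appspec; every field name here is an
# illustrative assumption, not the harness's real schema.
tasks = [
    {"id": 1, "title": "Scaffold the project", "status": "pending", "depends_on": []},
    {"id": 2, "title": "Build the timer component", "status": "pending", "depends_on": [1]},
    {"id": 3, "title": "Add break and skip controls", "status": "pending", "depends_on": [2]},
]

# Serialize so an initializer agent could read the list back deterministically.
appspec_tasks = json.dumps({"tasks": tasks}, indent=2)
print(appspec_tasks)
```

The point is simply that a list with stable IDs and statuses gives the harness something unambiguous to check off between sessions.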
The biggest thing here is that we have our task list in a special JSON format. This is the official recommendation from Anthropic, because I've built my harness on top of Anthropic's harness for long-running tasks that they open sourced at the end of last year. That does mean I'm using the Claude Agent SDK to run this harness, but you can use your Anthropic subscription, so it's really cost effective, and the Claude Agent SDK is powering all of the harness experimentation I'm doing right now. For this app specifically, just to give you a really cool example of what this harness can build, I'm extending my second brain, yet another thing I've covered on my channel recently. I want to build a dashboard where I can paste in a bunch of research that my second brain has done, and it'll generate, in real time, a layout that's unique to the specific research I gave it, so I can glean insights really quickly. And boom, take a look at that: we have a beautiful TL;DR for this pretty extensive research document. It's like 2,000 words in total, and we can view the full thing as well. This is not a simple application. There is an agent behind the scenes deciding which components to generate in real time to customize the dashboard based on what we
pasted in. Using the harness to build this, it decided to create 44 tasks in total in Linear. I ran all of this already, so everything is done and we can see all the tasks here. We also have the progress tracker meta task: we need to hand off to the next agent session every time we go through the loop in the harness, so we need to let the next agent know what we just did so that it can pick up where we left off.
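To make the handoff concrete, here is a hypothetical sketch of the kind of note one session might leave on the meta issue for the next. The section names and the function are my assumptions, not the harness's actual format:

```python
# Builds a hypothetical handoff comment for the "progress tracker" meta issue.
def handoff_note(session: int, completed: list[str], next_up: str, notes: str) -> str:
    done = "\n".join(f"- {task}" for task in completed)
    return (
        f"## Session {session} handoff\n"
        f"### Completed\n{done}\n"
        f"### Next up\n- {next_up}\n"
        f"### Notes for the next session\n{notes}\n"
    )

note = handoff_note(
    2,
    ["Timer component", "Break controls"],
    "Settings panel",
    "End-to-end tests live under tests/e2e; run them before committing.",
)
print(note)
```

Whatever the exact format, the value is that the next session's fresh context window can rebuild its bearings from one short document.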
It's also managing the GitHub repository. We've got pull requests, and it's making a commit for every single feature that it built, which is really cool. You can tweak this to your heart's content as well. And we're providing updates in Slack: for the sake of simplicity, I just have it message me after the first and second sessions, and then when my application is fully complete, so I can come back to my computer and test everything myself, just like you would when reviewing the output from a real engineer. So we have everything managed in Linear, everything in GitHub, and then it lets us know when things are done. This is just beautiful to me. And by the way, I want you to know that this is just the starting point for a harness. A lot
more work is happening on top of this, and there are a lot of ways you could extend it as well. Another really good example: you could build the harness to watch 24/7 for any issues that you create in Linear, and it would pick those up automatically. So you can change the way that you interact with this harness; the sky is really the limit for the way that you build it into these tools. You could even have it work with GitHub issues, or add in some other platform you use like Asana or Jira. It's entirely up to you. All right, so with that, let's
now get into running this harness. We'll even do a live demo on a simpler application, and then of course I'll show you how this all works; I want you to learn from this and see how you can extend it yourself. Like I said earlier, the readme is really easy to follow. You just set up your virtual environment, make sure you have Claude Code installed, and make sure you've logged in, because this harness is going to use the same subscription that you have with Claude Code. So, really easy there. The main thing that I want to cover right now is setting up your .env. Arcade is our ticket here to connect super easily to Linear, Slack, and GitHub. That's why I wanted to include it: we don't have to set up all of the individual MCP servers (though you could change this harness to use those directly if you want). Arcade has a free tier, and they also implement what's called agent authorization, which walks us through the OAuth flows really easily with these different services. We could even share this harness with our team members through our Arcade MCP gateway: they don't have to create a new Linear API key and a new Slack app, and we don't have to share those credentials with them either. So it's a really, really powerful platform.
Once you're signed in on the free tier, you just create your MCP gateway. You give it a name, description, and LLM instructions. For the authentication, set it to Arcade headers. And then for the allowed tools, look at this. Boom. We've got GitHub. I'll search for Linear, and there we've got Linear. And then finally, Slack. It is that easy to add in all 91 tools. By the way, we are using the new tool discovery for MCP in Claude Code, so it's not like we're just dumping 91 tool definitions directly into our coding agent; that would not be context efficient. And so, there we go. You can create this; I'm just going to use the one I already have. Copy your URL, because you set that as one of your environment variables, and then grab your API key from the dashboard as well. That easy to get everything set up. Then just use your email here. We can also configure the specific GitHub repo that the harness leverages; generally what I do is create an empty repo and then add it in here. You can define a Slack channel for updates too, and you can even change the model that each of our subagents uses for coding, Linear, and GitHub, so we can make things really cost effective or just really fast. For example, we just want to create things in Linear really quickly, so let's use Haiku for that model. Do all that configuration, and then run the authorize-arcade script. You only have to do this one time, because it goes through the OAuth flow, and the harness then has access to your Linear project, your Slack channel, and the GitHub repo that you're working in. With all of that taken care of, we can run our harness.
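Pulling the settings above together, a hypothetical .env could look something like this; the variable names are my own illustration, so check the repo's example env file for the real ones:

```shell
# Arcade MCP gateway (URL and API key come from the Arcade dashboard)
ARCADE_GATEWAY_URL=https://your-gateway.example.arcade.dev/mcp
ARCADE_API_KEY=your-arcade-api-key
ARCADE_USER_EMAIL=you@example.com

# Services the harness works in
GITHUB_REPO=your-username/your-empty-repo
SLACK_CHANNEL=harness-updates

# Per-subagent models, e.g. Haiku for cheap Linear operations
CODING_MODEL=sonnet
LINEAR_MODEL=haiku
GITHUB_MODEL=haiku
```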
Just a single command is all we need to send our appspec into the harness. Make sure you have your appspec fully fleshed out with the help of your coding agent, because, looking at the first prompt here, the one sent to our initializer agent, it's going to read the appspec to understand what we're building. This is the single source of truth initially, before we have everything set up in Linear. Now, I'm using WSL here because subagents don't actually work that well on Windows with the Claude Agent SDK, so use WSL, Mac, or Linux to run this. I'm going to activate my virtual environment here, if I can type. There we go. All right. Then I'll run the command to kick off the agent, and I'm just going to specify the directory here, so it's going to create this from scratch in the generations folder; that's the default location for all of the projects it creates. I'll send this off, and it's going to kick off the initializer agent to scaffold everything for our project: Linear, the GitHub repo, and the initial configuration for our codebase. I'll come back once it's done with some of that. All right, take a look. So
it delegated to the Linear agent to get things set up for us. It starts the project initially, and now it's building all of these tasks. If I go to my projects here, we've got our new Pomodoro timer project, and if I go to the issues, there are six right now. It's going to create more and more, though it'll probably only need six for this, because it's a really simple application: it created five to build out the app, plus the meta project progress tracker. That tracker is where we're going to record our progress over time as we hand off between the different sessions of the harness. So all the setup is done in Linear, and now it's moving on to initializing the Git repository, calling the GitHub subagent for this. Remember, we're using subagents for context isolation, so we're not bloating the main context window for our primary orchestrator. There's going to be a lot that it does here, and it'll go on for a while, so while we wait I'm going to go back to our diagrams, because I want to show you exactly how this works. I think the diagrams are a much better visual than just watching the logs as it runs. And of course, I'll show you the project once it's done, but let's cover this in the meantime. So, going to the original
harness here, I want to talk about what Anthropic built, to set the stage for how I've improved it to create our full AI engineer. We start with the appspec as the primary context that goes into our initializer agent. Most of the harnesses that I've seen over the past few months always start with an initializer, because before we get into the main loop of implementing all of the features that we have in Linear (or, in this case, our local feature list), we need to set the stage for our project; we need something to create those features in the first place. And I don't know if you saw that blip there for a sec, but it actually popped up the browser, because it was validating our code behind the scenes with Playwright. So anyway, our initializer agent here creates the feature list, which is everything we have to knock out that we laid out in our appspec. It creates a way to initialize the project, and it scaffolds the project and the git repository. These are the core artifacts that we have after the initializer runs. We have the source of truth for everything that has to be built, and our coding agent, as it knocks out the features, will go back and update it. This is our place to keep track of what we have built already and what we still have to build. Then, for the session handoff, we have a simple text file. I appreciate the simplicity of this harness, but I think there really is a big use case for having the agent work where we actually work, which is why I wanted to build this. But anyway,
I'll wrap up here with the coding agent loop. Every single time the agent runs, it runs in a fresh context window. The whole point of this agent being able to go for a longer time is that we're stringing together different agent sessions, and each one starts over so that we have fresh context. So it begins by getting its bearings on the codebase, reading the feature list: okay, what should we build next? It'll do regression testing, which is important for the reliability of the harness, because a lot of the time one agent will break what a different agent worked on earlier. After it validates that, it'll pick the next feature, implement it, then update and commit, which includes making the git commit and updating those two files as well.
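That loop can be sketched in plain Python; the function names and the feature-list shape below are illustrative stand-ins, not the harness's actual code:

```python
# Minimal sketch of a harness loop: keep launching fresh agent sessions
# until every feature in the persisted list is done.
def run_harness(features: list[dict], run_session) -> None:
    while any(f["status"] != "done" for f in features):
        # Each iteration is a brand-new session with a fresh context window;
        # its only bearings are the persisted feature list and handoff notes.
        run_session(features)

def fake_session(features: list[dict]) -> None:
    # Stand-in for one session: regression checks would run here, then the
    # agent picks the next unfinished feature, implements it, and commits.
    for feature in features:
        if feature["status"] != "done":
            feature["status"] = "done"
            break

features = [{"name": "timer", "status": "todo"}, {"name": "breaks", "status": "todo"}]
run_harness(features, fake_session)
print([f["status"] for f in features])  # → ['done', 'done']
```

The persistence lives outside any single session; each session only reads it, does one unit of work, and writes it back.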
What I've built is very similar; you can even see that I purposely kept the same architecture for the diagram. But there are some big differences, because of the service integrations and how I'm using subagents to orchestrate everything. We still start with the appspec going into an initializer agent, but now, like we saw in the logs earlier, it delegates to the Linear agent to set up the project in Linear and all of the issues. And then, just so our codebase knows which Linear project it's tied to, we also have a single local file. For the most part I'm avoiding local files (I don't have all of those files), but we need at least one file to point us to the right project ID. We'll also create the meta Linear issue, which replaces the Claude progress file, and then we'll create the git repo with our GitHub subagent. So now Linear is our source of truth instead of those local files. When each agent runs, it starts by reading the Linear project file so that it knows which project in Linear is ours. It calls the Linear agent to figure out, okay, which features should we validate? What should we pick up next? And we're using Arcade for authentication, so the agent has access to all of these services. Then it does the implementation, uses the GitHub agent to push, and we can also use the Slack subagent to give a progress update. We just loop over and over and over again until every single task in Linear is done. I've set it up so that most of the time it does just one task at a time, but if the agent figures out the work is simple enough, it might try to knock out multiple tasks in a single session. And this is all configurable in the prompts, which we'll get into as well. So the last thing I want to cover, while we wait for our harness to complete, is the architecture and how you can tweak things for
yourself. Every single agent that we have in this harness, for coding and for the different services, is controlled by the prompts in the prompts folder. When we create our agent, we're using the Claude Agent SDK, so we're defining everything in code; we're not using a .claude folder like you would with Claude Code. We have our system prompt loaded in right here, from this file. This is our orchestrator's system prompt, where we describe that we're building from the appspec, here are the subagents we have access to, and here's what our workflow looks like. All of that's
defined in the system prompt. Then, when we are in our very first session, that's when we use the initializer task. I'll show you in the code, and I promise I'll stay pretty high level: we check whether this is our first run, that is, do we have things initialized in Linear or not? If it is our first run, this function loads the prompt from this file, so we're controlling things with markdown files, just like you would with subagents in Claude Code. Otherwise, we load the continuation task, which is what we run every loop when we're going to build the next feature: we read the Linear project file, we know which Linear project we're working with, and we delegate to the Linear agent to figure out what we should work on next, everything I've already explained in the diagram. I'm just showing you how this maps to the prompts that we have for all the subagents, so that you can tweak all of this for yourself. You can connect more services, change how often it communicates in Slack, anything that you want to do.
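The first-run check just described might look roughly like this; the file and function names are my assumptions for illustration:

```python
from pathlib import Path

# Choose between the initializer and continuation prompts. In the real
# harness the signal is whether Linear has been initialized; here a missing
# local project file stands in for "first run".
def pick_task_prompt(project_file: Path, prompts_dir: Path = Path("prompts")) -> Path:
    if not project_file.exists():
        # First session: scaffold the project, Linear issues, and git repo.
        return prompts_dir / "initializer_task.md"
    # Every later session: read the project, pick the next feature, build it.
    return prompts_dir / "continuation_task.md"
```

Because both branches just resolve to markdown files, changing the harness's behavior is a matter of editing prompts, not code.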
The last thing that I want to show you is that instead of defining our MCP servers and our subagents in the .claude folder, we're doing it in our Claude Agent SDK definition. We're connecting to our Arcade MCP gateway and to the Playwright MCP server, the same kind of way you'd set up the configuration with something like Claude Desktop, for example. And then we have all of our agent definitions right here, imported from this file. It is super easy to add more subagents if you want, because for every agent we just give it a description (this is how our orchestrator knows when to call upon the subagent), we load the prompt from a file (for our Linear agent, we load it directly from the Linear agent prompt, which speaks to how we manage issues and projects and things like that), we list the tools it's allowed to use with the Arcade MCP, and then finally we set the model. The model comes from our .env; we can use Haiku, Sonnet, or Opus. And so we just build up these agent definitions. You can change the prompts, change the descriptions, or add another agent; it's very easy to configure, and it's all brought into our agent automatically. So it really is all of these markdown documents that define the entire flow, with the Claude Agent SDK as the wrapper around these different prompts, connecting everything together into a pretty elaborate system that's able to handle a lot.
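As a rough sketch of the pattern (the dictionaries below are illustrative; the Claude Agent SDK's actual definition objects and field names may differ), each subagent boils down to a description, a markdown prompt file, an allowed tool list, and a model:

```python
from pathlib import Path

PROMPTS_DIR = Path("prompts")

# Assemble one subagent definition; the field names here are illustrative.
def make_agent(name: str, description: str, tools: list[str], model: str) -> dict:
    return {
        "description": description,                       # how the orchestrator decides to delegate
        "prompt_file": PROMPTS_DIR / f"{name}_agent.md",  # markdown controls behavior
        "tools": tools,                                   # Arcade MCP tools it may call
        "model": model,                                   # e.g. haiku/sonnet/opus from .env
    }

agents = {
    "linear": make_agent("linear", "Manages the Linear project and issues", ["Linear_*"], "haiku"),
    "github": make_agent("github", "Handles commits, branches, and pull requests", ["Github_*"], "haiku"),
    "slack": make_agent("slack", "Posts progress updates to Slack", ["Slack_*"], "haiku"),
}
```

Adding another service is then just one more entry in this mapping plus a markdown prompt file.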
Going back here, we're not done quite yet, but we've finished three of the five issues for this simple Pomodoro timer app. I'll come back once everything is done, so we can see the full example now that you know how it all works and how you can extend it yourself. And here we go: the big reveal. The application that we've been creating throughout this video is complete. Interestingly enough, because this application was so incredibly simple, it decided to build everything in the initializer session, which I actually prompted it to do if it determined the app was simple enough, just to show you how dynamic this system can be. And of course, I showed you the more complex app earlier, where it did have to run many different sessions for 44 tasks. But yeah, our application looks really good. We can start it, pause it, and skip to our break. The Pomodoro technique is really awesome for productivity, by the way. We got our update in Slack that the project is complete, and it links to our GitHub repository, where we have six commits: one for the initialization and then one for each of our tasks. Of course, the tasks are all marked as done, with our progress tracking filled out as well. We started with the initialization, and this is the project-complete status at the end. Super, super cool. We
built this entire thing just during this video, as I was covering the code and our diagrams. So I want to end by talking about the future of AI coding with these harnesses and some things that I'm working on myself, because here's the thing: I hope that in following this video you're inspired to try this harness yourself, and even build on top of it; I hope I made that clear enough for you. But in the end, what's most powerful is building AI coding workflows and harnesses that are specific to your use case: exactly how you want to manage tasks, how you want to share context between sessions. I really believe that if you build your own optimized workflow, it's going to be way better than anything off the shelf. But there's nothing that really helps you build that right now, and it's such a powerful concept. So that's what I'm going to be working on. My open source project, Archon, is something I worked on a lot last year. It's my command center for AI coding, and it gained a lot of traction.
I know it doesn't have as many stars as something like OpenClaw, but I was really happy with the traction it gained. It's not as relevant a tool right now, though, because it was all about task management and RAG for AI coding. Task management is getting built into all these tools like Claude Code, and coding agents these days are so good at looking up documentation that RAG just isn't as important for coding specifically. So I want to keep the vision of Archon being the command center for AI coding, but I want to turn it into the n8n for AI coding: being able to define and orchestrate your own AI coding workflows and harnesses, so you can build something like this really easily but actually make it custom to you. That's what I'm working on behind the scenes right now. I know there haven't been a lot of updates to Archon, because I've been shifting the vision, but I'm super excited for it. So, if you appreciated this video and you're looking forward to more on AI coding and these harnesses, I would really appreciate a like and a subscribe. And with that, I will see you in the next video.