Agent Harness is All You Need
By Mayank Gupta
Summary
Topics Covered
- Harness Is the Real Agent Differentiator
- Frameworks Supply Parts, Harness Assembles Them
- Harness Skills May Disappear Into Models
Full Transcript
If you have spent even a little time on AI agents lately, you would have heard people throwing terms like agent harness, harness engineering, context engineering, sub agents, and whatnot.
And to be very honest, when I heard about agent harness, this whole concept sounded very vague to me as well. But
the more I read it, the more I tried to build an agent, I figured that this is one of the most important concept that you need to understand because otherwise you will not be able to make a difference between why some agent feel
magical whereas some agent feel like a piece of In this video we'll talk about why agent harness is important.
What all component goes behind building an agent harness as a concept. What are
Ralph loops? What is the difference between a framework and harness? [music]
And where this whole concept of agent harness and models is heading towards as far as my experience is concerned.
[music] So, without any further ado, let's get into it. So, let's address this first.
into it. So, let's address this first.
What is agent built of? You would say LLMs, right? And beyond LLMs, there is a
LLMs, right? And beyond LLMs, there is a scaffolding around the LLM that we call harness [music] is what agent built of. So, we can say models which are the LLMs plus this
scaffolding which we are calling harness across like around that model so that you can have a desired behavior out of that model. Eventually you call it
that model. Eventually you call it agent. Otherwise, if you give an out
agent. Otherwise, if you give an out input to a model, you can expect an output. But if you want that model to
output. But if you want that model to retain the memory of whatever you're talking to that model about. If you want that model to have that input once and execute the overall task that you're
asking it for in an agentic manner, that would not happen without any scaffolding in place, any harness in place.
In the same manner, harness is anything that gives capabilities to that model to go beyond its limitation. So, model is basically intelligence and whatever you
can make out of that intelligence in whatever way you want to make it useful for yourself so that you can make it an agent, that part is called harness. So,
it includes things like system prompts, it include things like giving it tools like MCPs, skills. It include things like file system, sandbox, giving browser access.
It even include things like you have orchestration features, you have sub agents running, you have the handoffs between the agent, and then you have the model routing as
well. Apart from this, hooks and
well. Apart from this, hooks and middleware, you have compaction, you have continuation, you have lint checks, everything that your agent can perform
beyond just the intelligence part is called the harness. So, all the code, all the execution logic, all the
configuration that goes behind the scene in an agent is actually harness. And the
intelligent layer is your model. I hope
that's clear. So, if we look at the agent in a figurative way, you have the model in the core layer which does the reasoning and decides. But to do all the reasoning and to make a decision, you
need some context. So, you have to put the context injection so that it can get some memory out of whatever is happening with it. You also give it system
with it. You also give it system prompts. And in order for the model to
prompts. And in order for the model to execute something and update whatever has happened, it has to have a file system access. It has to have a Git
system access. It has to have a Git access so that it can log the progress that it is making.
Apart from this, you need control over the context window that you have, otherwise it will get bloated and you will not get the desired results. So,
you need to have compaction feature in place. So, all of this is put together
place. So, all of this is put together so that your model can actually access the tools that you're giving it access to, it can actually make right reasoning behind whatever you want that model or
that agent to behave like. And
eventually you are getting those results, you are actually being able to also observe whatever that whole agentic system is doing. But this is it. This is
the whole harness thing. Like if you see everything around this model, you would find that this is the harness layer and in the middle you will find your actual LLM. Now, one might argue that why all
LLM. Now, one might argue that why all these components are required to build an harness and what is the use of these components? So, let's try to understand
components? So, let's try to understand that with a backward compatibility approach. We'll observe it in a way
approach. We'll observe it in a way that, okay, this is a required or desired feature from an agent and this is why we need this component. So, let's
say if you want your agent to have real access to the data so that when you are asking it to something, it can actually give you the right answer. So, for that you will give it a web search tool.
Let's say you want your agent to be able to write and execute the code, you will give it the bash access. You will get the code execution access.
If you want your agent to behave in a very safe environment so that it can actually make progress, you will give it sandbox access so that it can actually have those defaults in place. It can
make some progress in a very safe environment. It can also get the
environment. It can also get the feedback. It can also run the test in
feedback. It can also run the test in that safe environment itself. You want
your agent to remember and access new knowledge, then you have memory files.
Again, you have web search tool in the place. You have MCPs that you can use as
place. You have MCPs that you can use as a tool. Then you also want to maintain
a tool. Then you also want to maintain the performance of your agent over the long context. So, for that you will be
long context. So, for that you will be having the compaction feature in place.
You will be having tool offloading, skills in place.
Now, if you want your agent to behave across a long horizon in one single prompt, then you would be needing while loops. You would be needing techniques
loops. You would be needing techniques like Ralph loops. You would be needing planning and verification modes so that you can actually make your agent work in a specific way over the long horizon of
task execution.
So, this is it about the components that are there.
I hope you understand why we have given all these tools and like they there are things beyond this as well. It's just
that these are the most common components that makes up a harness layer. It depends on case-to-case basis
layer. It depends on case-to-case basis what your agent behavior is actually desired to be. Based on that you figure out what you have to give it as a tool.
Now, the most common example that you can take reference of while understanding this concept is your coding agents like Codex, Claude code,
there is Cursor, there's Pi Agent. All
of these agents are harnesses built on top of the models that you're going to use at the core.
So, all of these tools, all of these coding agents gives you access to whatever model you want to use, whatever type of beta open source model or any other model, you can use those models under
the hood. Whereas all the things that
the hood. Whereas all the things that happen when you give a prompt to that agent is being able to be done because of the harness layer, because of the
code that has been written on top of the model you are using. So, you would find Codex behaving in a different way, a Claude code behaving in a different way for the same prompt because both the harnesses are different. One of the
best, my favorite is Pi Agent. Big shout
out to Mario for that. And you should actually try out these tools so that you can figure out how changing the harness layer can actually make changes to the results that you eventually get. You use
a certain model with the same harness, you'll get different results. You will
use a certain model with the same harness, you will get maybe the best results. And
that's where all the comparison comes into the picture. The much better your model gets, your model evolves, and the best layer that you're trying to build for the harness, the better the results
will be. So, there is also a discussion
will be. So, there is also a discussion of what is the difference between a harness layer and then a difference between a framework like LangChain. So,
you can think of it like this that let's say you have to build a car, you have to make a car, and for that you have all the parts in place. The engine and
everything is in place. You just have to build like put them together to build your car.
Now, it's up to you if you want to build a automatic gear car, you want to build a manual a gear car, what kind of steering you want, what kind of
engine you are looking forward to to build your car. Everything is in place.
So, then framework is actually providing you all those tools.
As a harness layer, as a code execution layer, you have to make use of these tools. You want to put them together in
tools. You want to put them together in a way that you want that agent to behave like.
So, it could differ in every other case and it it should also differ because you actually want a agent to have a certain behavior and to have that behavior in place, you have to put together things
in a certain manner only.
So, that is a analogy that I wanted to portray here. Also talking about the
portray here. Also talking about the long horizon task that these coding agents are being able to perform today, some of the patterns that are very important to understand are called Ralph
loops and then you will also be hearing patterns like planning and verification.
When you give something as an input to a model, it eventually gives you the output right away. It doesn't go into a loop of doing things and then tell you something. So, this is nothing but a
something. So, this is nothing but a while loop. You give a input to your
while loop. You give a input to your agent, it gives an output, then it again tells you this is the new prompt. You'll
have to go through this. This again
gives a result, then it again tells you this is the prompt you have to give results to this until you have the desired result for that prompt. So, this
is one of the techniques, a harness pattern that you can use. In the same manner, there is planning and verification pattern as well where you give a prompt to your agent, your agent
then plans the overall task. It starts
executing that task. It verifies what has completed in the plan. It updates a plan. Then it takes a look into what has
plan. Then it takes a look into what has to be done next. So, it does the next task forward. It completes that task. It
task forward. It completes that task. It
again verifies with the plan, the original plan. It updates the plan, then
original plan. It updates the plan, then it goes back to what all is left. So,
this is again a pattern that you can put in place using your harness approach.
This is not something that models give you by default.
As a matter of fact, when we talk about models having these capabilities of planning and verification and things like Ralph loops that we have to put in place as a harness, this can actually be one reality also in
coming days where models will be evolving into having these capabilities as well. So, a major part of harness can
as well. So, a major part of harness can get consumed into what models contain today as an intelligence.
But, if we if you look at the larger picture, we still talk about prompt engineering. We still talk about context
engineering. We still talk about context engineering.
And in the same manner, I believe harness engineering and agent harness as a concept is here to stay. And I
[snorts] believe that if you can master yourself over how to harness your model, you can actually build some great agents. So, on this channel, you can
agents. So, on this channel, you can expect me to put a lot of great videos over how to actually have a great harness layer, how to actually go about building the system design for an agent
so that you can actually make a difference between how a agent behaves in a magical way and how an agent behaves in a fit way. I hope now you're able to
understand what an agent harness is. All
the clarity is there. If you still have questions, then put them in the comments down below and I'll give answers to each one of them.
If you like this video, subscribe to my channel and I'll see you in the next one. Till then, take care. Bye-bye.
one. Till then, take care. Bye-bye.
Loading video analysis...