LongCut logo

There Are 4 AI Skills in 2026. You're Using 1. The Last 3 Separate 10x Users From Everyone Else.

By AI News & Strategy Daily | Nate B Jones

Summary

Topics Covered

  • Chat Prompting Now Obsolete
  • Context Engineering Creates 10x Gap
  • Four Prompting Disciplines Stack
  • Specification Engineering Enables Autonomy
  • AI Prompting Sharpens Human Leadership

Full Transcript

If you're prompting like it's last month, you're already too late. And I'm

not just doing that for clickbait. If

you haven't updated how you think about prompting since January 2026, you're already behind. Opus 4.6, Gemini 3.1

already behind. Opus 4.6, Gemini 3.1 Pro, and GPT 5.3 codecs have all shipped in the past few weeks with autonomous agent capabilities that make the

chatbased prompting most people are practicing functionally obsolete for serious work. These models don't just

serious work. These models don't just answer better. They work autonomously

answer better. They work autonomously for a long time, for hours, for days against specs without really checking in. That changes what good at prompting

in. That changes what good at prompting means on a fundamental level. And it's

time to revisit how we think about prompting as a result. Not because

prompting stopped mattering. It actually

matters more than ever, but because the word prompting is now hiding four completely different skill sets, and most people are only practicing one of them. And the gap between the people who

them. And the gap between the people who see all four of them and the people who don't is already 10x and widening. In

this piece, I'm going to lay out what those four skills are, why the distinction matters now, and exactly how to build the skills you're missing. This

builds on my earlier work on intent engineering, but it goes way beyond it to lay out a full framework for how to think about prompting post February 2026. Intent engineering is just one

2026. Intent engineering is just one layer in a larger stack. This is the full stack for prompting post February, post these new autonomous models. First,

what changed? The prompting skill that mattered since 2024 has been really conversational. You sit in a chat

conversational. You sit in a chat window, you type a request, you read the output, you iterate, you get better at phrasing things, you provide examples, you structure instructions. If you're

good at that, and if you've been following this video series, you probably are. You've been building real

probably are. You've been building real skills. They work. you're faster than

skills. They work. you're faster than you were a year ago. But that

fundamental chatbased skill has a ceiling. And in early 2026, a lot of

ceiling. And in early 2026, a lot of people are hitting it because the models have stopped really being chat partners and started being workers. Workers that

run for a long time. I'm not kidding when I say days and sometimes weeks. And

the thing about a worker that runs for a long time is that everything you relied on in a conversation, like your ability to catch mistakes in real time, your ability to provide missing context

accurately when the model asks, your ability to course correct when things drift. All of that must be encoded

drift. All of that must be encoded before the agent starts. Not during the course of a conversation, but at the top. This is a fundamentally different

top. This is a fundamentally different skill. It's not a harder version of the

skill. It's not a harder version of the same skill. It's actually different.

same skill. It's actually different.

I've talked before about the importance of thinking about prompting even in the chat window as providing the relevant context for the LLM to provide you an

accurate response. But this goes way

accurate response. But this goes way beyond that. If you were giving an agent

beyond that. If you were giving an agent a longunning task, which is where most of AI is going, even if you're not a coder, then you have to think not about

how do I build for a chat response, but how do I build for economically real work that this agent will do and provide the agent the relevant context for that.

And this shift is happening really quickly. Between October of 2025 and

quickly. Between October of 2025 and January of 2026, in just 3 months, the longest autonomous cloud code sessions nearly doubled, and they've doubled again since then. Agents are running

into the hundreds and thousands in production systems at major companies.

And this is just from publicly available information. We have Telus reporting

information. We have Telus reporting that they have 13,000 custom AI solutions internally. We have Zapia

solutions internally. We have Zapia reporting that they have over 800 agents internally. Look, whenever they release

internally. Look, whenever they release a press release, you got to assume that company feels behind and is releasing something to help them feel better about themselves. The companies that are

themselves. The companies that are really serious about AI don't feel the need for press releases and have an order of magnitude or more agents. So,

this is not about a world that is coming. This is about a world that has

coming. This is about a world that has landed. But this still might not feel

landed. But this still might not feel concrete enough for you. So, let me talk to you about a random Tuesday. Two

people sit down with the same model, same subscription, same context window.

One of them uses 2025 skills. You type a request, you get back something that is 80% right. You spend 40 minutes cleaning

80% right. You spend 40 minutes cleaning it up and you have a good use of AI because you saved it a lot of time. It

might have saved 50% on your time, might have saved 60%, maybe higher. Let's make

the gap concrete. Let's say two people are sitting down with the same model on a Tuesday morning in February of 2026.

Same subscription, same context window.

The only difference is that one of them is using 2025 prompting skills and one of them is using 2026 prompting skills.

So the 2025 person types a request and they're asking for a PowerPoint deck, right? They get back something that's

right? They get back something that's about 80% correct. Maybe there's some formatting issues. Maybe the font has

formatting issues. Maybe the font has some collisions in it. Maybe there's

some styling issues. They spend about 40 minutes cleaning it up, but they're pretty happy because this deck would have taken two or three hours. That's a

2025 prompting skull application. it

would have been good in 2025.

Person B sits down with 2026 prompting skills. They write a structured

skills. They write a structured specification in 11 minutes. They take

longer to prompt. Then they hand it off to the same chatbot, but they're thinking of it and using it as an autonomous agent. They go to make

autonomous agent. They go to make coffee. They come back to a completed

coffee. They come back to a completed PowerPoint that hits every quality bar defined up front. And they're able to do this for five other decks before lunch.

In other words, they are now doing a week's worth of work in a morning easily. Same model, same Tuesday, 10x

easily. Same model, same Tuesday, 10x gap. And if you want to replicate this,

gap. And if you want to replicate this, you can replicate this experiment directly in Claude Opus 4.6 in the co-work model, which is available on Windows and Mac. And you can see exactly

how this plays out. And this did not happen because the person with 2026 prompting skills is smarter or because she's more technical. It's because she's practicing a different skill than person

one and person one doesn't know that kind of prompting skill exists. I think

it's worth paying attention to Shopify CEO Toby Lutkkey here. Unlike most CEOs, Toby is a technical guy and he does not just dig into AI from a LinkedIn

perspective. He has a folder of prompts

perspective. He has a folder of prompts that he runs against every new model release and he really deeply thinks about how new model releases change his workflow. He uses the term context

workflow. He uses the term context engineering because he believes the fundamental skill that we're all facing is the ability to state a problem with enough context in a way that without any

additional pieces of information, the task becomes plausibly solvable. I think

that's a really elegant way to describe what person B did in the example I just showed you. Person B put all of the

showed you. Person B put all of the information that the model needed to build a deck in one task very clearly defined and the model could just go to

work. This isn't about clever prompt

work. This isn't about clever prompt tricks. This isn't about magical words

tricks. This isn't about magical words that an AI can use to produce a better output. It's about a communication

output. It's about a communication discipline. Can you state a problem so

discipline. Can you state a problem so completely with so much relevant surrounding information that a capable system can solve it without going out and fetching more context? Can you make

your request as self-contained as possible? This is a really big deal

possible? This is a really big deal because it demands a much higher bar for communication from us humans than we're used to. And that's something that Toby

used to. And that's something that Toby called out when he reflected on the impact of AI on his own leadership style. One of the things he mentioned is

style. One of the things he mentioned is that by being forced to provide AI with complete context, he is now better at communicating as a CEO. His emails are tighter, his memos are better, his

decision-making frameworks are stronger, and Toby has gone farther than most people thinking about the implications of context engineering. I think one of his most provocative assessments is that

a lot of what people in big companies call politics is actually bad context engineering for humans. What he suggests is that essentially good context

engineering would surface disagreements about assumptions that are never surfaced explicitly but play out as politics and grudges in large companies.

And he says that happens because humans tend to be sloppy communicators who rely on shared context that doesn't actually exist. I think that's a really

exist. I think that's a really interesting thesis. I think one of the

interesting thesis. I think one of the implications of getting this February 2026 prompting lesson deeply ingrained in ourselves is that our communication humanto human is likely to improve and

our organizations are likely to have cleaner decisioning and cleaner communication even between humans as a result. So I bet you're wondering what

result. So I bet you're wondering what are these four disciplines? What does it mean when prompting becomes multiple skills that we need to learn? Well, I

don't want to hide the ball here. Here

is the framework that I would lay out that describes what prompting should be in February 2026. And I've built it to be futurep proof. So we look at the direction that agents are going, how

they're developing this year. These four

disciplines are going to matter even as we expect agents to continue to scale.

This represents a significant update on how I've taught prompting before 2026 because the way we prompted before 2026 was helpful as a foundation. You're not

losing something by having learned it, but it's not enough as agents get more capable. And I think we're due for a

capable. And I think we're due for a reset. So, fundamentally, prompting is

reset. So, fundamentally, prompting is the broad skill of providing input to AI systems so that they can do useful work.

Prompting has diverged into four distinct disciplines, but it's not taught that way. And so, I'm laying this out for the first time here. Each of

these disciplines is operating at a different altitude and time horizon. And

you need to understand them all to prompt well. Just because they're not

prompt well. Just because they're not often prompted this way, just because they're not often taught this way, doesn't mean that good prompters don't intuitively know this. What I'm taking

is intuitive knowledge that I see in excellent prompters and boiling it down into four key disciplines that you can practice and learn from. And these build on each other. If you skip one, I'm presenting them in order. If you skip

one, you're creating the kind of failures we tend to see at scale in the enterprise, but you're creating it for yourself in your own prompting. And I'll

kind of get at what that means and you'll get the idea. So discipline one is prompt craft. This is the original skill. This is the skill I have taught

skill. This is the skill I have taught and many others have taught for the last year or two. It's synchronous. It's

sessionbased and it's an individual skill. You you sit in front of a chat

skill. You you sit in front of a chat window. You write an instruction. You

window. You write an instruction. You

evaluate the output. Then you iterate.

The skill here is knowing how to structure a query. And I've talked a lot about this in the past. I'll rehash it briefly here. You must have clear

briefly here. You must have clear instructions. You must include relevant

instructions. You must include relevant examples and counter examples. You need

to include appropriate guard rails. You

need to include an explicit output format. And you should be very clear

format. And you should be very clear about how you resolve ambiguity and conflicts. So the model doesn't have to

conflicts. So the model doesn't have to make it up on the fly. This is what anthropics prompt engineering documentation covers. Open AI talks

documentation covers. Open AI talks about this. Google talks about this.

about this. Google talks about this.

It's on a thousand blog posts and LinkedIn courses. Prompt craft has not

LinkedIn courses. Prompt craft has not become irrelevant. Don't hear that. It's

become irrelevant. Don't hear that. It's

just become table stakes. It's sort of the way knowing how to type with 10 fingers was once a professional differentiator and now it's just table stakes. It's just assumed. If you can't

stakes. It's just assumed. If you can't write a clear, well ststructured prompt in 2026, you're the person in 1998 who couldn't send an email. Is it important

to be able to do it? Yes. Is it going to differentiate you in the workforce? No,

not really. The key shift is that promptcraft was the whole game when AI interactions were synchronous and sessionbased. You wrote something, you

sessionbased. You wrote something, you got something back, and you refined it in real time. As a human interacting with that model, you were acting as the intent layer, as the context layer, and

as the quality layer so that longunning tasks could get done. You did all the breaking out of those tasks. That model

of prompting broke the moment agents started running for hours without checking in. Discipline 2 we've been

checking in. Discipline 2 we've been talking about for a few months now. It's

called context engineering. Enthropic

published the foundational piece on this back in September of 2025, but there's a lot of other good pieces out there as well. I've written a fair bit on it. I

well. I've written a fair bit on it. I

define context engineering as the set of strategies for curating and maintaining the optimal set of tokens during an LLM task. And that's not just me defining

task. And that's not just me defining that. That's a pretty commonly held

that. That's a pretty commonly held definition. Wang Shane's Harrison Chase

definition. Wang Shane's Harrison Chase was even bluntter about what context engineering is during a recent Sequoia Capital interview when he said everything is context engineering. It

actually describes everything we've done at lane chain without knowing the term existed. So that's actually somewhat

existed. So that's actually somewhat dangerous because context engineering is only one of four levels and people have misunderstood it to mean everything. And

one of the things I'm trying to get us to move toward is a world where we understand context engineering as a specific skill where we're providing relevant tokens to the LLM for inference. And yes, it is certainly

inference. And yes, it is certainly foundational. It is certainly

foundational. It is certainly significant. It is where the industry's

significant. It is where the industry's attention is focused today. It is the shift from crafting a single instruction to curating the entire information environment and agent operates within.

all of the system prompts, all of the tool definitions, all of the retrieved documents, all of the message history, all of the memory systems, the MCP connections. The prompt you write might

connections. The prompt you write might be 200 tokens. The context window it lands in might be a million. Your 200

tokens are 002% of what the model sees.

The other 99.98% that's context engineering. This is the discipline that produces claw.md files,

agent specifications, rag pipeline design, memory architectures. It's the

discipline that determines whether a coding agent understands your project's conventions, whether a research agent has access to the right documents, whether a customer service agent can retrieve relevant account history.

Anthropics engineering team identified the core challenge precisely. LLM

degrade as you give them more information. That's correct. They do.

information. That's correct. They do.

And the point therefore is to include relevant tokens because the issue is not that they can't hold the tokens. It's

that retrieval quality does drop as context grows. The practical implication

context grows. The practical implication is that people who are 10x more effective with AI than their peers are not writing 10x better prompts. They're

building 10x better context infrastructure. Their agents start each

infrastructure. Their agents start each session with the right project files, the right conventions, the right constraints already loaded. The prompt

itself can be relatively simple because the context does the heavy lifting. I

have seen this for myself as I've built out my own context engineering scaffolding. And if you're wondering how

scaffolding. And if you're wondering how to do it for yourself, I'm putting a guide together with this video over on the Substack and I think it'll be helpful. Discipline number three. This

helpful. Discipline number three. This

one we don't talk a lot about. I think

is where we're going as these agents start to do much longer autonomous running tasks. I wrote about this one at

running tasks. I wrote about this one at length in a prior piece. I did a video on it. I'm going to be brief here. I'm

on it. I'm going to be brief here. I'm

going to contextualize where it sits in the stack. Context engineering tells

the stack. Context engineering tells agents what to know. Intent engineering

tells agents what to want. It's the

practice of encoding organizational purpose, your goals, your values, your trade-off hierarchies, your decision boundaries into infrastructure that agents can act against. Claro story is

the proof case I talked about. Their AI

agent resolved 2.3 million customer conversations in the first month, but it optimized for the wrong thing. It

slashed resolution times, but it didn't optimize for customer satisfaction. And

as a result, CLA got into big trouble and had to rehire a bunch of human agents and is still dealing with the customer trust aftermath. So, intent

engineering sits above context engineering the way strategy sits above tactics. You can have perfect context

tactics. You can have perfect context and terrible intent alignment. You

cannot have good intent alignment without good context, though, because the agent needs information to act on the intent. So these disciplines again

the intent. So these disciplines again they're cumulative. Another thing I want

they're cumulative. Another thing I want you to notice is that failure as we progress up this hierarchy is getting more and more serious. When you as an individual sit down and you screw up a

prompt, it might waste your morning at worst. When you as a human being sit

worst. When you as a human being sit down and screw up context engineering or intent engineering, you are screwing up for the entire team, your entire org, your entire company. The stakes get

higher. And because the stakes get

higher. And because the stakes get higher, our attention to detail matters and the value of the work we do increases commensurately. What I am

increases commensurately. What I am talking about when I talk about context engineering and intent engineering can be a full-time role at a big company.

And if it's not, it is a highstakes human skill that has a lot of transferable value. Level four is

transferable value. Level four is specification engineering. We're just

specification engineering. We're just starting to talk about this now, even though the best practitioners are already doing it. Specification

engineering is the practice of writing documents across your organization that autonomous agents can execute against over extended time horizons without human intervention. This is a level

human intervention. This is a level above everything I've described because all of the first three levels focused on how you prepare work directly for an agent. Specification engineering is

agent. Specification engineering is really about thinking about your entireformational corpus in your organization as agent fungeible, agent readable. Everything

you write has to be something the agent can access and do something with. It's

not really about prompting per se. It's

not about an individual agent's context window. It's not even about the intent

window. It's not even about the intent you've given agents. Specifications

are complete. They're structured.

They're internally consistent descriptions of what an output should be for a given task. They look at how quality is measured. Specifications are

a mindset you bring to your documents that allow you to apply agents across large swaths of your current company's

context with the confidence that what the agent reads is going to be relevant.

I think an interesting example from Anthropic actually comes from the team's struggles with the Opus 4.5 agent which is one generation ago now. They were

trying to build a production quality web app. But if you give the agent only a

app. But if you give the agent only a highlevel prompt like build a clone of claw.ai, the agent tries to do too much at once, runs out of context mid-implementation, and leaves the next

session guessing at what happened. The

fix, it turned out, was not a better model. It was specification engineering.

model. It was specification engineering.

It was a pattern that you could specify where an initial laser agent sets up the environment. A progress log documents

environment. A progress log documents what's been done and a coding agent then makes incremental progress against a structured plan every session. The

specification became the scaffolding that let multiple agents produce coherent output over days. So the shift from prompt to specification mirrors a transition that happened in human

engineering decades ago. When you're

building something small, verbal instructions and conversations work really well. When you're building

really well. When you're building something large enough to require a team or span multiple sessions, you need blueprints. Anthropic needed blueprints

blueprints. Anthropic needed blueprints in the Opus 4.5 example. And even though we've now moved to Opus 4.6, the need for specification engineering has not

gone down, it's gone up because Opus 4.6 can do even more work. That's true for Codeex 5.3. It's true for Gemini 3.1 Pro

Codeex 5.3. It's true for Gemini 3.1 Pro as well. The smarter models get, the

as well. The smarter models get, the better you need to get at specification engineering. Which is why I deliberately

engineering. Which is why I deliberately started this section zooming out and saying the entire org's document corpus should be viewed as a form of specification engineering. And yes, this

specification engineering. And yes, this is a fractal insight. You can also think about specification engineering for your individual agent task where you think

about what is the log that the agent has. How do we assign tasks across this

has. How do we assign tasks across this agent build? How do we make sure the

agent build? How do we make sure the agent has a clearly specified requirements list to work from? But all

of that gets way way easier to put together if you think of your entire organizational document corpus as specifications that are agent readable.

Your corporate strategy is a specification. Your product strategy is

specification. Your product strategy is a specification. Your OKRs are a

a specification. Your OKRs are a specification. Everything ends up being

specification. Everything ends up being a specification that your agent can use.

And that's different from context engineering because the art of context engineering is really about shaping the context window in a way that's relevant for the agent. Right? If you look at these four levels, the prompt is you and

the agent and you're working on crafting clear instructions. The context window

clear instructions. The context window is how do you shape relevant tokens.

Intent engineering is how do you communicate goals and objectives to the agent that allow the agent to work autonomously for long periods of time in a direction consistent with company

strategy. Specification engineering is

strategy. Specification engineering is how do you think about your entire corporate document structure, the knowledge, the context that makes the corporation work as a form of

specification. And yes, for individual

specification. And yes, for individual agent runs, how do you refine that specification that becomes something where you give the agent a good specification? It is going to start to

specification? It is going to start to keep the context window cleaner than before. And so these start to interplay,

before. And so these start to interplay, right? If you write good spec, if you

right? If you write good spec, if you have a good task log, if the agent understands what the spec is from the broader organizational context, they're less likely to go off the rails because

of intent engineering conflicts. They're

less likely to bloat out with bad context. And so all of these start to

context. And so all of these start to interplay, but the highest level is to think about specification as the way

your organization does business. You

specify the outputs you want. The agent

does the work. the outputs are produced.

That is the highest level description of what business is going to look like in the next couple of years and it starts with understanding how to specify. This

is where anthropics best practices documentation for cloud code becomes really revealing. The recommended

really revealing. The recommended workflow for complex features is relatively simple. Interview me in

relatively simple. Interview me in detail. Ask about technical

detail. Ask about technical implementation, UIUX, edge cases, concerns, and trade-offs. Don't ask

obvious questions. Dig into the hard parts. The agent then writes the spec

parts. The agent then writes the spec with the human. I think that that is an artifact of this moment in time. And I

think we will get to a point where the agent will only be asking us for places where the broader specification corpus is in conflict or ambiguous and we have

to talk about what it means to us for this task to be accomplished. Well,

because the entire organizational infrastructure is going to be agent. So

the practical skill going forward is not writing code. It's not crafting prompts.

writing code. It's not crafting prompts.

It's the ability to describe an outcome with enough precision and completeness that an autonomous system can execute against it for days or weeks. I hope

you're seeing here how that is a fundamentally different skill from writing a good prompt in a chat window.

And the people who are excellent at one of these layers are not automatically excellent at all of them. Context

engineers spend a lot of time thinking about how to compress tokens and get good tokens into context windows and how to keep bad tokens out. That is a different mindset than thinking about your information environment is agent

translatable, agent readable, agent fungeible. We have to have all of these

fungeible. We have to have all of these skills in order to effectively bring AI into the enterprise or even into a small business in 2026. And I will tell you

one-person businesses have the greatest advantage right now because if you are a oneperson business and you can just convert your notion to be agent readable, you're off to the races today.

There's no gigantic effort required to make all of your SharePoint agent readable. It's simple. It's easy. You

readable. It's simple. It's easy. You

just get it into notion and you're done.

This comes back to the core idea that in 2026, speed is going to matter because agents are going to keep getting better quickly. What we have now as days and

quickly. What we have now as days and weeks is going to become weeks and months by the end of the year. And the

corresponding impact of getting specification engineering correct is going to be even higher. The

corresponding impact of getting all four levels translated into specific roles, people who are responsible, DRIs, teams who handle this, that's going to be even more valuable in 2026. And so what I

mean by that is that if you are at a large company, you should have people who are doing context engineering and that's all they're doing. You should

have people who are doing specification engineering and thinking about how agents can read the enterprise. You

should have people who are thinking about intent engineering and how you translate goals into a set of objectives that an agent can read and value and a

set of verifiable guardrails the agent can follow. Look, the mental model most

can follow. Look, the mental model most people carry is that prompting is good instructions for the AI and that fails for a very specific reason. And I hope you're seeing it here. That entire model

assumes synchronous interaction. In the

synchronous AI human partnership model, you're always there at the computer. You

see the output in real time. You correct

mistakes right away. You provide

additional context when the model asks or when you notice it going off track.

longunning agents break every single assumption in that model. So if you've relied on the assumptions of synchronous prompting, you have a structural

vulnerability in the way you think about AI. You need to start thinking about AI

AI. You need to start thinking about AI as if your real time oversight is embedded in the specification before the

agent begins to work. The planner worker architecture that's dominating production agent deployments really reflects this reality. A capable model plans the work, decomposes it into

subtasks, defines the acceptance criteria, and assigns work out to models and then cheaper, faster models do the work. The planning phase, you could call

work. The planning phase, you could call it the specification phase, determines the quality ceiling. The planning phase, basically taking your specification and expanding it and enriching it and

breaking it out and planning against it.

That's what determines the quality of the work of the overall system.

Execution without that specification step produces broken work and it requires extensive human rework to be a value at all. So the shift from fixing it in real time, which is what we do

with a lot of prompting in the chat window to we must get the spec right up front changes your bottleneck skill.

Right? Real time prompting rewards verbal fluency. It rewards quick

verbal fluency. It rewards quick iteration. It rewards a good eye for

iteration. It rewards a good eye for output quality. Specification

output quality. Specification engineering rewards completeness of thinking, anticipation of edge cases, clear articulation of acceptance criteria, and the ability to decompose

really complicated outcomes into independently executable components.

Different people are good at these things in different ways. Some people

are going to be naturally exceptional at synchronous prompting and they're going to struggle with specification work. And

some people will be mediocre at chatbased interaction, but they might actually be excellent spec engineers. My

challenge to you is that you don't tolerate whatever your natural propensity is as your ceiling. Think of

this as a learnable skill and go after it. Now, you might ask, if we're going

it. Now, you might ask, if we're going after it, what are the foundational elements to learn? Specification is

ironically very vague. Nate, please tell me what you mean. I want to suggest to you that in 2026 we can define the primitives that go into good specifications in ways that are useful

for us to learn. And I'm going to go ahead and define them right here. I

think these are the foundation that we need to learn if we want to get better at specifying and we want to get better at the prompting skills that will matter in 2026 and beyond. Primitive number

one, self-contained problem statements.

This is actually Toby's insight, but it's not only his insight. It's been

primitive. Number one is self-contained problem statements. Think back to what

problem statements. Think back to what Toby said when he talked about the idea that we have to give the model everything it needs to do the work. Can

you state a problem with enough context that the task is plausibly solvable without the agent going out and getting more information? The discipline of

more information? The discipline of self-containment forces you to be clear.

It surfaces hidden assumptions. It makes

you articulate constraints you normally leave implicit because you trust the human on the other end to fill in the gaps. AI doesn't fill in gaps reliably.

gaps. AI doesn't fill in gaps reliably.

It fills them with statistical plausibility, and that's a polite way of saying it guesses in ways that are often subtly wrong. So if you're trying to

subtly wrong. So if you're trying to train this primitive, I would say take a request you would normally make conversationally, like update the dashboard to show the Q3 numbers and

rewrite that as if the person receiving it has never seen your dashboard, doesn't know what Q3 means in your org context, doesn't know what database to query, and has no access to any

information other than what you include.

That is the level of self-containment you should be challenging yourself with if you want to get better at this primitive. Primitive two, learn about

primitive. Primitive two, learn about acceptance criteria. If you can't

acceptance criteria. If you can't describe what done looks like, an agent can't know when to stop or more precisely, it will stop at whatever point its internal huristics say the

task is complete, which may bear no relationship to what you needed. This is

why the 80% problem is a big issue for agent system design. the specification.

Let's say the specification said build a login page when it should have said build a login page that handles email passwords, social ooth via Google and GitHub, progressive disclosure of 2FA

session persistence for 30 days and rate limiting after five failed attempts. If

you're training on this primitive, you want to get to the latter, not the former. For every task you delegate,

former. For every task you delegate, write three sentences that an independent observer could use to verify the output without asking you any questions whatsoever. If you can't write

questions whatsoever. If you can't write those sentences down, you probably do not understand the task well enough to give it to an agent. I have had that happen where I've been in a conversation

with an AI agent and I realize I don't know enough to delegate the task and I have to come back later. That's okay.

It's good to realize that before you assign the work. Primitive three is constraint architecture. Learn

constraint architecture. Learn constraint architecture. What the agent

constraint architecture. What the agent has to do, what the agent cannot do, what the agent should prefer when multiple valid approaches exist, what the agent should escalate rather than

decide autonomously. These four

decide autonomously. These four categories, the musts, the must notss, the preferences, and the escalation triggers form the constraint architecture that turns a loose

specification into a very reliable one.

The claw.md pattern that's emerging in the coding community is a practical implementation of constraint architecture. The best claw.md files are

architecture. The best claw.md files are not long lists of rules. They're

concise. They're extremely high signal constraint documents. Use these build

constraint documents. Use these build commands. Follow these code conventions.

commands. Follow these code conventions.

Run these tests before marking a task complete. Never modify these files

complete. Never modify these files without explicit instructions. The

community consensus is very strongly that every line in an file needs to earn its place. If you ask would removing

its place. If you ask would removing this line cause the AI to make mistakes and the answer is no it really wouldn't then kill the line. So if you want to train this primitive before delegating a

task you should be writing down what a smart well-intentioned person might do that would technically satisfy the request but produce the wrong outcome.

Those failure mos end up being your constraint architecture. Encode them.

constraint architecture. Encode them.

Primitive four is decomposition. Large

tasks need to be broken into components that can be executed independently, tested independently, and integrated predictably. This is software

predictably. This is software engineering's oldest lesson, modularity, but is being applied to AI task delegation. Anthropic's longunning agent

delegation. Anthropic's longunning agent harness splits every complex project into an environment setup phase, a progress documentation phase, and an incremental coding session, each

independently verifiable. You get

independently verifiable. You get similar task decomposition automatically inside codecs. A marketing content audit

inside codecs. A marketing content audit requires the same decomposition as a coding task. So I'm not just talking to

coding task. So I'm not just talking to engineers here. You would have to

engineers here. You would have to decompose your marketing content into quality scoring, gap analysis, recommendation generation, etc. If you want to train on the primitive of task

decomposition, take any project that you would estimate at a few days of work and decompose it into subtasks that each take less than 2 hours for you to do.

Have clear input output boundaries and can be verified independently of the other tasks. That is the granularity at

other tasks. That is the granularity at which agents work best and it's the granularity at which specification engineering tends to operate. Now, in

2026, you do not have to pre-specify all of those 2-hour tasks when you are writing a prompt, but you do have to

understand what all of those tasks are.

And you have to understand how to describe for a planner agent what done looks like and what decomposible pieces look like in such a way that the

planner agent can reliably break the work into those 50 or 60 subtasks. In

other words, your job increasingly is not to manually write the subtasks for the agent. Your job is to provide the

the agent. Your job is to provide the break patterns that a planner agent can use to break up larger work in a reliable executable fashion. That's a

level of abstraction even above decomposition and that is a lot of where we're going as agents start to run.

Primitive 5 is evaluation design. This

is critical to do not just in an individual level but at an org level.

Organizations need to think about every level of AI deployment in terms of eval.

How do you know the output is good? Not

does it look reasonable, which is how most people evaluate AI output. But can

you prove measurably consistently that this is good. If prompt craft is the art of the input, evaluation design is the art of knowing whether that input

worked. And in a world where agents can

worked. And in a world where agents can run for a really long time, eval design is the only thing standing between AI generated output that I can't use and AI

generated output we can really use asis.

If you want to train this primitive for every recurring AI task in your world, build an eval build three to five test cases with known good outputs and run them periodically, especially after

model updates. This is going to catch a

model updates. This is going to catch a regression. It'll build your intuition

regression. It'll build your intuition for where models fail. It will create institutional knowledge about what good looks like for your specific use cases, your team, you, your org. You need to be

doing this systematically. If you're

listening to all this and you're wondering where to start, I gave you those four layers for a reason in order.

Start by closing the prompt craft gap.

Most people are worse at basic prompting than they think. You should be rereading prompting documentation. I've written a

prompting documentation. I've written a bunch on the Substack. I'm writing more for this piece. You should do interactive tutorials. You can head over

interactive tutorials. You can head over to AI Cred and see where you are on prompting. You should be building a

prompting. You should be building a folder of tasks that you do regularly, writing your best prompt against each one and saving the outputs as your baseline and then revisiting them over

time. Take prompt craft seriously.

time. Take prompt craft seriously.

Second, once you feel like you start to have a handle on that, start to build your personal context layer. You should

be writing a cla.markown equivalent for your work. I don't care if you use

your work. I don't care if you use claude. You still need to have an idea

claude. You still need to have an idea of your goals, your constraints, your communication preferences, your quality standards, the institutional context that a new team member would need 6

months to absorb written down. Start AI

sessions by loading this context. The

difference in output quality should be immediate and obvious. Then get into specification engineering. Take a real

specification engineering. Take a real project, not a toy problem, and write a specification for it. Then start to get into intent infrastructure. This is an organizational layer. So if you manage

organizational layer. So if you manage people or systems, you can start encoding the decision frameworks your team uses implicitly. If you are an individual contributor, you should be

encoding the decision frameworks you understand and trying to be a champion that pushes for this at the organizational level. A lot of teams

organizational level. A lot of teams like to talk about adopting AI. We'll

talk about it in terms of building intent infrastructure. Talk about what

intent infrastructure. Talk about what good enough looks like for each category of work together. Talk about what gets escalated by AI versus what AI can decide. Write it down, structure it,

decide. Write it down, structure it, make it available to agents. And then

practice specification engineering. Take

a real project, not a toy problem. Write

a spec for it before touching AI. Talk

about acceptance criteria, constraint architectures, decomposition, etc. Hand that spec to an agent and see what comes back. And oh yes, from an org

back. And oh yes, from an org perspective, start to think about every doc you touch as a spec that the agent will need to read and operate against.

Your org is a system of business processes, even if you're a team of one.

And those business processes should be agent readable and they should be speckable. This has some of the

speckable. This has some of the downstream implications that Toby talked about where he said a lot of organizational politics is just bad context engineering at the org level. If

we practice better specification engineering for our documents, we will expose a lot of the implicit assumptions that we end up being political about inside orgs and we will start to make

those agent readable and they will start to become fungeible and we will start to have fewer issues. Practicing

specification engineering is a way for us to clearly describe intent at organizational scale and clearly translate that intent in a way that agents can read it. And yes, it nests

down to individual agent runs and it ladders up to the full organizational context. And that's why it's the last

context. And that's why it's the last and most difficult skill to learn. The

progression here from prompt craft all the way up to spec engineering is not a ladder where you can abandon lower rungs. It's a stack where each layer

rungs. It's a stack where each layer makes the layers above it possible. You

cannot write good spec if you can't write good prompts. You can't build effective agent systems if you don't understand context engineering. You

can't align agent behavior with organizational goals without understanding how intent works and how that plays into context engineering.

They all go together. There's a final dimension to this that goes beyond AI, and I want to spend some time on it. I

hinted at it when I talked about the idea that Toby found that he communicated better when he got better at prompting. The best human managers

at prompting. The best human managers that I've worked with already operate with that degree of clarity. They give

complete context when they delegate.

They specify acceptance criteria to their team members. They articulate

constraints. They're effectively

following the four disciplines of AI input with their people. And that makes for effective leadership. What's

happening right now, I think, if we step back, is that AI is enforcing a communication discipline that the best leaders have always practiced intuitively and now everyone needs it in

order to be effective. You cannot just rely on shared context with the machine.

You cannot just assume that AI will know. And that is something that is a

know. And that is something that is a gift to us because so many of our colleagues don't know what we mean either. How many times have you sat in a

either. How many times have you sat in a meeting where someone is referring to a document and you don't know what that document is and you're afraid to ask?

That is a wonderful example of the kind of poor communication quality that goes into human meetings. This is not a framing you'll see in a lot of how to prompt courses. I think it should be.

prompt courses. I think it should be.

The skill of providing highquality input to intelligent systems turns out to be a skill that's translatable for AIs and for humans. It turns out to be a

for humans. It turns out to be a fundamental skill of the agent age that benefits us as humans and how we work together. I think the people who develop

together. I think the people who develop these skills, this collection of skills around prompting for 2026 are going to end up being the leaders who will run

organizations where agents and humans both perform at their ceilings. And the

people who don't, the people who are stuck in 2025 prompting skills are going to wonder why their AI investments keep producing partial value. And meanwhile,

their human teams keep having alignment issues. The prompt by itself is dead.

issues. The prompt by itself is dead.

The specification, the context, the organizational intent. That is where the

organizational intent. That is where the value in prompting is moving toward because agents are starting to work for longer and longer periods and look in a lot of ways like junior employees. the

specification done right turns out to be just what clear thinking has always looked like really made explicit because machines don't let us be lazy about it and I'm really excited for the way that

kind of communication clarity can clean up our organizations and our humanto human communication as well. Good luck

with prompting the humans and agents in your life. Cheers. And yes, there's lots

your life. Cheers. And yes, there's lots more on this on the Substack. I think

this is one that needs a really complete guide. So, I wrote up a lot of extra

guide. So, I wrote up a lot of extra stuff for this so that you can dive into what each layer of learning means.

Loading...

Loading video analysis...