Why You're Only Getting 25% of the AI You Paid For (Context Rot Explained)
By Tiago Forte
Summary
## Key takeaways

- **AI context windows have a hidden performance limit**: Despite advertised capacities of up to 1 million tokens, major LLMs like ChatGPT and Claude show reliable performance drops after around 50,000 to 100,000 tokens, meaning you're only getting 25-50% of the advertised capacity. [00:41]
- **Context rot: AI misses details due to sampling, not deception**: Context rot isn't necessarily deceptive marketing; LLMs sample from context and take shortcuts when overwhelmed by large amounts of data, leading to missed details rather than intentional omissions. [01:24]
- **Structure, not prompts, is key to overcoming context rot**: The solution to context rot isn't better prompts, but rather preparing information in more structured formats like markdown or JSON, which are closer to an LLM's native language. [03:25]
- **Break down large documents into focused, labeled chunks**: Instead of one massive document, break information into smaller, purposeful chunks, and only load the relevant documents needed for a specific query into the AI. [04:16]
- **Organize information using a system like PARA**: An overarching system like the PARA method (Projects, Areas, Resources, Archives) provides a consistent hierarchy to organize documents, making them more manageable for AI processing. [05:00]
Topics Covered
- Your AI's context window is far smaller than advertised.
- AI samples context; it doesn't 'read' like a human.
- Prepare data, don't just prompt, for AI success.
- Structure your data for AI's 'native language.'
- Organize information systematically for optimal AI focus.
Full Transcript
Consider this. You're uploading multiple
documents into ChatGPT, crafting
detailed prompts, adding tons of project
files, all to give it as much context as
possible, and yet somehow it keeps
missing the most important details.
If you're experiencing this as well, you
might assume that you need to give the
AI better instructions, or maybe you
think you have to give it even more
context. But what I've learned through
weeks of testing is that this is a
systemic, widespread issue. In the next
few minutes, we'll explore the research
that reveals the real performance limits
of AI. And finally, I'll walk you
through the approach that I've developed
to work with these limitations rather
than against them. You know those
impressive context windows that AI
companies love to promote? 200,000
tokens, 1 million tokens. A recent study
of 18 leading LLMs tested their
performance across different context
lengths and the results tell a story
that's very different from the marketing
materials. So when ChatGPT and Claude promise 200,000 tokens, the effective context window where you get reliable results is actually closer to between 50,000 and 100,000 tokens. In other words, in
every case, you're actually getting
somewhere between 25 and 50% of the
advertised capacity in terms of
dependable performance from major LLMs.
This is a systemic, widespread issue, and it already has a name: context rot. Now,
this isn't necessarily deceptive
marketing. These systems can technically
handle those larger contexts, but
whether they handle them well, that's a
different question entirely. Speaking of
these real performance numbers, if you
want to see exactly how all the major
LLMs actually stack up, not the
marketing promises, but the real limits
for your specific tasks, I've put
together a complete LLM context
performance guide. It shows you which AI
works best for contract analysis, for
meeting summaries, for research papers,
and eight other common tasks, plus the
actual effective token limits where each
model stops being reliable. Here's what
I find fascinating about this problem.
It reveals something fundamental about
how AI processes information. We tend to
think of AI as reading our entire
context the way a human might carefully
from beginning to end and somehow
memorizing everything that it reads all
along the way. But that's really not how
it works. See, the way that AI works is
it samples from your context, focusing
its attention on the parts that seem
relevant to your query. As the context
grows larger and larger, the sampling
becomes less precise. See, every LLM has
a budget that it has to work within. A
budget of time, electricity, server
capacity, tokens. Processing a large
amount of context really requires a
significant amount of all those
resources, which means that when the
system is overwhelmed, it takes
shortcuts. This understanding has
completely changed my approach to
working with AI. Instead of trying to
stuff in every last bit of context I
can, I've started thinking about how I can give it as little as possible while still allowing it to fulfill my goal.
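To make that budget concrete, here's a minimal sketch of a pre-flight token check in Python, assuming a conservative effective window of about 50,000 tokens (the low end of the range from the study above). It uses OpenAI's tiktoken tokenizer; other models tokenize differently, so treat the count as an estimate, not a guarantee.

```python
# A rough pre-send check: estimate tokens before pasting context into a chat.
# Assumes a ~50,000-token effective budget (the low end of the range above),
# not the advertised limit. Requires: pip install tiktoken
import tiktoken

EFFECTIVE_BUDGET = 50_000  # assumed reliable zone, not the marketed capacity

def fits_effective_window(text: str, budget: int = EFFECTIVE_BUDGET) -> bool:
    """Return True if the text likely stays inside the reliable zone."""
    enc = tiktoken.get_encoding("cl100k_base")  # OpenAI-style tokenizer
    n_tokens = len(enc.encode(text))
    print(f"{n_tokens:,} tokens against a budget of {budget:,}")
    return n_tokens <= budget

# Usage: check a document before loading it as context.
# if not fits_effective_window(open("company_overview.md").read()):
#     print("Split this document into smaller chunks first.")
```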
Most people, when they first discover context rot, try to solve it by writing better prompts or switching to different AI models altogether. But that's addressing the symptom, not the cause.
The real solution isn't changing how you
talk to AI. It's about changing how you
prepare information before giving that
information to AI. Here's what I've
learned works consistently. First,
instead of copying and pasting plain
text, use more structured formats, for
example, markdown or JSON. These are
closer to an LLM's native language, so
to speak, which means they can process
and absorb that text much more
efficiently. And you know what? You can actually have an LLM convert that text into these formats just by asking. Ask it something like, "Convert this into markdown format so it's easy for an LLM to parse." On the other hand, you should really avoid formats that are difficult for LLMs to understand, for example PDFs, which take a tremendous amount of processing power to absorb effectively.
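As a sketch of that conversion step, the helpers below just wrap the kind of prompt described above. `call_llm` is a stand-in for whichever chat client you actually use, and the prompt wording is an illustration, not a quote from the video.

```python
# A minimal sketch of the "ask the model to restructure it" step.
# `call_llm` is a placeholder for your actual chat client; swap in the
# real API call you use. The prompt text is illustrative.
import json

MARKDOWN_PROMPT = (
    "Convert the following text into clean markdown with clear headings "
    "and bullet points, so it's easy for an LLM to parse:\n\n{text}"
)

def to_markdown(raw_text: str, call_llm) -> str:
    """Have the model itself restructure plain text into markdown."""
    return call_llm(MARKDOWN_PROMPT.format(text=raw_text))

def to_json(raw_text: str, call_llm) -> dict:
    """Same idea for JSON, with a validation pass before reuse."""
    reply = call_llm(
        "Convert the following text into a single JSON object with "
        "descriptive keys. Return only the JSON:\n\n" + raw_text
    )
    return json.loads(reply)  # fails loudly if the model returned prose
```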
The second technique that I've found works well is to break large
documents down into focused, purposeful,
clearly labeled chunks. So instead of a
single master prompt in one gigantic
document with all the information on
your company, make a separate document
for each major part of the business.
Then when you're thinking of asking the
LLM for something, only load into the project files the one document or documents that you need to answer that
specific question. For example, if you
have a question about taxes, you only
need to load up the finance doc, not the
product doc. If you have a question about compensation on your team, there's really no need for the strategy doc.
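Here's a minimal sketch of that selective loading in Python. The file names and keyword lists are hypothetical; the point is just that a question about taxes pulls in the finance doc and nothing else.

```python
# A minimal sketch of "load only what the question needs." The file names
# and keyword map are hypothetical; in practice you'd tune these to your
# own document chunks.
from pathlib import Path

DOC_INDEX = {
    "finance.md":  {"tax", "taxes", "budget", "invoice", "revenue"},
    "people.md":   {"compensation", "hiring", "salary", "benefits"},
    "product.md":  {"feature", "roadmap", "release", "bug"},
    "strategy.md": {"vision", "goal", "positioning"},
}

def load_relevant_docs(question: str, base: Path = Path("docs")) -> str:
    """Return only the chunks whose keywords appear in the question."""
    words = set(question.lower().split())
    picked = [name for name, keys in DOC_INDEX.items() if keys & words]
    return "\n\n".join((base / name).read_text() for name in picked)

# "If you have a question about taxes, you only need the finance doc":
# load_relevant_docs("How should we handle sales taxes this quarter?")
# returns the contents of finance.md and nothing else.
```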
The third technique that I've found is to
organize these various documents that
you're going to be creating and
maintaining in an overarching system.
For example, my PARA method, which
gives you a simple, clear, consistent
hierarchy across both your work and your
life. I have a book, The PARA Method,
which explains how that works. And it's
sold over 30,000 copies worldwide. You
can find it wherever books are sold. So
here's an example of what this can look
like. In our shared Google Drive as a
company, we actually have the entire
business broken down into different
parts. So there's strategic planning,
things like goal setting, metrics, the
long-term strategy and vision. Then
there's enabling processes like HR, like
finance, the ongoing processes that make
the business run. And then there's core
processes, which are the core ways that
we deliver value to customers. If I
click into core processes, so one layer
down the hierarchy, we've divided up the
way that we deliver value to customers
into six stages. We go from creating
content in the market to generating
leads. We go from leads to making sales,
from sales to delivering our services
and training, from delivery to success,
actually making our customers
successful, from success to lead, so turning those successes through word of mouth back into more leads, and then from success to market, which is repurposing those case studies and testimonials back into the market in the form of content. So, drilling down one level further, if I click on, for example, delivery to success, we finally get to
the level of documents that could be
loaded into an LLM as project files. So
what you see here is several levels.
So level one is sort of providing the
big picture view. What is delivery to
success? Level two is what we call a
blueprint. It's getting a little more
into the weeds of what are the steps and
how are each of those steps measured and
delivered and verified. Then there's a
level three which is a step-by-step
guide for how to accomplish that task or
that process. And then there's even a
level four which is tools which is any
templates or other tools that we've
created. So can you see that by breaking down our entire business into SOPs, standard operating procedures, if I have a question about how to perform a certain process in our company, or how to improve it, or how to measure it, instead of dumping a gigantic, tens-of-thousands-of-words document into Claude and saying, "Try to find the specific thing I'm looking for," I can drill down as specifically as I need to and give it just the one document that has all the level of detail that it requires. And finally,
if we zoom out even further to the top
level of our shared Forte Labs Google
Drive, you can see that the entire
business, every document that we ever
need can be organized in an overarching
system, which is my PARA method, which stands for projects, areas, resources, and archives. In this case, it's Forte Labs projects, Forte Labs areas, Forte Labs resources, and Forte Labs archives. What's so powerful about
this is I can click in and see every
single one of our currently active
projects. In fact, anyone in the entire
business or even external collaborators
can see every single active project in
the business. Or I can go in and see the
different ongoing areas such as finance, insurance, marketing, operations, processes, product, etc. Or I can go
into resources, which is all the other
resources that we use as a company, from interview transcripts to email templates to membership support documents to testimonials to YouTube
documentation, etc. And then of course
finally the archives is everything else.
All the past projects that we've started
and completed or canceled or postponed.
All the old areas and processes that are no longer active and that we no longer use. And even just other random stuff that we don't want cluttering our workspace.
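To make the hierarchy concrete, here's a minimal sketch that scaffolds a PARA-style folder tree. The four top-level buckets come from the method itself; the example subfolders loosely mirror the walkthrough above, and the specific names and the drill-down path in the closing comment are illustrative, not the actual Forte Labs layout.

```python
# A minimal sketch of the PARA skeleton as a local folder tree. The four
# buckets are the method's own; subfolder names here are examples only.
from pathlib import Path

PARA_TREE = {
    "Projects":  ["website-redesign"],                     # active, time-bound work
    "Areas":     ["finance", "marketing", "operations"],   # ongoing responsibilities
    "Resources": ["email-templates", "testimonials"],      # reference material
    "Archives":  ["completed-projects"],                   # everything inactive
}

def scaffold_para(root: Path) -> None:
    """Create the PARA skeleton so every document has one obvious home."""
    for bucket, subfolders in PARA_TREE.items():
        for sub in subfolders:
            (root / bucket / sub).mkdir(parents=True, exist_ok=True)

scaffold_para(Path("Forte Labs"))
# A drill-down like the one in the video might then land at, e.g.:
# Forte Labs/Areas/operations/delivery-to-success/level-3-step-by-step.md
# That single focused file is what you hand the LLM as a project file.
```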
If you'd like to know more about PARA, I
actually have a complete series on the
method which you can find right here in
this playlist. The key insight here is
that AI works best when it can focus its
attention, which is exactly the same as
for humans. However, in order to focus,
the AI needs something that's completely
different from what humans need. It
needs highly structured data formats.
Instead of fighting against this limitation of AI, we can design our
approach around it. But just knowing
about the problem is only the beginning.
The true solution to context rot is a
whole emerging discipline called context
engineering. I've given you a few
initial tips in this video to get you
started, but if you want the full
picture, check out the next video that
walks you through the five levels of
context that transform AI into a
reliable thinking partner. And don't
forget to like and subscribe as I keep
exploring this topic.