Why You're Only Getting 25% of the AI You Paid For (Context Rot Explained)
By Tiago Forte
Summary
## Key takeaways

- **AI context windows have a hidden performance limit**: Despite advertised capacities of up to 1 million tokens, major LLMs like ChatGPT and Claude show reliable performance drops after around 50,000 to 100,000 tokens, meaning you're only getting 25-50% of the advertised capacity. [00:41]
- **Context rot: AI misses details due to sampling, not deception**: Context rot isn't necessarily deceptive marketing; LLMs sample from context and take shortcuts when overwhelmed by large amounts of data, leading to missed details rather than intentional omissions. [01:24]
- **Structure, not prompts, is key to overcoming context rot**: The solution to context rot isn't better prompts, but rather preparing information in more structured formats like markdown or JSON, which are closer to an LLM's native language. [03:25]
- **Break down large documents into focused, labeled chunks**: Instead of one massive document, break information into smaller, purposeful chunks, and only load the relevant documents needed for a specific query into the AI. [04:16]
- **Organize information using a system like PARA**: An overarching system like the PARA method (Projects, Areas, Resources, Archives) provides a consistent hierarchy to organize documents, making them more manageable for AI processing. [05:00]
Topics Covered
- Your AI's context window is far smaller than advertised.
- AI samples context; it doesn't 'read' like a human.
- Prepare data, don't just prompt, for AI success.
- Structure your data for AI's 'native language.'
- Organize information systematically for optimal AI focus.
Full Transcript
Consider this. You're uploading multiple
documents into ChatGPT, crafting
detailed prompts, adding tons of project
files, all to give it as much context as
possible, and yet somehow it keeps
missing the most important details.
If you're experiencing this as well, you
might assume that you need to give the
AI better instructions, or maybe you
think you have to give it even more
context. But what I've learned through
weeks of testing is that this is a
systemic, widespread issue. In the next
few minutes, we'll explore the research
that reveals the real performance limits
of AI. And finally, I'll walk you
through the approach that I've developed
to work with these limitations rather
than against them. You know those
impressive context windows that AI
companies love to promote? 200,000
tokens, 1 million tokens. A recent study
of 18 leading LLMs tested their
performance across different context
lengths and the results tell a story
that's very different from the marketing
materials. So when ChatGPT and Claude promise 200,000 tokens, the effective context window where you get reliable results is actually closer to between 50,000 and 100,000 tokens. In other words, in
every case, you're actually getting
somewhere between 25 and 50% of the
advertised capacity in terms of
dependable performance from major LLMs.
This is a systemic, widespread issue, and it already has a name: context rot. Now,
this isn't necessarily deceptive
marketing. These systems can technically
handle those larger contexts, but
whether they handle them well, that's a
different question entirely. Speaking of
these real performance numbers, if you
want to see exactly how all the major
LLMs actually stack up, not the
marketing promises, but the real limits
for your specific tasks, I've put
together a complete LLM context
performance guide. It shows you which AI
works best for contract analysis, for
meeting summaries, for research papers,
and eight other common tasks, plus the
actual effective token limits where each
model stops being reliable. Here's what
I find fascinating about this problem.
It reveals something fundamental about
how AI processes information. We tend to
think of AI as reading our entire
context the way a human might carefully
from beginning to end and somehow
memorizing everything that it reads all
along the way. But that's really not how
it works. See, the way that AI works is
it samples from your context, focusing
its attention on the parts that seem
relevant to your query. As the context
grows larger and larger, the sampling
becomes less precise. See, every LLM has
a budget that it has to work within. A
budget of time, electricity, server
capacity, tokens. Processing a large
amount of context really requires a
significant amount of all those
resources, which means that when the
system is overwhelmed, it takes
shortcuts. This understanding has
completely changed my approach to
working with AI. Instead of trying to
stuff in every last bit of context I
can, I've started thinking about how I can give it as little as possible while still allowing it to fulfill my goal.
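To make that budget concrete, here's a minimal sketch of a pre-flight token check in Python, assuming a conservative effective window of about 50,000 tokens (the low end of the range from the study above). It uses OpenAI's tiktoken tokenizer; other models tokenize differently, so treat the count as an estimate, not a guarantee.

```python
# A rough pre-send check: estimate tokens before pasting context into a chat.
# Assumes a ~50,000-token effective budget (the low end of the range above),
# not the advertised limit. Requires: pip install tiktoken
import tiktoken

EFFECTIVE_BUDGET = 50_000  # assumed reliable zone, not the marketed capacity

def fits_effective_window(text: str, budget: int = EFFECTIVE_BUDGET) -> bool:
    """Return True if the text likely stays inside the reliable zone."""
    enc = tiktoken.get_encoding("cl100k_base")  # OpenAI-style tokenizer
    n_tokens = len(enc.encode(text))
    print(f"{n_tokens:,} tokens against a budget of {budget:,}")
    return n_tokens <= budget

# Usage: check a document before loading it as context.
# if not fits_effective_window(open("company_overview.md").read()):
#     print("Split this document into smaller chunks first.")
```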
Most people, when they first discover context rot, try to solve it by writing better prompts or switching to different AI models altogether. But that's addressing the symptom, not the cause.
The real solution isn't changing how you
talk to AI. It's about changing how you
prepare information before giving that
information to AI. Here's what I've
learned works consistently. First,
instead of copying and pasting plain
text, use more structured formats, for
example, markdown or JSON. These are
closer to an LLM's native language, so
to speak, which means they can process
and absorb that text much more
efficiently. And you know what? You can actually have an LLM convert that text into these formats just by asking. Ask it something like, "Convert this into markdown format so it's easy for an LLM to parse." On the other hand, you should really avoid formats that are difficult for LLMs to understand, for example PDFs, which take a tremendous amount of processing power to absorb effectively.
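As a sketch of that conversion step, the helpers below just wrap the kind of prompt described above. `call_llm` is a stand-in for whichever chat client you actually use, and the prompt wording is an illustration, not a quote from the video.

```python
# A minimal sketch of the "ask the model to restructure it" step.
# `call_llm` is a placeholder for your actual chat client; swap in the
# real API call you use. The prompt text is illustrative.
import json

MARKDOWN_PROMPT = (
    "Convert the following text into clean markdown with clear headings "
    "and bullet points, so it's easy for an LLM to parse:\n\n{text}"
)

def to_markdown(raw_text: str, call_llm) -> str:
    """Have the model itself restructure plain text into markdown."""
    return call_llm(MARKDOWN_PROMPT.format(text=raw_text))

def to_json(raw_text: str, call_llm) -> dict:
    """Same idea for JSON, with a validation pass before reuse."""
    reply = call_llm(
        "Convert the following text into a single JSON object with "
        "descriptive keys. Return only the JSON:\n\n" + raw_text
    )
    return json.loads(reply)  # fails loudly if the model returned prose
```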
The second technique that I've found works well is to break large
documents down into focused, purposeful,
clearly labeled chunks. So instead of a
single master prompt in one gigantic
document with all the information on
your company, make a separate document
for each major part of the business.
Then when you're thinking of asking the
LLM for something, only load into the project files the one document or documents that you need to answer that
specific question. For example, if you
have a question about taxes, you only
need to load up the finance doc, not the
product doc. If you have a question about compensation on your team, there's really no need for the strategy doc.
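Here's a minimal sketch of that selective loading in Python. The file names and keyword lists are hypothetical; the point is just that a question about taxes pulls in the finance doc and nothing else.

```python
# A minimal sketch of "load only what the question needs." The file names
# and keyword map are hypothetical; in practice you'd tune these to your
# own document chunks.
from pathlib import Path

DOC_INDEX = {
    "finance.md":  {"tax", "taxes", "budget", "invoice", "revenue"},
    "people.md":   {"compensation", "hiring", "salary", "benefits"},
    "product.md":  {"feature", "roadmap", "release", "bug"},
    "strategy.md": {"vision", "goal", "positioning"},
}

def load_relevant_docs(question: str, base: Path = Path("docs")) -> str:
    """Return only the chunks whose keywords appear in the question."""
    words = set(question.lower().split())
    picked = [name for name, keys in DOC_INDEX.items() if keys & words]
    return "\n\n".join((base / name).read_text() for name in picked)

# "If you have a question about taxes, you only need the finance doc":
# load_relevant_docs("How should we handle sales taxes this quarter?")
# returns the contents of finance.md and nothing else.
```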
The third technique that I've found is to
organize these various documents that
you're going to be creating and
maintaining in an overarching system.
For example, my PARA method, which
gives you a simple, clear, consistent
hierarchy across both your work and your
life. I have a book, The PARA Method,
which explains how that works. And it's
sold over 30,000 copies worldwide. You
can find it wherever books are sold. So
here's an example of what this can look
like. In our shared Google Drive as a
company, we actually have the entire
business broken down into different
parts. So there's strategic planning,
things like goal setting, metrics, the
long-term strategy and vision. Then
there's enabling processes like HR, like
finance, the ongoing processes that make
the business run. And then there's core
processes, which are the core ways that
we deliver value to customers. If I
click into core processes, so one layer
down the hierarchy, we've divided up the
way that we deliver value to customers
into six stages. We go from creating
content in the market to generating
leads. We go from leads to making sales,
from sales to delivering our services
and training, from delivery to success,
actually making our customers
successful, from success to lead, so turning those successes through word of mouth back into more leads, and then from success to market, which is repurposing those case studies and testimonials back into the market in the form of content. So, drilling down one level further, if I click on, for example, delivery to success, we finally get to
the level of documents that could be
loaded into an LLM as project files. So
what you see here is several levels.
So level one is sort of providing the
big picture view. What is delivery to
success? Level two is what we call a
blueprint. It's getting a little more
into the weeds of what are the steps and
how are each of those steps measured and
delivered and verified. Then there's a
level three which is a step-by-step
guide for how to accomplish that task or
that process. And then there's even a
level four which is tools which is any
templates or other tools that we've
created. So can you see that by breaking down our entire business into SOPs, standard operating procedures, if I have a question about how to perform a certain process in our company, or how to improve it, or how to measure it, instead of dumping a gigantic, tens-of-thousands-of-words document into Claude and saying, "Try to find the specific thing I'm looking for," I can drill down as specifically as I need to and give it just the one document that has all the level of detail that it requires. And finally,
if we zoom out even further to the top
level of our shared Forte Labs Google
Drive, you can see that the entire
business, every document that we ever
need can be organized in an overarching
system, which is my PARA method, which stands for projects, areas, resources, and archives. In this case, it's Forte Labs projects, Forte Labs areas, Forte Labs resources, and Forte Labs archives. What's so powerful about
this is I can click in and see every
single one of our currently active
projects. In fact, anyone in the entire
business or even external collaborators
can see every single active project in
the business. Or I can go in and see the
different ongoing areas such as finance, insurance, marketing, operations, processes, product, etc. Or I can go
into resources, which is all the other
resources that we use as a company, from interview transcripts to email templates to membership support documents to testimonials to YouTube
documentation, etc. And then of course
finally the archives is everything else.
All the past projects that we've started
and completed or canceled or postponed.
All the old areas and processes that are no longer active and that we no longer use. And even just other random stuff that we don't want cluttering our workspace.
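To make the hierarchy concrete, here's a minimal sketch that scaffolds a PARA-style folder tree. The four top-level buckets come from the method itself; the example subfolders loosely mirror the walkthrough above, and the specific names and the drill-down path in the closing comment are illustrative, not the actual Forte Labs layout.

```python
# A minimal sketch of the PARA skeleton as a local folder tree. The four
# buckets are the method's own; subfolder names here are examples only.
from pathlib import Path

PARA_TREE = {
    "Projects":  ["website-redesign"],                     # active, time-bound work
    "Areas":     ["finance", "marketing", "operations"],   # ongoing responsibilities
    "Resources": ["email-templates", "testimonials"],      # reference material
    "Archives":  ["completed-projects"],                   # everything inactive
}

def scaffold_para(root: Path) -> None:
    """Create the PARA skeleton so every document has one obvious home."""
    for bucket, subfolders in PARA_TREE.items():
        for sub in subfolders:
            (root / bucket / sub).mkdir(parents=True, exist_ok=True)

scaffold_para(Path("Forte Labs"))
# A drill-down like the one in the video might then land at, e.g.:
# Forte Labs/Areas/operations/delivery-to-success/level-3-step-by-step.md
# That single focused file is what you hand the LLM as a project file.
```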
If you'd like to know more about PARA, I
actually have a complete series on the
method which you can find right here in
this playlist. The key insight here is
that AI works best when it can focus its
attention, which is exactly the same as
for humans. However, in order to focus,
the AI needs something that's completely
different from what humans need. It
needs highly structured data formats.
Instead of fighting against this limitation of AI, we can design our
approach around it. But just knowing
about the problem is only the beginning.
The true solution to context rot is a
whole emerging discipline called context
engineering. I've given you a few
initial tips in this video to get you
started, but if you want the full
picture, check out the next video that
walks you through the five levels of
context that transform AI into a
reliable thinking partner. And don't
forget to like and subscribe as I keep
exploring this topic.