No way this actually works
By The PrimeTime
Summary
Topics Covered
- Caveman Method Achieves 87% Token Savings
- AI Companies Profit From Verbose Output
- Brief Responses Improve Accuracy by 26 Percentage Points
Full Transcript
Now, I know a lot of you have recently been hitting some limits when it comes to using Claude Code. The conventional wisdom, of course, is that you're holding it wrong, but actually, it turns out there's a better way to save on tokens. The solution, honestly, I didn't believe it, but it actually works. It actually works quite well. And here's the thing: you will save actual real money using this method. And no, I'm not exaggerating. I'm talking about Caveman. Now, you may not know what Caveman is. And hey, if you don't know what it is, there's a couple of pop references you might be familiar with. First off, GrugBrain Dev.
If you haven't heard of GrugBrain Dev, I highly recommend the essays. They go about as counterintuitive counter... I think I just made up a word. They're like countercultural, but I also used the word intuitive. I kind of just made a baby with them. Countercultural to what is going on in today's space-age AI: the "37,000 lines of code, just only let AI review AI" kind of nature. This is for the simpler man. Okay.
So me think why waste time say lot word when few word do trick.
That's actually what Caveman is. Instead of allowing Claude Code or Codex or whatever to go off and say a bunch of expressive statements ("Hey man, you're absolutely right." You could spend money getting glazed like wild), it goes straight to the heart of the issue, which is to actually just stop saying so many things. And with the cost of output tokens, this can actually save some serious money.
Okay, so what does the Caveman skill actually look like? Well, I can't show it to you, because apparently GitHub can't show 200 lines of markdown. And when I try to go look at it raw right now, they broke raw. It just downloads it. Like, it doesn't even take me to the web page. Anyways, I downloaded it, and this is all it says to do right here. Watch this.
Drop any articles. So just don't use a, an, and the. Drop all fillers: just, really, basically, actually, simply. Drop all the pleasantries: sure, certainly, of course, happy to. Short synonyms: big, not extensive; fix, not implement a solution for. All of these are actually real token-dropping phrases that you can save actual money with, which is kind of insane. No hedging: skip the "it might be worth considering." Fragments fine. No need full sentence. Technical terms remain the same, so polymorphism is still polymorphism. We don't shorten up those terms. Code blocks unchanged. Caveman speak around code, not in code. Error messages quoted exact. Caveman only for explanation.
You can get the same results. The only difference is Claude just doesn't sit there and glaze you and say a bunch of stupid words at you, with "Well, actually, the fix was quite simple, and your insight into the problem space was actually the right direction. All I had to do..." and it's just like, no, no, no, shut up. Stop saying that.
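The rules read out above are instructions given to the model, not code. Purely to illustrate what they do to a piece of text, here is a minimal Python sketch that applies a few of them as a post-processing filter. The word lists, the function names (`cavemanize`, `_compress_line`), and the regex approach are all assumptions made for this example; none of it comes from the Caveman skill itself, and a naive filter like this will mangle any sentence where a dropped word carries meaning.

```python
import re

# Illustrative word lists based on the rules read out in the video.
# They are assumptions for this sketch, not the contents of the skill file.
ARTICLES = {"a", "an", "the"}
FILLERS = {"just", "really", "basically", "actually", "simply"}
DROP_WORDS = ARTICLES | FILLERS
PLEASANTRY_OPENER = re.compile(r"^(sure|certainly|of course|happy to)\b", re.IGNORECASE)
SHORT_SYNONYMS = {"implement a solution for": "fix", "extensive": "big"}

# Matches a fenced code block. Written with a {3} repeat so this example
# never contains a literal fence of its own.
CODE_FENCE = re.compile(r"(`{3}.*?`{3})", re.DOTALL)


def _compress_line(line: str) -> str:
    """Apply the prose rules to one line that contains no code."""
    kept = []
    for sentence in re.split(r"(?<=[.!?])\s+", line):
        # Crude: a sentence that opens with a pleasantry is dropped whole.
        if PLEASANTRY_OPENER.match(sentence.strip()):
            continue
        for long_form, short_form in SHORT_SYNONYMS.items():
            pattern = rf"\b{re.escape(long_form)}\b"
            sentence = re.sub(pattern, short_form, sentence, flags=re.IGNORECASE)
        words = [w for w in sentence.split()
                 if w.lower().strip(".,!?;:") not in DROP_WORDS]
        if words:
            kept.append(" ".join(words))
    return " ".join(kept)


def cavemanize(text: str) -> str:
    """Compress prose; leave fenced code blocks untouched."""
    out = []
    for i, part in enumerate(CODE_FENCE.split(text)):
        if i % 2 == 1:
            out.append(part)  # odd indices are the captured code blocks
        else:
            out.append("\n".join(_compress_line(l) for l in part.split("\n")))
    return "".join(out)


if __name__ == "__main__":
    print(cavemanize("We implement a solution for the extensive bug."))
```

The split on `CODE_FENCE` is the part that mirrors "caveman speak around code, not in code": only the prose segments are ever passed to `_compress_line`.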
Here's a good example. "Sure, I'd be happy to help you with that. The issue you are experiencing is likely caused by..." No, don't do that. Yes: "Bug in auth middleware. Token expiry check. Use this, not that. Fix." So often you can actually drop a lot of tokens. Even just this alone, you can see right here, it goes from 69 tokens to 19 tokens. It even
allows you to do various levels of caveman. You can do light, where you're just trimming the fat. You can do kind of the full one. You can also do the ultra maximum one: all full rules, plus abbreviate common terms (DB, auth, config, request, res, fn, impl), strip conjunctions where possible, one-word answers when one word enough, arrow notation for causality. And this just actually works. Like, this is just the free hack. They even have a basic table breaking down the various usages. Explaining a React re-render bug: it goes from 1,180 to 159 tokens.
87% saved, which also just shows how fluffy the language is. Like, think about it: it's like bloviating. It's just saying a bunch of nonsense with these big extravagant words without actually saying anything at all. I don't want to be much of a conspiracy theorist, but, you know, I'm just saying: Claude, they do make their money by output tokens. So instead of just being like "auth is broken," it needs to go on just a rampage soliloquy to let you know every last possible thing that could possibly be said about a topic that could be said in three words. It's truly an impressive piece of technology. Like, honestly, the trade of affordable computers and rainforests for one of these little black magic, you know, sandboxes is pretty fantastic. I would make that trade any day of the week.
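The savings figures quoted above are simple ratios of output-token counts. A minimal Python sketch of that arithmetic follows. The function names are made up for this example, and the price of 15 dollars per million output tokens is a placeholder chosen for illustration, not a quoted rate from any provider.

```python
def percent_saved(tokens_before: int, tokens_after: int) -> float:
    """Share of output tokens removed, as a percentage of the original count."""
    if tokens_before <= 0:
        raise ValueError("tokens_before must be positive")
    return 100.0 * (tokens_before - tokens_after) / tokens_before


def dollars_saved(tokens_before: int, tokens_after: int,
                  usd_per_million_output_tokens: float) -> float:
    """Money saved on one response at a given (hypothetical) output price."""
    return (tokens_before - tokens_after) * usd_per_million_output_tokens / 1_000_000


# Token counts quoted in the video.
print(round(percent_saved(69, 19), 1))     # 72.5
print(round(percent_saved(1180, 159), 1))  # 86.5, which the table rounds to 87%

# Placeholder price, for illustration only.
HYPOTHETICAL_USD_PER_MILLION = 15.0
print(dollars_saved(1180, 159, HYPOTHETICAL_USD_PER_MILLION))
```

Per response the dollar amount is tiny; the saving only becomes noticeable when multiplied over the many responses of a long agent session.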
Also, if you don't know anything about me, I typically don't cite studies, because largely I think studies have been gamed. The facts you're getting hit with, I'm not too sure you can really trust those, 'cause, you know, there's lies, there's damned lies, and then there's statistics. But hey, since this one's going in my favor: in March 2026, so just a couple days ago, "Brevity Constraints Reverse Performance Hierarchies in Language Models." All those words just simply mean that making the response brief improves accuracy by 26 percentage points. Now, what is 26% more accurate? Some would say that sounds like a lot. What does 26% even mean? It doesn't really matter. You know why? 'Cause it's more accurate. Okay. Hey, green mean good. Okay. We got that graph that's going up and to the right, and that's all you need in life. Okay. When things get better, it's good. Things bad, not good.
So, go ahead, give it a try. Go check out this Julius Brussee's Caveman, which also... Can we just take a quick step to the side? We've got to chat about this for a second.
Why, oh why, does every single agent program you can possibly download have its own skill directory that you put skills into? This has to be the greatest XKCD outcome that could ever be. Any project I seem to walk into has like 20 separate folders for the same text, and they're all committed. [laughter] It's just like, why did we get here? [gasps] I THOUGHT WE GOT PhD-level intelligence. Instead, we just have absolutely junior-level execution. It hurts me. It hurts me deep down.
Anyways, if you're struggling out there using Claude Code and the "you're holding it wrong" message, in fact, did not help you, why don't you give this a try right here? Okay, go check it out. Don't say I never told you anything. Okay, 'cause this is good. This is good information right here. Okay,
this good thing. You download, use now. Name Primeagen.
Hey, is that HTTP? Get that out of here. That's not how we order coffee. We order coffee via SSH: terminal.shop. Yeah, you want a real experience. You want real coffee. You want awesome subscriptions so you never have to remember again. Oh, you want exclusive blends with exclusive coffee and exclusive content? Then check out CRON. You don't know what SSH is? Well, maybe the coffee is not for you.
[singing] [music] Live the dream.