No way this actually works

By The PrimeTime

Summary

Topics Covered

  • Caveman Method Achieves 87% Token Savings
  • AI Companies Profit From Verbose Output
  • Brief Responses Improve Accuracy by 26 Percentage Points

Full Transcript

Now, I know a lot of you have recently been hitting some limits when it comes to using Claude Code. The conventional wisdom, of course, is that you're holding it wrong, but actually, it turns out there's a better way to save on tokens. The solution, honestly, I didn't believe it, but it actually works. It actually works quite well. And

here's the thing, you will save actual real money using this method. And no,

I'm not exaggerating. I'm talking about Caveman. Now, you may not know what Caveman is. And hey, if you don't know what it is, there's a couple of kind of pop references you might be familiar with. First off, GrugBrain Dev. If you haven't heard of GrugBrain Dev, I highly recommend the essays. They go

about as counterintuitive, counter... I think I just made up a word. They're, like, countercultural, but I also used the word intuitive. I kind of just made a baby with them. Countercultural to what is going on in today's space-age AI, 37,000-lines-of-code, just-only-let-AI-review-AI kind of nature. This is for the simpler man. Okay.

So me think why waste time say lot word when few word do trick.

That's actually what Caveman is. Instead of allowing Claude Code or Codex or whatever to go off and say a bunch of expressive statements ("Hey man, you're absolutely right!"; you could spend money getting glazed like wild), it goes straight to the heart of the issue, which is to actually just stop saying so many things. And with the cost of output tokens, this actually can save some serious money. Okay,

so what does the Caveman skill actually look like? Well, I can't show it to you, because apparently GitHub can't show 200 lines of Markdown. And when I try to go look at it raw right now, they broke raw. It just downloads it. Like, it doesn't even take me to the web page.

Anyways, I downloaded it and this is all it says to do right here. Watch this.

Drop any articles. So just don't use "a," "an," and "the." Drop all filler: "just," "really," "basically," "actually," "simply." Drop all the pleasantries: "Sure." "Certainly." "Of course." "Happy to." Short synonyms: "big," not "extensive"; "fix," not "implement a solution for." All of these are actually real token-dropping phrases that you can actually save actual money with, which is kind of insane. No hedging. Skip the "it might be worth considering." Fragments fine. No need full sentence. Technical terms remain the same, so polymorphism is still polymorphism. We don't shorten up those terms. Code blocks unchanged. Caveman speak around code, not in code. Error messages quoted exact. Caveman only for explanation.

You can get the same results. The only difference is Claude just doesn't sit there and glaze you and say a bunch of stupid words at you with, "Well, actually, the fix was quite simple, and your insight into the problem space was actually the right direction. All I had to do..." And it's just like, no, no, no, shut up. Stop saying that.
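A minimal Python sketch of the rules just read out, written as a naive post-processing filter. To be clear, this is not how the Caveman skill itself works (per the video, the skill is a Markdown file of instructions for the model). The word lists, the regex, and the names `compress_line` and `caveman` are all hypothetical, chosen only to illustrate "caveman speak around code, not in code."

```python
import re

# Hypothetical word lists, loosely based on the rules read out in the video.
ARTICLES = {"a", "an", "the"}
FILLER = {"just", "really", "basically", "actually", "simply"}
DROP = ARTICLES | FILLER

# Only strip a pleasantry when it opens the line and is followed by
# punctuation, so a phrase like "make sure" is left alone.
PLEASANTRY = re.compile(r"^\s*(?:sure|certainly|of course)[,.!]\s*", re.IGNORECASE)

FENCE = "`" * 3  # marker that opens and closes a fenced code block


def compress_line(line: str) -> str:
    """Drop an opening pleasantry, then articles and filler words."""
    line = PLEASANTRY.sub("", line)
    kept = [w for w in line.split() if w.strip(".,!?").lower() not in DROP]
    return " ".join(kept)


def caveman(text: str) -> str:
    """Compress prose lines; leave fenced code blocks untouched."""
    out = []
    in_code = False
    for line in text.split("\n"):
        if line.strip().startswith(FENCE):
            in_code = not in_code  # entering or leaving a code block
            out.append(line)
        elif in_code:
            out.append(line)       # code blocks unchanged
        else:
            out.append(compress_line(line))
    return "\n".join(out)


if __name__ == "__main__":
    sample = (
        "Sure, the bug is really just in the auth middleware.\n"
        + FENCE + "\nreturn the_value  # just a test\n" + FENCE
    )
    print(caveman(sample))
```

This sketch is deliberately crude: it does not protect inline code or quoted error messages, and it cannot pick shorter synonyms. Those parts of the rule list need the model's judgment, which is presumably why the real thing is a prompt rather than a script.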

Here's a good example. "Sure, I'd be happy to help you with that. The issue you are experiencing is likely caused by..." No, don't do that. Yes: "Bug in auth middleware. Token expiry check: use this, not that. Fix:" So often you can actually drop a lot of tokens. Even just this alone, you can see right here, it goes from 69 tokens to 19 tokens. It even

allows you to do various levels of Caveman. You can do light, where you're just trimming the fat. You can do, like, kind of the full one. You can also do the ultra maximum one: all full rules, plus abbreviate common terms (DB, auth, config, request, res, fn, impl), strip conjunctions where possible, one-word answers when one word enough, arrow notation for causality. And this just actually works. Like, this is just the free hack. They even have, like, a basic table breaking down the various usages. Explaining a React re-render bug: it goes from 1,180 to 159 tokens.
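The savings figures quoted so far can be checked with a few lines of arithmetic. A small sketch, assuming only the token counts read out in the video (69 to 19 for the short reply, 1,180 to 159 for the table entry); the helper name `percent_saved` is made up for illustration.

```python
def percent_saved(before: int, after: int) -> float:
    """Percentage of tokens saved going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be a positive token count")
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Token counts as quoted in the video; the labels are just descriptions.
    examples = [
        ("short reply example", 69, 19),
        ("table example", 1180, 159),
    ]
    for label, before, after in examples:
        saved = percent_saved(before, after)
        print(f"{label}: {before} -> {after} tokens, {saved:.1f}% saved")
```

The first works out to roughly 72% and the second to roughly 87% once rounded.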

87% saved, which also just shows how fluffy the language is. Like, think about it. It's, like, bloviating. It's just saying a bunch of nonsense with these big extravagant words without actually saying anything at all. I don't want to be much of a conspiracy theorist, but, you know, I'm just saying: Claude, they do make their money by output tokens. So instead of just being like "auth is broken," it needs to just go on a rampage soliloquy to let you know every last possible thing that could possibly be said about a topic that could be said in three words. It's truly an impressive piece of technology. Like, honestly, the trade of affordable computers and rainforests for one of these little black-magic, you know, sandboxes, it's pretty fantastic. I would

make that trade any day of the week.

Also, if you don't know anything about me, I typically don't cite studies, because largely I think studies have been gamed. The facts you're getting hit with, I'm not too sure you can really trust those, cuz, you know, there's lies, there's damned lies, and then there's statistics. But hey, since this one's going in my favor: in March 2026, so just a couple days ago, "Brevity Constraints Reverse Performance Hierarchies in Language Models." All of those words just simply mean that making the response brief improves accuracy by 26 percentage points. Now, what is 26% more accurate? Some would say that sounds like a lot. What does 26% even mean? It doesn't really matter. You know

why? Cuz it's more accurate. Okay. Hey,

green mean good. Okay. We got that graph that's going up and to the right. And

that's all you need in life. Okay. When

things get better, it's good. Things

bad, not good. So, go ahead, give it a try. Go check out this Julius Brussy's caveman, which also... Can we just take a quick step to the side? We got to chat about this for a second.

Why, oh why, does every single agent program you can possibly download have its own skill directory that you put skills into? This has to be the greatest XKCD outcome that could ever be. Like, any project I seem to walk into has, like, 20 separate folders for the same text, and they're all committed. [laughter] It's just like, why did we get here? [gasps] I thought we got PhD-level intelligence. Instead, we just have absolutely junior-level execution. It hurts me. It hurts me deep down. Anyways, so if you're struggling out there using Claude Code and the "you're holding it wrong" message, in fact, did not help you, why don't

you give this a try right here? Okay, go

check it out. Don't say I never told you anything. Okay, cuz this is good. This is good information right here. Okay, this good thing. You download, use now. Name ThePrimeagen. Hey, is that HTTP? Get

that out of here. That's not how we order coffee. We order coffee via SSH: terminal.shop. Yeah, you want a real experience? You want real coffee? You want awesome subscriptions, so you never have to remember again? Oh, you want exclusive blends with exclusive coffee and exclusive content? Then check out CRON. You don't know what SSH is?

Well, maybe the coffee is not for you.

[singing] [music] Live the dream.
