The Ralph Wiggum Loop from 1st principles (by the creator of Ralph)

By Geoffrey Huntley

Summary

Topics Covered

  • Software Development Costs $10.42/Hour
  • Redesign Tools for AI Agents
  • Engineers Manage Autonomous Locomotives
  • Specs Emerge from AI Conversations
  • Ralph Loops Deterministically Malloc Arrays

Full Transcript

Hello everyone. Surprise for anyone who's tuning in. So, it's been a wild couple of days. Ralph is finally crossing the chasm, and people are realizing that the economics of software development have forever changed. If you run what I'm about to show you in a loop, what happens is you work out a unit economic cost for software development.

Note there is a difference between software development and software engineering. Software engineering is some of the stuff we'll be doing here later as I expand some of these teachings.

And essentially software development now costs US$10.42 an hour. It's less than you would pay a fast-food retail worker. It's now cheap. And not only is it cheap, you can do it autonomously.

But to get it working, you have to understand the bare-bones fundamentals from first principles. Don't start with a jackhammer like Ralph. Learn how to use a screwdriver first. Really important: just don't jump straight to the power tools.

And yeah, the calculation of $10.42 an hour is really simple. If you take the API costs for Sonnet 4.5 from Anthropic right now, run Ralph in a loop, and look at how much it costs in a 24-hour period, you're looking at $10.42 an hour.
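
As a rough check on that figure, here is the arithmetic as a small sketch. It reads the quoted rate as $10.42 an hour (the fast-food comparison implies the decimal point); the 24-hour total is derived from that rate rather than stated in the talk.

```python
HOURS_PER_DAY = 24


def daily_cost(hourly_usd: float, hours: int = HOURS_PER_DAY) -> float:
    """Total spend for running the loop around the clock."""
    return round(hourly_usd * hours, 2)


def hourly_cost(daily_usd: float, hours: int = HOURS_PER_DAY) -> float:
    """Unit cost per hour, given what a 24-hour run cost in API usage."""
    return round(daily_usd / hours, 2)


quoted_hourly = 10.42  # rate quoted in the talk
print(daily_cost(quoted_hourly))
```

The same two functions work in the other direction: take a real 24-hour API bill and divide it down to get your own unit cost.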

And in a single hour you're outputting multiple days' worth of work, if not weeks' worth. So in 24 hours you're actually at a place where you're mogging whole backlogs. So it gets really strange for what's going to be happening in our industry going forward, because it's going to create this rift, something I've been talking about. There's going to be this massive rift in software development between those who get it and those who don't. And I've been pleading with people: please invest in yourselves, get curious, pick up the screwdriver, master the screwdriver. Once you know the screwdriver, go for the jackhammer. Thanks, Dex, for the analogy.

So, let's kick this off.

So one of the most fundamental things is you need to have your specifications. It doesn't matter which tool you use, the coding harness or what have you. It's more about thinking about this from first principles.

Anthropic has released their plugin that does Ralph, and I'm thankful for that because it's created this inflection point. But the way that works is not it. You're going to get better outcomes if you do these concepts by hand. This is the screwdriver.

Now, if you give me a moment, it looks like my Roomba is kicking off, which is kind of hilarious considering we're about to program a Roomba. So, give me a sec.

There we go. All right. So, the physical Roomba is done. Let's do the virtual Roomba.

So, here is Loom. Loom is where I'm reimagining what software development will be and what we need to throw away. Everything that exists today has been designed for humans. If you think about the user space in Unix, we've got TTY and all that stuff. Everything has been compounding on a design for humans. Even agile, how we do software development, and all the rest of the engineering practices have been designed around humans. So if you go in a loop asking "was this designed for humans?", you can do like the five W's, and maybe you're able to cut it. And if you're able to cut it, you then need to think about how to mitigate cutting it. Was it adding value? Was it not adding value?

And this is what Loom is. Loom is an experiment in self-evolutionary software. The idea is: instead of having humans in the loop, what happens if humans are on the loop, or programming the loop? That's going to require a heavy engineering mindset.

And this is going to be very different to software development, because software development is now fundamentally automated with the most trivial of bash loops and techniques. And this is just the start. This is going to inspire other people to build their own things that are smarter than Ralph. I'm already seeing that taking place.

So, Loom is many things right now. Loom is essentially GitHub-style code hosting. It's its own source control that uses jj. It's GitHub Codespaces, so I can remotely provision infrastructure. It's got its own coding agent, very similar to Amp or Claude Code, except that it alloys multiple different LLM providers together, and it has the ability to spawn remote infrastructure and run there rather than locally on the computer. I'm very much building with an actor and pub/sub frame of mind, where I want to be creating chains of these agents, or loops on loops on loops of Ralph. And it extends way past software development. It crosses into the feature space or the product design space.

Last night I added feature flags, or feature experiments, by giving it some prompts that were essentially "hey, we want to clone LaunchDarkly", and we've got it now. So the next thing really is: I'm missing some analytics, and I've got this SaaS company that wants to charge me $900 to renew. It's such a simple product, and I'm going to need this functionality in Loom if I want to take Ralph to the product axis. And that's something I do want to do. I want autonomous agents, I call them weavers, that autonomously deploy software without any code review. It's already doing it now.

There have been so many failure domains. I can probably hear your objections. If the idea of running Ralph makes you want to ralph, listen to that. Listen to it, and then engineer away those concerns. That is now our job as software engineers: to keep the locomotive on the track. We are locomotive engineers now.

We're no longer carrying cargo by hand onto the ship. We have the boxes here. The shipping containers are here.

So Loom is many things. The way that Loom was built is really simple.

It starts with a conversation, and the conversation creates specs. I see a lot of people out there who handcraft their specs, and they say they don't have time to create specs. No. Would you believe I don't create my specs? I generate them. Then I review them and edit them by hand. And then I just let it rip with Ralph.

So let's do this now. I've got this SaaS analytics company. I'm going to need analytics because I want my weavers to be able to look at product metrics. I've got the ability for feature experiments to turn functionality on and off. And I'm engineering in a way where there is no code review. We have autonomous agents, or weavers, that will automatically, when introducing a feature, put a feature flag in, deploy it, look at the analytics, and decide whether it's actually fixed any errors, or maybe it can do some optimizations on the landing page.

This is where we're going, folks. This is 2026, strap yourself in, where we're getting towards essentially autonomous systems. Ralph is really just a mallocing orchestrator that avoids context rot and compaction.
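
The weaver workflow described here (ship behind a flag, look at the analytics, then decide) could be sketched roughly as follows. The function name, the error-rate metric, and the thresholds are all assumptions for illustration; the talk does not show how Loom makes this decision.

```python
from dataclasses import dataclass


@dataclass
class CohortStats:
    """Request and error counts for users with the flag on or off."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def decide_rollout(flag_on: CohortStats, flag_off: CohortStats,
                   min_requests: int = 100, tolerance: float = 0.001) -> str:
    """Return 'wait', 'roll_out', or 'roll_back' for a flagged feature."""
    if flag_on.requests < min_requests:
        return "wait"  # not enough traffic to judge yet
    if flag_on.error_rate <= flag_off.error_rate + tolerance:
        return "roll_out"  # no worse than the control cohort
    return "roll_back"
```

The point of the flag is that a bad outcome costs one toggle rather than a revert and redeploy, which is what makes skipping code review tolerable in this design.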

Compaction is the devil. Basically, consider the entire system, including the operating system, as the complete unit. If you have some external functionality, like some external vendor or API or whatever, that is also part of the system. It's not just your application. Now, a lot of those external systems that we use in software engineering today have been designed for humans. You might see me on a loop about this.

So, what would they look like if they were designed for robots, and how can we change that design so they're designed for robots? If we control the entire stack, then we can start doing optimizations like serialization formats. JSON is not a great format when it comes to serialization and tokenization.
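
To make the serialization point concrete, here is a minimal sketch comparing the same records as JSON and as a header-plus-rows layout. Character count is used only as a crude stand-in for token count (real tokenizers differ), and the sample events are made up.

```python
import csv
import io
import json

# Hypothetical analytics events, invented for this comparison.
EVENTS = [
    {"event": "page_view", "user": "u1", "duration_ms": 120},
    {"event": "page_view", "user": "u2", "duration_ms": 95},
    {"event": "signup", "user": "u2", "duration_ms": 310},
]


def as_json(rows):
    """JSON repeats every key name in every record."""
    return json.dumps(rows)


def as_rows(rows):
    """State the key names once in a header, then emit values only."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


print(len(as_json(EVENTS)), len(as_rows(EVENTS)))
```

The gap widens with every additional record, because the per-record overhead of key names, quotes, and braces is paid again each time in JSON and only once in the row layout.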

If we control the entire stack, then we can improve how tokenization works to drive these reactive agents, and all of a sudden you can operate cheaper than anyone else because you're optimizing.

We need to be thinking about software engineering again. We got a brand new computer. What is something from something? Like, for example, what is garbage collection now? What is malloc? What is Erlang, with OTP principles and message passing? Why do we have user space? Why do we have JSON? Why do we have TTY? And you start cutting and start optimizing down to the bare minimum that the machine needs.

So let's go, folks. Today we're going to show you how you create specs.

Going to kick off Claude Code. Claude Code's going to do its thing, yada yada yada.

So the first thing you want to do, and one of the first principles of Ralph, is essentially deterministically mallocing the array. Context windows are arrays. The less that you use of that array, the less the window needs to slide, and the better outcomes you get. That's very different to the Anthropic plugin, which basically keeps pounding the model in a loop until it hits compaction. Compaction is a lossy function, and it can result in the loss of the pin.

What I call the pin is this. I've been incrementally building up Loom through conversations just like this, and this is my specifications. Every time I add a new feature or adjust one, I continually evolve and update my specifications. And cool, this is now my pin. It's got my frame of reference of what this is all about. And I haven't injected it all. But what I've done, if I look at this file, is it's a whole bunch of lookup tables that link to particular things and give hints to the search tool. For user authentication, say, what are some other words used for user authentication? That improves the hit rate of the search tool. The more it's able to find and look up that context, the less it's going to invent. I don't want it to invent anything to do with my current functionality, but I do want it to use this as a pin, or frame of reference, to my current functionality.
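
As a rough illustration of why extra descriptors raise the hit rate, here is a minimal sketch of a lookup table and a keyword search over it. The spec paths, the synonym lists, and the matching rule are all invented for this example; the real file and search tool are not shown in the transcript.

```python
# Hypothetical pin: spec paths and descriptor lists are invented for illustration.
PIN = {
    "specs/auth.md": ["user authentication", "login", "sign-in", "session", "identity"],
    "specs/feature-flags.md": ["feature flags", "experiments", "rollout", "toggle"],
}


def find_specs(query: str, pin: dict) -> list:
    """Return spec paths whose descriptors appear in the query (case-insensitive)."""
    q = query.lower()
    return [path for path, words in pin.items() if any(w in q for w in words)]


print(find_specs("How does login work?", PIN))
```

With only the phrase "user authentication" in the table, a query about "login" would find nothing and the model would be more likely to invent an answer; each extra descriptor is another way for the search to land on the right spec.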

Okay, let's go. So, how do you build specs? It's really simple. "Hey, I want to add product analytics like PostHog into Loom. It would be used by products built on Loom. Thus we are collecting information about non-authenticated users. Let's have a discussion, and you can interview me."

So we've got our pin, which you can use as a lookup source to learn more about the current functionality of the application.

And then I just go, point four: "I don't care about privacy. We collect data. Use the Loom secret crate for IP addresses." This is something I've got: a special wrapper for PII-type topics. That way sensitive PII, anything like IP addresses and all that stuff, will never end up in logs. This is the engineering-type topic, so I'm giving it some direction. Later I'll play with the privacy, etc. This is really low effort, just to teach people how simple it is.

Five: "Integration is via web API. Thus the clients will be Rust and TypeScript, and interact with the Loom API."

Next up, I guess we need SDKs for this new feature set, so other applications can run experiments on the Loom platform.
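
The special wrapper for PII mentioned above is not shown in the talk, but the general idea can be sketched in a few lines. This is an illustrative stand-in, not Loom's implementation: the class name `Secret` and the method `expose` are assumptions.

```python
class Secret:
    """Wrap a sensitive value so that printing or logging it shows a placeholder."""

    __slots__ = ("_value",)

    def __init__(self, value: str) -> None:
        self._value = value

    def expose(self) -> str:
        """Deliberate, easy-to-grep access to the raw value."""
        return self._value

    def __repr__(self) -> str:
        return "Secret([REDACTED])"

    # str(), f-strings, and %s formatting all fall back to the redacted form.
    __str__ = __repr__


ip = Secret("203.0.113.7")
print(f"request from {ip}")
```

Because redaction is the default and exposure is an explicit call, a generated logging statement that interpolates the value cannot leak it by accident, which matters when nobody is reviewing the code.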

And this is a dance, dancing in and out.

I don't want any mobile, so I'm going to say "no mobile". Event model: "choose best practices, look at how PostHog does it." Three, four: "experiments integrate with our existing flag specifications and system."

Fine. For data storage: "just store in our current SQLite." Right now I'm using SQLite because it's just really fast for the iteration loop versus Postgres. Plus you get these machines that are massive now, and they're really cheap. You can flog a Postgres database so hard these days. And whilst this is not scalable, that's not a concern I have right now. In the end I'm not locking in my data model, because I have a vision towards actors and OTP principles. Maybe I can merge multiple of these big machines together, and we start getting towards virtual actors and Microsoft Orleans things. If you know, you know.

Okay, identity model: "I don't know, how does PostHog do it? Let's discuss."

So think about this like you've got some clay on a pottery wheel, and you're slowly making adjustments. You're molding the context window, you're testing what it knows, and you're applying the engineering knowledge that you have. We're shaping the specifications.

And it's a dance, folks. This is how you build your specifications. You've got all the time in the world. And what's really cool is I could let this rip in a branch once it's done, and then check its outcomes. It's pretty much free. I'd let it do a couple of Ralph loops while I'm there, and I'd be driving it by hand. I'll be mallocing and doing the principles by hand. If I'm okay with where it's going, then I will just let it rip. Otherwise, if something's wrong, I'll go back to the specifications. I will adjust some of the prompt engineering. Maybe I take some different approaches for the back pressure. There's a lot of things you can do with back pressure. Our job is now engineering back pressure to the generative function, to keep the generative function on the rails, the locomotive.
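
One way to picture back pressure is as a gate of checks between the generative step and the commit. The sketch below is an assumption about shape only: the function names are hypothetical, and the checks could be a test suite, a type checker, a linter, or anything else that can say no.

```python
from typing import Callable, List, Tuple

# A check returns (passed, output); output becomes feedback on failure.
Check = Callable[[], Tuple[bool, str]]


def apply_back_pressure(checks: List[Check]) -> Tuple[bool, str]:
    """Run checks in order; stop at the first failure and return its output."""
    for check in checks:
        passed, output = check()
        if not passed:
            return False, output
    return True, ""


def step(generate: Callable[[str], None], checks: List[Check],
         commit: Callable[[], None], feedback: str = "") -> str:
    """One iteration: generate, push back with checks, commit only on green.

    Returns the feedback to hand to the next iteration ("" when committed).
    """
    generate(feedback)
    ok, output = apply_back_pressure(checks)
    if ok:
        commit()
        return ""
    return output
```

Adding a stricter check to the list is how you tighten the rails without touching the prompt.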

"Yeah, we want person profiles, including anonymous, and the SDK can identify someone. I don't know, what does PostHog do?" Three, multi-tenancy: "analytics are tied to a Loom organization." I've already got ABAC. I've already got multi-tenancy built in. So this is me applying the engineering: "you should use the search tool to learn how I'm already doing multi-tenancy."

This is me steering the specification stage, or adjusting the clay on the pottery wheel: you've got to use the search tool. Don't reinvent things here.

And the interesting thing is, because the context window is just an array, there's no reason why you can't productize just this process. Even Anthropic's doing it, with the ask-user-question tool or the planning tool, etc. But that's not good enough, because there's no memory server-side for inferencing. It's just what's in the array. There's no reason why you can't preserve this conversation and rehydrate it later. So you can either create another terminal once you write it out to disk, and we're going to be doing that very shortly, and then let it rip attended, and then let it rip unattended, and then come back here and make some adjustments before you let it go full hog. Or you can make some sort of tool that resumes the state of the conversation, so you can resume the planning. There's no reason why the source of truth needs to be Markdown, folks. It can be just this array.
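
Since the conversation is just an array of messages, preserving and rehydrating it can be as small as the sketch below. The message shape (a list of role/content dictionaries) and the function names are assumptions; JSON on disk is used here only for brevity.

```python
import json
from pathlib import Path


def save_conversation(messages: list, path: str) -> None:
    """Write the message array to disk so the planning session can be resumed."""
    Path(path).write_text(json.dumps(messages, indent=2), encoding="utf-8")


def load_conversation(path: str) -> list:
    """Rehydrate the array exactly as it was saved."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

A resumed planning session then starts from `load_conversation(...)` instead of an empty array, with nothing lost to summarization in between.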

"Just do what PostHog does." Okay. "Then update specs/README.md and create an implementation plan at posthog.md."

And this is the key if you want to improve its ability to track the plan. I've seen people go JSON and so on. All you need to do is think about how the generative function feeds into the search tool: strong linkage. You've just got to do linkage, "as bullet points, and cite the specification or source code that needs to be adjusted."

And then you can start really playing with this, because the way that the read tool works is it works in hunks, folks. So you can actually tell it to give specifics of which hunks in each file need to be done. I'm not going to do it here. I just need to let this rip. I've got about seven minutes to get this going and then I'm going to be AFK.
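
A plan item with strong linkage might look like the sketch below: one bullet per task, each citing the file and line range (the hunk) it concerns. The checkbox syntax, the `path:start-end` citation format, and the example file names are assumptions, not the format used on screen.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Citation:
    path: str
    start: int  # first line of the hunk (1-indexed)
    end: int    # last line of the hunk (inclusive)

    def render(self) -> str:
        return f"{self.path}:{self.start}-{self.end}"


@dataclass
class PlanItem:
    task: str
    cites: List[Citation] = field(default_factory=list)
    done: bool = False

    def render(self) -> str:
        """Render as a Markdown bullet that names where to look."""
        box = "x" if self.done else " "
        refs = ", ".join(c.render() for c in self.cites)
        suffix = f" (see {refs})" if refs else ""
        return f"- [{box}] {self.task}{suffix}"


def read_hunk(text: str, start: int, end: int) -> str:
    """Return only lines start..end of a file's text, as a read tool might."""
    return "\n".join(text.splitlines()[start - 1:end])


item = PlanItem("Add events table", [Citation("specs/analytics.md", 10, 24)])
print(item.render())
```

Citing a line range means the next loop can read just that hunk instead of the whole file, which keeps the array small.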

Okay, let's call out something that's happening here. It's creating the new specification, but not only that, it's updating my lookup table, the specs README. It's just a lookup table. And it has many different generative words to explain what each spec does. By having more descriptors of what the specification is, that lookup table will get more hits for the search tool.

You need to think about these things from first principles, because you can drive it all by hand. The more you drive by hand, the better outcomes you get. If you go straight for the jackhammer, you're going to get terrible results. Absolutely terrible results.

All righty, we've now got a specification. Now, it would be really tempting to kick this off in this context window, but this context window, this array, already has one goal: to create me some specifications. And there's this thing called context rot. This is what Ralph is all about: avoiding compaction by deterministically mallocing the array. So, question time. It should be no surprise that what we need to do is create a new array.
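
The difference between allocating a new array and compacting an old one can be shown with plain lists. This is a toy model under stated assumptions: real compaction summarizes rather than truncates, and the strings here are placeholders, but the lossy effect on the pin is the same in kind.

```python
def fresh_context(pin: str, plan: str, task_prompt: str) -> list:
    """Allocate a new, small array for a single goal."""
    return [pin, plan, task_prompt]


def compact(context: list, budget: int) -> list:
    """Lossy stand-in for compaction: keep only the newest `budget` items."""
    return context[-budget:] if budget > 0 else []


# A long-running session keeps appending to one array...
session = ["PIN: specs/README.md"] + [f"tool output {i}" for i in range(50)]
# ...until it has to be squeezed, and the oldest entry (the pin) is what goes.
compacted = compact(session, budget=10)
print("PIN: specs/README.md" in compacted)  # prints False
```

A fresh array per loop always starts with the pin in position zero, which is why restarting is preferred here over letting one session run until it compacts.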

We're going to keep this one open. We're going to have a look to see what it does, right? Because I might want to refine my specifications. So I'm just going to create another session. Hop on over.

Cool. And we're going to go to code, go to Loom.

And the first thing we're going to do is create a prompt: prompt.md.

Okay, we need our pin: "Study specs/README.md. Study..." What is this called? It's called the specs implementation plan. "...and pick the most important thing to do."

So instead of a multi-step type of approach, what we're doing is allowing the LLM to decide what the most important thing is in our implementation plan. It's not high control; it's low control with high oversight. And by only doing one thing per loop, across lots of loops, each loop only has one goal, one objective, and you're using less of the context window, folks.

Very important.

Okay. "IMPORTANT:" Anthropic likes you to yell at the LLM. Let's give it some completion promises, or give it some objectives. "Use the loom-web i18n patterns for TypeScript, or the Loom i18n patterns for Rust."

Okay. "Author property-based tests or unit tests, whichever is best." I'm giving the judgment to the LLM to decide whether it should be a property-based test or a unit test. I'm not going to get into the dogma of where you should use each. This is your engineering knowledge.

"After making changes, run cargo test to run the tests. When tests pass, commit and push to deploy the changes."

Now, Loom automatically deploys. There is no CI. It has full access to sudo. It's been programmed in loops, so it can introspect the automatic deployment using sudo. And sudo is very safe in this case for bootstrapping, because I use NixOS. And if you know, you know.

Now, I'm kind of half-assing this, to be honest, because I'm pressed for time. You normally would do something a little bit better. So I'm just going to kick this off. Let's do the Ralph: while true, do, cat prompt.md, pipe to claude with --dangerously-skip-permissions, done. Never mind, typo: while true, do. Cool.

We're now doing the Ralph Wiggum. While true, we're deterministically mallocing the harness. And in the bigger picture of this, Loom is the thing that actually deterministically mallocs, or is the Ralph loop. It's running many Ralph loops, and they're chain-reactive as much as need be, Erlang style.

Let me just check. I think I made a mistake here. So this is the idea: you don't have to immediately go into full-blown Ralph. You can do it attended. I forgot: "update the implementation plan when the task is done." Yep. Cool. And this sets up our state checkpoints.

Let's kick off Ralph again. So, this is the practice: backwards, forwards. You don't just let it rip. You watch it. While you're watching it, I'll call out anything that I notice that's a little bit weird, and I'll cancel this. I'll go back and adjust my prompt. If this was a little bit more effort, I would inspect the code to make sure it's following conventions.

But for me, at least with Loom, it really doesn't matter if it outputs something bad, because I would just sit down with a highlighter and have a look at what's bad. And that's just another Ralph loop to pull, curate, and refactor the codebase to follow conventions. If it forgets internationalization, that's just a Ralph loop to force it to do that. If it doesn't use ABAC for security, that's just another Ralph loop. It's just different techniques of the Ralph loop to automate things.

So, this is now running. Let's just kick off some music. I'm going to go AFK and go out tonight. So, cheers, folks. Thanks for tuning in. I hope this answers a lot of questions. You've got to approach this from first principles, and you can do Ralph by hand, because it's all about deterministically mallocing the array.
