How OpenClaw Works: The Architecture Behind the 'Magic'

By Damian Galarza

Summary

## Key takeaways

- **OpenClaw Not Sentient, Just Reactive**: OpenClaw isn't sentient. It doesn't think. It doesn't reason. It's just inputs, queues, and a loop. [00:00], [00:04]
- **Gateway Routes Inputs to Agents**: OpenClaw is an agent runtime with a gateway in front of it. The gateway routes inputs to agents. The agents do the work. The gateway manages the traffic. [01:23], [01:39]
- **Five Inputs Create Autonomy Illusion**: Everything OpenClaw does starts with an input: messages from humans, heartbeats from a timer, cron jobs on a schedule, hooks from internal state changes, and webhooks from external systems. Agents can also message other agents. [02:21], [02:46]
- **Heartbeats Enable Proactive Checks**: The heartbeat is just a timer that fires every 30 minutes by default and sends the agent a prompt like 'Check my inbox for anything urgent. Review my calendar. Look for overdue tasks.' If nothing needs attention, the agent responds with a special token that the system suppresses. [03:16], [04:03]
- **Cron Jobs Schedule Specific Tasks**: Cron jobs let you specify exactly when they fire and what instructions to send, like 'at 9:00 a.m. every day, check my email and flag anything urgent' or 'every Monday at 3 p.m., review my calendar for conflicts'. The agent that texted the wife used cron jobs like 'Good morning' at 8 a.m. [04:26], [05:02]
- **26% of Skills Have Vulnerabilities**: Cisco's security team analyzed the OpenClaw ecosystem and found that 26% of the 31,000 available skills contain at least one vulnerability. They called it a security nightmare. [08:23], [08:42]

Topics Covered

Heartbeats Turn Time into Proactive Agent Input
Cron Jobs Schedule Precise Agent Behaviors
OpenClaw Feels Alive via Reactive Inputs
Agent Security Nightmare from Deep Access

Full Transcript

OpenClaw isn't sentient. It doesn't think. It doesn't reason. It's just inputs, queues, and a loop. But you've seen the videos. Agents calling their owners at 3:00 a.m. Agents texting people's wives and having full conversations. Agents that browse Twitter overnight and improve themselves. 100,000 GitHub stars in 3 days. Everyone's losing their minds. So, why does it feel so alive? The answer is simpler than you think. And once you understand it, you can build your own.

Let me show you what's got everyone worked up. This guy's OpenClaw agent got itself a Twilio phone number overnight, connected to a voice API, and called him at 3:00 a.m. without being asked. This one set up his agent to text his wife, "Good morning." 24 hours later, they were having full conversations, and he wasn't even involved. OpenClaw hit 100,000 GitHub stars in 3 days. That's one of the fastest growing repositories in GitHub history. Wired covered it. Forbes covered it. In the reactions, people are genuinely asking if this thing's sentient. If we've crossed some kind of threshold, if this is the beginning of something we can't control.

Here's the thing. I get the excitement. And when I first saw these demos, I had the same reaction. But when I started asking how it actually works, the answer wasn't magic. It's elegant engineering.

First, let's get the basics out of the way. OpenClaw is an open source AI assistant created by Peter Steinberger, the founder of PSPDFKit. The technical description is simple. OpenClaw is an agent runtime with a gateway in front of it. That's it. A gateway that routes inputs to agents. The agents do the work. The gateway manages the traffic.

The gateway is the key to understanding everything. It's a long-running process that sits on your machine, constantly accepting connections. It connects to your messaging apps, WhatsApp, Telegram, Discord, iMessage, Slack, and it routes messages to AI agents that can actually do things on your computer. But here's what most people miss. The gateway doesn't think. It doesn't reason. It doesn't decide anything interesting. All it does is accept inputs and route them to the right place.

This is the part that matters. OpenClaw treats many different things as input, not just your chat messages. Once you understand what counts as an input, the whole alive feeling starts to make more sense. There are five types of input. When you combine them, you get a system that looks autonomous. But it's not. It's just reactive. Let me break them down.
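To make the gateway idea concrete, here is a minimal Python sketch. The names `Gateway`, `register`, and `route` are illustrative assumptions, not OpenClaw's actual API. The only point is that the gateway holds no logic beyond looking up which agent should receive an input.

```python
from typing import Callable, Dict

# An "agent" here is just a function from input text to a reply.
# In a real runtime this would call an LLM with tools (assumption).
Agent = Callable[[str], str]


class Gateway:
    """Hypothetical gateway: accepts inputs and forwards them. Nothing else."""

    def __init__(self) -> None:
        self._routes: Dict[str, Agent] = {}

    def register(self, channel: str, agent: Agent) -> None:
        # Bind a channel name (e.g. "slack") to the agent that handles it.
        self._routes[channel] = agent

    def route(self, channel: str, text: str) -> str:
        # The gateway does not interpret the text; it only picks a destination.
        agent = self._routes.get(channel)
        if agent is None:
            raise KeyError(f"no agent registered for channel {channel!r}")
        return agent(text)


def echo_agent(text: str) -> str:
    # Stand-in for the real work an agent would do.
    return f"handled: {text}"


gateway = Gateway()
gateway.register("slack", echo_agent)
```

Calling `gateway.route("slack", "hi")` hands the text to `echo_agent` unchanged. All the interesting behavior lives in the agent, not the router.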

Everything OpenClaw does starts with an input. Messages from humans, heartbeats from a timer, cron jobs on a schedule, hooks from internal state changes, and webhooks from external systems. There's also one bonus. Agents can message other agents. Let's step through each one.
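One way to picture that list is as a single event envelope with a tag for where it came from. This is a sketch under my own assumptions: `InputKind`, `Event`, and `to_agent_turn` are hypothetical names, not types from the OpenClaw codebase.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict


class InputKind(Enum):
    MESSAGE = "message"      # a human sent a chat message
    HEARTBEAT = "heartbeat"  # the periodic timer fired
    CRON = "cron"            # a scheduled job came due
    HOOK = "hook"            # an internal state change occurred
    WEBHOOK = "webhook"      # an external system called in
    AGENT = "agent"          # another agent passed a message (the "bonus")


@dataclass
class Event:
    kind: InputKind
    prompt: str
    meta: Dict[str, Any] = field(default_factory=dict)


def to_agent_turn(event: Event) -> str:
    # Whatever the source, the agent ends up seeing a prompt to act on.
    return f"[{event.kind.value}] {event.prompt}"
```

The design choice worth noticing is that the agent side never needs a special case per source. A cron event and a chat message differ only in the tag.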

Messages are the obvious one. You send a text, whether it's WhatsApp, iMessage, or Slack. The gateway receives it and routes it to an agent, and then you get a response. This is what most people think of when they imagine AI assistants. You talk, it responds. Nothing revolutionary here. But here's a nice detail. Sessions are per channel. So, if you message on WhatsApp and then also ping it on Slack, those are going to be separate sessions with separate contexts. But within one conversation, if you fire off three requests while the agent is still busy, they queue up and process in order. No jumbled responses. It just finishes one thought before moving on to the next.
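The per-channel sessions and in-order processing can be sketched with one FIFO queue per session. The `SessionQueues` class and the `(channel, conversation)` key are assumptions made for illustration; how OpenClaw actually keys its sessions is not stated here.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

SessionKey = Tuple[str, str]  # (channel, conversation id), both hypothetical


class SessionQueues:
    """One FIFO queue per session, so contexts never mix across channels."""

    def __init__(self) -> None:
        self._queues: Dict[SessionKey, Deque[str]] = defaultdict(deque)

    def enqueue(self, channel: str, conversation: str, text: str) -> None:
        self._queues[(channel, conversation)].append(text)

    def drain(self, channel: str, conversation: str) -> List[str]:
        # Process pending requests strictly in arrival order.
        queue = self._queues[(channel, conversation)]
        handled = []
        while queue:
            handled.append(queue.popleft())
        return handled


queues = SessionQueues()
queues.enqueue("whatsapp", "me", "first")
queues.enqueue("slack", "me", "unrelated")
queues.enqueue("whatsapp", "me", "second")
queues.enqueue("whatsapp", "me", "third")
```

Draining the WhatsApp session yields the three requests in the order they arrived, and the Slack message stays in its own queue.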

Now, here's where things get interesting. There's heartbeats. The heartbeat is just a timer. By default, it fires every 30 minutes. When it fires, the gateway schedules an agent turn just like it would a chat message. You configure what it does. You write the prompt. Think about what this means. Every 30 minutes, the timer fires and sends the agent a prompt. That prompt might say, "Check my inbox for anything urgent. Review my calendar. Look for overdue tasks." The agent doesn't decide on its own to check these things. It's responding to instructions just like any other message. It uses its tools, email access, calendar access, whatever you've connected, gathers the information, and reports back. If nothing needs attention, it responds with a special token, HEARTBEAT_OK, and the system suppresses it. You never see it. But if something is urgent, you get a ping. You can configure the interval, the prompt it uses, and even the hours it's active. But the core idea is simple. Time itself becomes an input.
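Here is a rough sketch of that heartbeat logic. The 30-minute default and the example prompt come from the description above. The token spelling `HEARTBEAT_OK`, the 08:00 to 22:00 active window, and the function names are assumptions for illustration only.

```python
from datetime import datetime, time, timedelta
from typing import Optional

HEARTBEAT_INTERVAL = timedelta(minutes=30)  # default interval per the transcript
SUPPRESS_TOKEN = "HEARTBEAT_OK"             # assumed spelling of the "all clear" token
HEARTBEAT_PROMPT = (
    "Check my inbox for anything urgent. Review my calendar. "
    "Look for overdue tasks."
)


def heartbeat_due(now: datetime, last_fired: datetime,
                  active_from: time = time(8, 0),
                  active_until: time = time(22, 0)) -> bool:
    # Fire only when the interval has elapsed AND we are inside active hours.
    if not (active_from <= now.time() < active_until):
        return False
    return now - last_fired >= HEARTBEAT_INTERVAL


def deliver(agent_reply: str) -> Optional[str]:
    # Suppress the "nothing to report" token; pass everything else to the user.
    if agent_reply.strip() == SUPPRESS_TOKEN:
        return None
    return agent_reply
```

When `heartbeat_due` returns true, a scheduler would enqueue `HEARTBEAT_PROMPT` like any other message. `deliver` is the reason a quiet heartbeat never reaches your phone.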

This is the secret sauce. This is why OpenClaw feels so proactive. The agent keeps doing things even when you're not talking to it. But it's not really thinking. It's just responding to these timer events that you've preconfigured.

Similarly, you configure crons. These give you more control than heartbeats. Instead of a regular interval, you can specify exactly when they fire and what instructions to send. One example: at 9:00 a.m. every day, check my email and flag anything urgent. Another: every Monday at 3 p.m., review my calendar for the week and remind me of conflicts. At midnight, browse my Twitter feed and save some interesting posts. Each cron is a scheduled event with its own prompt. When the time hits, the event fires, the prompt gets sent to the agent, and the agent executes.

Remember the guy whose agent started texting his wife? He set up cron jobs. "Good morning" at 8 a.m. "Good night" at 10 p.m. Random check-ins during the day. The agent wasn't deciding to text her. A cron event fired. The agent processed it. The action happened to be "send a message." Simple as that.
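The schedule examples above map naturally onto a small table of jobs. This sketch uses a simplified hour/minute/weekday match rather than real cron expression syntax, and `CronJob` and `due_prompts` are hypothetical names rather than anything from OpenClaw.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class CronJob:
    hour: int
    minute: int
    prompt: str
    weekday: Optional[int] = None  # 0 = Monday (datetime.weekday()); None = every day

    def matches(self, now: datetime) -> bool:
        if self.weekday is not None and now.weekday() != self.weekday:
            return False
        return (now.hour, now.minute) == (self.hour, self.minute)


JOBS = (
    CronJob(9, 0, "Check my email and flag anything urgent."),
    CronJob(15, 0, "Review my calendar for the week and remind me of conflicts.", weekday=0),
    CronJob(0, 0, "Browse my Twitter feed and save some interesting posts."),
)


def due_prompts(now: datetime, jobs: Sequence[CronJob] = JOBS) -> List[str]:
    # Meant to be called once per minute; returns the prompts to enqueue.
    return [job.prompt for job in jobs if job.matches(now)]
```

A scheduler would call `due_prompts(datetime.now())` every minute and push whatever comes back onto the agent's queue. The agent never knows or cares that a clock was involved.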

Hooks are for internal state changes. The system itself triggers these events. When the gateway fires up, it fires a hook. When an agent begins a task, there's another hook. When you issue a command like stop, there's a hook. It's very much event-driven development.
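If you haven't worked with this style before, a hook system is essentially a small in-process event bus. The sketch below is a generic version of that pattern. The event names `gateway:startup` and `command:stop` are made up for illustration and are not taken from OpenClaw's documentation.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

HookHandler = Callable[[dict], None]


class Hooks:
    """Hypothetical in-process event bus for lifecycle events."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[HookHandler]] = defaultdict(list)

    def on(self, event_name: str, handler: HookHandler) -> None:
        self._handlers[event_name].append(handler)

    def emit(self, event_name: str, payload: dict) -> int:
        # Run every handler registered for this event, in registration order.
        handlers = self._handlers.get(event_name, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


hooks = Hooks()
log: List[str] = []
hooks.on("gateway:startup", lambda payload: log.append("gateway started"))
hooks.on("command:stop", lambda payload: log.append(f"stopping {payload['agent']}"))
```

Emitting `gateway:startup` runs the first handler. Emitting an event nobody registered for does nothing, which is what keeps this pattern cheap to extend.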

This is how OpenClaw manages itself. It can save memory on reset, run setup instructions on startup, or modify context before an agent runs.

Finally, there's webhooks. They've been around for a long time. They allow external systems to talk to one another. When an email hits your inbox, a webhook might fire, notifying OpenClaw about it. A Slack reaction comes in, another webhook fires. A Jira ticket gets created, another webhook. OpenClaw can receive webhooks from basically anything.
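On the receiving side, a webhook is just an HTTP callback whose body gets turned into one more queued input. This sketch leaves out the HTTP server and shows only that conversion step. The `summary` field and the `handle_webhook` name are assumptions, since every real service defines its own payload schema.

```python
import json
from queue import Queue

# (kind, prompt) pairs waiting for an agent; the shape is an assumption.
inbox: Queue = Queue()


def handle_webhook(source: str, raw_body: bytes, queue: Queue = inbox) -> None:
    """Turn an external HTTP callback into one more queued input."""
    payload = json.loads(raw_body)
    # Field names below are hypothetical; each real service defines its own schema.
    summary = payload.get("summary", "(no summary)")
    queue.put(("webhook", f"New event from {source}: {summary}"))


handle_webhook("jira", b'{"summary": "Ticket assigned to you"}')
```

After that call, the queue holds a single prompt describing the Jira event, ready to be processed the same way a chat message would be.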

Slack, Discord, GitHub, they all have webhooks. So now your agent doesn't just respond to you, it responds to your entire digital life. Email comes in, agent processes it. Calendar event approaches, agent reminds you. Jira ticket assigned, agent can start researching.

There's also one more type of input: agents that can message other agents. OpenClaw supports multi-agent setups. You can have separate agents with isolated workspaces, and you can enable them to pass messages between each other. Each agent can have a different profile. For example, you can have one that's a research agent and another that's a writing agent. When agent A finishes its job, it can queue up work for agent B. It can look like collaboration, but again, it's just messages entering queues.

So, let's go back to our most dramatic example. The agent that called its owner at 3:00 a.m.

From the outside, this looks like autonomous behavior. The agent decided to get a phone number. It decided to call. It waited until 3:00 a.m. But here's what we know happened under the hood. At some point, some event fired. Maybe a cron, maybe a heartbeat. We don't know the exact configuration. The event entered the queue. The agent processed it. Based on whatever instructions it had and the available tools it had, it acquired a Twilio phone number and made the call. The owner didn't ask for this in the moment, but somewhere in the setup, the behavior was enabled. Time produced an event. The event kicked off the agent. The agent followed its instructions. Nothing was thinking overnight. Nothing was deciding. Time produced an event. The event kicked off an agent. The agent followed its instructions.

Put it all together and here's what you get. Time creates events through heartbeats and crons. Humans create events through messages. External systems create events through webhooks. Internal state changes create events through hooks. And agents create events for other agents.

All of them enter a queue. The queue gets processed. Agents execute. State persists. And that's the key. OpenClaw stores its memory as local markdown files: your preferences, your conversation history, context from previous sessions, so that when the agent wakes up on a heartbeat, it remembers what you talked about yesterday. It's not learning in real time. It's reading from files you could open in a text editor. And the loop just continues.

From the outside, that looks like sentience, a system that acts on its own, that makes decisions, that seems alive. But really, it's inputs, queues, and a loop.
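The file-backed memory idea is simple enough to sketch in a few lines. The one-file-per-day layout and the `remember` and `recall` helpers are my own assumptions for illustration. The source only says that memory lives in local markdown files.

```python
from datetime import date
from pathlib import Path


def remember(memory_dir: Path, note: str, day: date) -> Path:
    """Append a bullet to that day's markdown file (layout is an assumption)."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / f"{day.isoformat()}.md"
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")
    return path


def recall(memory_dir: Path) -> str:
    """Concatenate all memory files, oldest first, to prepend to the next prompt."""
    files = sorted(memory_dir.glob("*.md"))
    return "\n".join(p.read_text(encoding="utf-8") for p in files)
```

Because ISO dates sort lexicographically, `sorted` returns the files in chronological order. The "memory" an agent wakes up with is just whatever `recall` reads back from disk.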

Now, I'd be doing you a disservice if I didn't mention the other side of this. OpenClaw can do all of this because it has deep access to your system. It can run shell commands, read and write files, execute scripts, and control your browser. Cisco's security team analyzed the OpenClaw ecosystem and found that 26% of the 31,000 available skills contain at least one vulnerability. They called it, and I quote, "a security nightmare." The risks are real. Prompt injection through emails or documents. Malicious skills in the marketplace. Credential exposure. Command misinterpretation that deletes files you never meant to delete. OpenClaw's own documentation says there's no perfectly secure setup.

I'm not saying not to use it. I'm just saying you need to know what you're deploying. This is powerful precisely because it has access, and access cuts both ways. If you're going to run this, run it on a secondary machine using isolated accounts. Limit the skills you enable. Monitor the logs. If you want to try it out without giving it full access to your machine, Railway has a one-click deployment that runs in an isolated container. Link in the description.

So, what's the takeaway here? OpenClaw isn't magic. It's a well-designed system with four components. Time that produces events, events that trigger agents, state that persists across interactions, and a loop that keeps processing. You can build this architecture yourself. You don't need OpenClaw specifically. You need a way to schedule events, queue them, process them with an LLM, and maintain state. This pattern is going to show up everywhere. Every AI agent framework that feels alive is doing some version of this. Heartbeats, crons, webhooks, event loops.
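To show how little is needed, here is a toy version of the whole pattern: a schedule, a queue, a stubbed model call, and shared state. Everything here is an assumption for illustration. `fake_llm` stands in for a real model client, integer ticks stand in for clock time, and the researcher-to-writer handoff is a made-up example of one agent queuing work for another.

```python
import heapq
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

# A timed event: (fire_at_tick, agent_name, prompt). Ticks stand in for clock time.
Timed = Tuple[int, str, str]
State = Dict[str, List[str]]
# An agent handler takes (prompt, state) and returns (reply, follow_up_events).
Handler = Callable[[str, State], Tuple[str, List[Tuple[str, str]]]]


def fake_llm(prompt: str) -> str:
    # Placeholder for a real model call (assumption: any LLM client fits here).
    return f"done: {prompt}"


def researcher(prompt: str, state: State) -> Tuple[str, List[Tuple[str, str]]]:
    reply = fake_llm(prompt)
    state.setdefault("researcher", []).append(reply)
    # Agent-to-agent: hand the result to the writer as a new queued input.
    return reply, [("writer", f"write up '{reply}'")]


def writer(prompt: str, state: State) -> Tuple[str, List[Tuple[str, str]]]:
    reply = fake_llm(prompt)
    state.setdefault("writer", []).append(reply)
    return reply, []


def run(schedule: List[Timed], agents: Dict[str, Handler], ticks: int) -> State:
    heapq.heapify(schedule)
    queue: Deque[Tuple[str, str]] = deque()
    state: State = {}
    for tick in range(ticks):
        # 1. Time produces events.
        while schedule and schedule[0][0] <= tick:
            _, agent, prompt = heapq.heappop(schedule)
            queue.append((agent, prompt))
        # 2. Events trigger agents; 3. state persists across turns.
        while queue:
            agent, prompt = queue.popleft()
            _, follow_ups = agents[agent](prompt, state)
            queue.extend(follow_ups)
    return state


final_state = run([(2, "researcher", "summarize inbox")],
                  {"researcher": researcher, "writer": writer}, ticks=5)
```

Nothing in `run` decides anything. A scheduled event reaches the researcher, the researcher's output becomes the writer's input, and both replies land in `final_state`. Swapping `fake_llm` for a real model and the tick loop for a real clock is where the actual engineering work would go.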

Understanding this architecture means you can evaluate these tools intelligently. You can build your own, and you won't get caught up in the hype when the next one goes viral. If you want to go deeper on agent architectures, I've linked the OpenClaw docs, clairvo's original thread that inspired this breakdown, and the security research in the description. If you're building AI-powered applications, especially with Ruby on Rails, that's what this channel's all about. Subscribe, and I'll see you in the next one.
