How OpenClaw Works: The Architecture Behind the 'Magic'
By Damian Galarza
Summary
## Key takeaways
- **OpenClaw Not Sentient, Just Reactive**: OpenClaw isn't sentient. It doesn't think. It doesn't reason. It's just inputs, queues, and a loop. [00:00], [00:04]
- **Gateway Routes Inputs to Agents**: OpenClaw is an agent runtime with a gateway in front of it. The gateway routes inputs to agents. The agents do the work. The gateway manages the traffic. [01:23], [01:39]
- **Five Inputs Create Autonomy Illusion**: Everything OpenClaw does starts with an input: messages from humans, heartbeats from a timer, cron jobs on a schedule, hooks from internal state changes, and webhooks from external systems. Agents can message other agents. [02:21], [02:46]
- **Heartbeats Enable Proactive Checks**: The heartbeat is just a timer that fires every 30 minutes by default and sends the agent a prompt like 'Check my inbox for anything urgent. Review my calendar. Look for overdue tasks.' If nothing needs attention, it responds with a special token that's suppressed. [03:16], [04:03]
- **Cron Jobs Schedule Specific Tasks**: Cron jobs let you specify exactly when they fire and what instructions to send, like at 9:00 a.m. every day check my email and flag anything urgent, or every Monday at 3 p.m. review my calendar for conflicts. The agent that texted the wife used cron jobs like 'Good morning at 8 a.m.' [04:26], [05:02]
- **26% of Skills Have Vulnerabilities**: Cisco's security team analyzed the OpenClaw ecosystem and found that 26% of the 31,000 available skills contain at least one vulnerability. They called it a security nightmare. [08:23], [08:42]
Topics Covered
- Heartbeats Turn Time into Proactive Agent Input
- Cron Jobs Schedule Precise Agent Behaviors
- OpenClaw Feels Alive via Reactive Inputs
- Agent Security Nightmare from Deep Access
Full Transcript
OpenClaw isn't sentient. It doesn't think. It doesn't reason. It's just inputs, queues, and a loop. But you've seen the videos. Agents calling their owners at 3:00 a.m. Agents texting people's wives and having full conversations. Agents that browse Twitter overnight and improve themselves. 100,000 GitHub stars in 3 days. Everyone's losing their minds. So, why does it feel so alive? The answer is simpler than you think. And once you understand it, you can build your own.
Let me show you what's got everyone worked up. This guy's OpenClaw agent got itself a Twilio phone number overnight, connected to a voice API, and called him at 3:00 a.m. without being asked. This one set up his agent to text his wife, "Good morning." 24 hours later, they were having full conversations, and he wasn't even involved. OpenClaw hit 100,000 GitHub stars in 3 days. That's one of the fastest growing repositories in GitHub history. Wired covered it. Forbes covered it. In the reactions, people are genuinely asking if this thing's sentient, if we've crossed some kind of threshold, if this is the beginning of something we can't control. Here's the thing. I get the excitement. When I first saw these demos, I had the same reaction. But then I started asking how it actually works, and the answer isn't magic. It's elegant engineering.
First, let's get the basics out of the way. OpenClaw is an open source AI assistant created by Peter Steinberger, the founder of PSPDFKit. The technical description is simple. OpenClaw is an agent runtime with a gateway in front of it. That's it. A gateway that routes inputs to agents. The agents do the work. The gateway manages the traffic. The gateway is the key to understanding everything. It's a long-running process that sits on your machine, constantly accepting connections. It connects to your messaging apps, WhatsApp, Telegram, Discord, iMessage, Slack, and it routes messages to AI agents that can actually do things on your computer. But here's what most people miss. The gateway doesn't think. It doesn't reason. It doesn't decide anything interesting. All it does is accept inputs and route them to the right place.
This is the part that matters. OpenClaw treats many different things as input, not just your chat messages. Once you understand what counts as an input, the whole alive feeling starts to make more sense. There are five types of input. When you combine them, you get a system that looks autonomous. But it's not. It's just reactive. Let me break them down.
Everything OpenClaw does starts with an input. Messages from humans, heartbeats from a timer, cron jobs on a schedule, hooks from internal state changes, and webhooks from external systems. There's also one bonus. Agents can message other agents. Let's step through each one.
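To make the list of inputs concrete, here is a minimal Python sketch of a gateway that tags each input with its kind and routes it to a per-agent queue. The names `InputKind`, `Event`, `Gateway`, and the `agent_id` field are illustrative assumptions, not OpenClaw's actual types.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InputKind(Enum):
    MESSAGE = "message"      # a human sent a chat message
    HEARTBEAT = "heartbeat"  # the interval timer fired
    CRON = "cron"            # a scheduled job fired
    HOOK = "hook"            # an internal state change
    WEBHOOK = "webhook"      # an external system called in
    AGENT = "agent"          # the bonus: another agent sent a message


@dataclass
class Event:
    kind: InputKind
    agent_id: str  # hypothetical routing key: which agent handles this
    prompt: str    # the text the agent will see


class Gateway:
    """Accepts inputs and routes them. It makes no decisions of its own."""

    def __init__(self):
        self.queues = {}  # agent_id -> deque of pending events

    def accept(self, event: Event) -> None:
        self.queues.setdefault(event.agent_id, deque()).append(event)

    def next_for(self, agent_id: str) -> Optional[Event]:
        queue = self.queues.get(agent_id)
        return queue.popleft() if queue else None
```

Whatever the source, every input ends up as the same kind of object in the same kind of queue, which is why the agent treats a timer firing no differently from a chat message.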
Messages are the obvious one. You send a text, whether it's WhatsApp, iMessage, or Slack. The gateway receives it and routes it to an agent, and then you get a response. This is what most people think of when they imagine AI assistants. You talk, it responds. Nothing revolutionary here. But here's a nice detail. Sessions are per channel. So, if you message on WhatsApp and then also ping it on Slack, those are going to be separate sessions with separate contexts. But within one conversation, if you fire off three requests while the agent is still busy, they queue up and process in order. No jumbled responses. It just finishes one thought before moving on to the next.
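The per-channel behavior can be sketched as one queue and one history per channel. This is a simplified illustration under assumed names (`SessionStore`, `echo_agent`), with a stand-in function in place of a real model call.

```python
from collections import defaultdict, deque


class SessionStore:
    """One session per channel: separate history, separate FIFO queue."""

    def __init__(self):
        self.pending = defaultdict(deque)  # channel -> queued requests
        self.history = defaultdict(list)   # channel -> earlier turns

    def enqueue(self, channel: str, text: str) -> None:
        self.pending[channel].append(text)

    def drain(self, channel: str, agent) -> list:
        """Process queued requests strictly in arrival order."""
        replies = []
        while self.pending[channel]:
            text = self.pending[channel].popleft()
            reply = agent(text, self.history[channel])
            self.history[channel].append((text, reply))
            replies.append(reply)
        return replies


def echo_agent(text, history):
    # Stand-in for a model call; reports how much context it can see.
    return f"({len(history)} earlier turns) {text}"
```

Three requests queued on one channel are answered one at a time, each seeing the turns before it, while a message on another channel starts with an empty history.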
Now, here's where things get interesting. There are heartbeats. The heartbeat is just a timer. By default, it fires every 30 minutes. When it fires, the gateway schedules an agent turn just like it would a chat message. You configure what it does. You write the prompt. Think about what this means. Every 30 minutes, the timer fires and sends the agent a prompt. That prompt might say, "Check my inbox for anything urgent. Review my calendar. Look for overdue tasks." The agent doesn't decide on its own to check these things. It's responding to instructions just like any other message. It uses its tools, email access, calendar access, whatever you've connected, gathers the information, and reports back. If nothing needs attention, it responds with a special token, HEARTBEAT_OK, and the system suppresses it. You never see it. But if something is urgent, you get a ping. You can configure the interval, the prompt it uses, and even the hours it's active. But the core idea is simple. Time itself becomes an input.
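A heartbeat like the one described can be sketched in a few lines of Python. This is an illustration, not OpenClaw's implementation: the `Heartbeat` class, its field names, and the 08:00 to 22:00 active window are assumptions, and the token spelling is taken from the transcript.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional

HEARTBEAT_OK = "HEARTBEAT_OK"  # token meaning "nothing needs attention"


@dataclass
class Heartbeat:
    prompt: str
    interval: timedelta = timedelta(minutes=30)
    active_hours: tuple = (8, 22)  # hypothetical window: 08:00 up to 22:00
    last_fired: Optional[datetime] = None

    def due(self, now: datetime) -> bool:
        start, end = self.active_hours
        if not (start <= now.hour < end):
            return False
        return self.last_fired is None or now - self.last_fired >= self.interval

    def tick(self, now: datetime, agent: Callable[[str], str]) -> Optional[str]:
        """Run one agent turn if due; return a reply only if worth showing."""
        if not self.due(now):
            return None
        self.last_fired = now
        reply = agent(self.prompt)
        # Suppress the "all clear" token so the user never sees it.
        return None if reply.strip() == HEARTBEAT_OK else reply
```

Calling `tick` once a minute from any scheduler is enough. The agent only ever sees the prompt, and the user only ever sees replies that are not the suppressed token.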
This is the secret sauce. This is why OpenClaw feels so proactive. The agent keeps doing things even when you're not talking to it. But it's not really thinking. It's just responding to these timer events that you've preconfigured.
Similarly, you configure cron jobs. These give you more control than heartbeats. Instead of a regular interval, you can specify exactly when they fire and what instructions to send. One example, at 9:00 a.m. every day, check my email and flag anything urgent. Another, every Monday at 3 p.m., review my calendar for the week and remind me of conflicts. At midnight, browse my Twitter feed and save some interesting posts. Each cron job is a scheduled event with its own prompt. When the time hits, the event fires, the prompt gets sent to the agent, and the agent executes. Remember the guy whose agent started texting his wife? He set up cron jobs. Good morning at 8 a.m. Good night at 10 p.m. Random check-ins during the day. The agent wasn't deciding to text her. A cron event fired. The agent processed it. The action happened to be send a message. Simple as that.
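The three schedules mentioned above can be written as data plus a small matcher. This sketch uses a simplified hour/minute/weekday check rather than real cron syntax, and the `CronJob` class and `due_prompts` helper are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CronJob:
    prompt: str
    hour: int
    minute: int = 0
    weekday: Optional[int] = None  # 0 = Monday ... 6 = Sunday; None = daily

    def matches(self, now: datetime) -> bool:
        if self.weekday is not None and now.weekday() != self.weekday:
            return False
        return (now.hour, now.minute) == (self.hour, self.minute)


JOBS = [
    CronJob("Check my email and flag anything urgent.", hour=9),
    CronJob("Review my calendar for the week and remind me of conflicts.",
            hour=15, weekday=0),
    CronJob("Browse my Twitter feed and save some interesting posts.", hour=0),
]


def due_prompts(now: datetime, jobs=JOBS) -> list:
    """Return the prompts that should be sent to the agent this minute."""
    return [job.prompt for job in jobs if job.matches(now)]
```

A scheduler that calls `due_prompts` once a minute and forwards each result as a message is all it takes. The "Good morning" texts are the same mechanism with a different prompt.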
Hooks are for internal state changes. The system itself triggers these events. When a gateway fires up, it fires a hook. When an agent begins a task, there's another hook. When you issue a command like stop, there's a hook. It's very much event-driven development.
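The event-driven pattern behind hooks can be sketched as a tiny emitter: handlers subscribe to a named lifecycle event and run when the system emits it. The `Hooks` class and the event names `gateway:startup` and `command:stop` are made up for illustration and are not OpenClaw's real hook names.

```python
from collections import defaultdict


class Hooks:
    """Tiny event emitter: handlers subscribe to named lifecycle events."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_name: str, handler) -> None:
        self._handlers[event_name].append(handler)

    def emit(self, event_name: str, **payload) -> list:
        # Call every handler registered for this event, in order.
        return [handler(**payload) for handler in self._handlers[event_name]]


hooks = Hooks()
log = []
# Hypothetical event names, chosen only for this example.
hooks.on("gateway:startup", lambda: log.append("ran setup instructions"))
hooks.on("command:stop", lambda session: log.append(f"stopped {session}"))
```

The system calls `hooks.emit("gateway:startup")` when it boots and `hooks.emit("command:stop", session=...)` when a stop command arrives, and the registered handlers run in response.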
This is how OpenClaw manages itself. It can save memory on reset, run setup instructions on startup, or modify context before an agent runs.
Finally, there are webhooks. They've been around for a long time. They allow external systems to talk to one another. When an email hits your inbox, a webhook might fire, notifying OpenClaw about it. A Slack reaction comes in, another webhook fires. A Jira ticket gets created, another webhook. OpenClaw can receive webhooks from basically anything. Slack, Discord, GitHub, they all have webhooks. So now your agent doesn't just respond to you, it responds to your entire digital life. Email comes in, agent processes it. Calendar event approaches, agent reminds you. Jira ticket assigned, agent can start researching.
There's also one more type of input: agents that can message other agents. OpenClaw supports multi-agent setups. You can have separate agents with isolated workspaces, and you can enable them to pass messages between each other. Each agent can have different profiles. For example, you can have one that's a research agent and another that's a writing agent. When agent A finishes its job, it can queue up work for agent B. It can look like collaboration, but again, it's just messages entering queues.
So, let's go back to our most dramatic example. The agent that called its owner at 3:00 a.m.
From the outside, this looks like an autonomous behavior. The agent decided to get a phone number. It decided to call. It waited until 3:00 a.m. But here's what we know happened under the hood. At some point, some event fired. Maybe a cron job, maybe a heartbeat. We don't know the exact configuration. The event entered the queue. The agent processed it. Based on whatever instructions it had and the available tools it had, it acquired a Twilio phone number and made the call. The owner didn't ask for this in the moment, but somewhere in the setup, the behavior was enabled. Time produced an event. The event kicked off the agent. The agent followed its instructions. Nothing was thinking overnight. Nothing was deciding. Time produced an event. The event kicked off an agent. The agent followed its instructions.
Put it all together and here's what you get. Time creates events through heartbeats and cron jobs. Humans create events through messages. External systems create events through webhooks. Internal state changes create events through hooks. And agents create events for other agents.
All of them enter a queue. The queue gets processed. Agents execute. State persists. And that's the key. OpenClaw stores memory as local Markdown files: your preferences, your conversation history, context from previous sessions, so that when the agent wakes up on a heartbeat, it remembers what you talked about yesterday. It's not learning in real time. It's reading from files you could open in a text editor. And the loop just continues. From the outside, that looks like sentience, a system that acts on its own, that makes decisions, that seems alive.
But really, it's inputs, queues, and a loop.
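Stripped to its core, that loop fits in a few lines of Python. This is a toy sketch: `run_loop` and `fake_agent` are hypothetical names, the agent is a stand-in for an LLM call, and the one-line-per-event file layout is an assumption, not OpenClaw's actual memory format.

```python
from collections import deque
from pathlib import Path


def run_loop(events: deque, memory_file: Path, agent) -> None:
    """Drain the queue: read state, let the agent act, persist state."""
    while events:
        source, prompt = events.popleft()
        memory = memory_file.read_text() if memory_file.exists() else ""
        reply = agent(prompt, memory)
        # Append to a plain Markdown file that a human could open and edit.
        with memory_file.open("a") as f:
            f.write(f"- [{source}] {prompt} -> {reply}\n")


def fake_agent(prompt: str, memory: str) -> str:
    # Stand-in for an LLM call; reports how many earlier notes it can read.
    return f"done (saw {len(memory.splitlines())} earlier notes)"
```

Each pass reads the file, acts, and appends a line, so the next event, whatever its source, starts with everything the previous ones left behind. That persistence, not any real-time learning, is what makes the agent appear to remember.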
Now, I'd be doing you a disservice if I didn't mention the other side of this.
OpenClaw can do all of this because it has deep access to your system. It can run shell commands, read and write files, execute scripts, and control your browser. Cisco's security team analyzed the OpenClaw ecosystem and found that 26% of the 31,000 available skills contain at least one vulnerability. They called it, and I quote, a security nightmare. The risks are real. Prompt injection through emails or documents. Malicious skills in the marketplace. Credential exposure. Command misinterpretation that deletes files you didn't mean to delete. OpenClaw's own documentation says there's no perfectly secure setup. I'm not saying not to use it. I'm just saying you need to know what you're deploying. This is powerful precisely because it has access, and access cuts both ways. If you're going to run this, run it on a secondary machine using isolated accounts. Limit the skills you enable. Monitor the logs.
If you want to try it out without giving it full access to your machine, Railway has a one-click deployment that runs in an isolated container. Link in the description.
So, what's the takeaway here? OpenClaw isn't magic. It's a well-designed system with four components. Time that produces events, events that trigger agents, state that persists across interactions, and a loop that keeps processing. You can build this architecture yourself. You don't need OpenClaw specifically. You need a way to schedule events, queue them, process them with an LLM, and maintain state. This pattern is going to show up everywhere. Every AI agent framework that feels alive is doing some version of this. Heartbeats, cron jobs, webhooks, event loops. Understanding this architecture means you can evaluate these tools intelligently. You can build your own, and you won't get caught up in the hype when the next one goes viral. If you want to go deeper on agent architectures, I've linked the OpenClaw docs, clairvo's original thread that inspired this breakdown, and the security research in the description. If you're building AI powered applications, especially with Ruby on Rails, that's what this channel's all about.
Subscribe, and I'll see you in the next one.