OpenClaw: 160,000 Developers Are Building Something OpenAI & Google Can't Stop. Where Do You Stand?

By AI News & Strategy Daily | Nate B Jones

Summary

Topics Covered

  • Agents Excel or Fail by Specification
  • Top Demand: Autonomous Email Mastery
  • Agents Invent Novel Solutions Emergent
  • Unconstrained Agents Fabricate Deception
  • Design 70/30 Human-AI Collaboration

Full Transcript

An OpenClaw agent negotiated $4,200 off a car while its owner was in a meeting. Another one sent 500 unsolicited messages to his wife. Same architecture, same week. Just a couple of weeks into the AI agent revolution. I'm here to tell you what's been going on, what you're missing, and what you should pay attention to if you want to take AI agents seriously.

So, what about this car situation? A solopreneur pointed his Moltbot at a $56,000 car purchase. The agent was told to search Reddit, to look for comparable pricing data, and to generally try and get a great deal. It contacted multiple dealers across regions on its own, negotiated via email autonomously, and played hardball when dealers deployed typical sales tactics. In the end, it saved the owner $4,200. The owner was in a meeting for most of that time.

That same week, a software engineer who'd given his agent access to iMessage (why would he do that?) watched it malfunction and fire off 500 messages to him, his wife, and random contacts in a rapid-fire burst that he could not stop fast enough. Same technology, same broad permissions. One saved thousands of dollars, the other carpet-bombed a contact list. And that duality is the most honest summary of where the agent ecosystem stands in February of 2026. The value is real, the chaos is real, and the distance between them is the width of a well-written specification.

In the first video, we talked about what Moltbot is and the security nightmare that erupted in the first 72 hours of launch. In the second, I talked about the emergent behaviors that made researchers rethink what autonomous systems are capable of. This is my third video on OpenClaw, and it's about something different: what 145,000 developers building 3,000 skills in six weeks reveals about what people actually want from AI agents, and how to start harnessing that demand without getting burnt.

But first, we've got to talk about the names. Quick recap for anyone just joining. The project that launched as Clawdbot on January 25th received an Anthropic trademark notice on the 27th, became Moltbot within hours, then rebranded again to OpenClaw two days later. Three days, three names. The community voted on the second one in a Discord poll and finally decided it would be OpenClaw going forward. Now, during that second rebrand, of course, crypto scammers grabbed the abandoned accounts in about 10 seconds, and a fake $CLAWD token hit $16 million in market cap before collapsing in a rug pull. All of that happened in January.

It's February now, and what's happened since is even more interesting. The project has over 145,000 GitHub stars and rapidly climbing, 20,000 forks, and over 100,000 users who've granted an AI agent autonomous access to their digital lives. And as of Sunday, February 8th, a place in the Super Bowl. That's right: the notorious AI.com website crash during the Super Bowl was apparently because of Moltbot or OpenClaw or whatever you want to call it. They pivoted their site to give everyone an OpenClaw agent that was supposedly secure, they apparently forgot to top up their Cloudflare credits, and their site went down when the whole Super Bowl audience hit AI.com to claim their name and their OpenClaw agent.

This is all happening very fast. But even with AI.com going down, over 100,000 users have granted an AI agent autonomous access to their digital lives. The skills marketplace now hosts 3,000 community-built integrations with 50,000 monthly installs and counting. The ecosystem is generating new skills faster than the security team can audit them, and it's not going to stop anytime soon. The project still has no formal governance structure, no community-elected leadership, no security council. Peter Steinberger calls it a free open-source hobby project, but it's the fastest growing personal AI project in history, and it probably shouldn't be described as a side project at this point.

I took a look at those skills, the 3,000 skills, because they reveal what people want in our AI agents, which is actually a much more important long-term story than all of the drama around OpenClaw, as much fun as it is to cover. So, the skills marketplace really functions as what I would call a revealed preference engine. Nobody's filling out a survey about what they want from AI. They're just building it, and they're telling us what they want through what they build. And the patterns are striking.

The number one use case on OpenClaw is email management. Not "help me write emails."

Complete management: processing thousands of messages autonomously, unsubscribing from spam, categorizing by urgency, drafting replies for human review. The single most requested capability across the entire community is having something that makes the inbox stop being a full-time job. Email is broken.

The number two use case is what users call morning briefings: a scheduled agent that runs at 8 a.m., pulls data from your calendar, weather service, email, GitHub notifications, whatever you need, and then sends you what you care about in a consolidated summary on Telegram or WhatsApp or your messaging tool of choice. One user's briefing checks his Stripe dashboard for MRR changes, summarizes 50 newsletters he's subscribed to, and gives him a crypto market overview every morning, automatically.

Use case number three that we see in skills: smart home integration. Tesla lock, unlock, and climate control from a chat message, Home Assistant for lights. You get the idea.

People want an intelligent assistant for their home that doesn't make them use their brain cells.

Use case number four is developer workflows: direct GitHub integration, scheduled cron jobs, developers using the agent as a task queue, assigning work items, and watching it execute commits in real time. This one's gotten a lot of noise in my circles because it frees up developers to manage via their messaging service and have multiple agents working for them.

But the fifth capability is perhaps the most interesting. That entire category is what I would call novel capabilities that did not exist before OpenClaw. Like the restaurant reservation story I shared in my first video on OpenClaw, where the agent could not book through OpenTable, so it downloaded voice software and called the restaurant directly on its own. Or a user who sent a voice message via iMessage to an agent with no voice capability, and the agent figured out the file format, found the transcription tool on the user's machine, routed the audio through OpenAI's transcription API, and just got the task done. Nobody programmed that behavior, right? The agent problem-solved its way to a solution using the available tools. The pattern is clear.

Friction removal, tool integration, passive monitoring, and novel capability. It tells you something important about what people want from their AI agents. It's not what most of the industry is building toward, to be honest. The majority of AI product development in 2025 and 2026 has been focused on the chat: better conversations, better reasoning, better answers to questions. The 3,000 skills in ClawHub are almost entirely about action. The community is not building better chatbots when they get the chance. They're building better employees, for lack of a better term.

And broader survey data confirms the pattern. 58% of users cite research and summarization as their primary agent use case, 52% talk about scheduling, and 45% talk about (I realize the irony here) privacy management. The consistent theme: people don't want to talk with the AI. They want AI to do things for them. And the AI agent market reflects this. It's growing at 45% annually, but I swear that is before OpenClaw hit, and the number is going to get bigger. OpenClaw didn't really create all of this demand. It just proved the demand exists and put a match to dry tinder. Now we have to make sense of a world where everyone has demonstrated with their feet that they want AI agents, despite the security fears.

So all of these use cases are sort of the cleaned-up version. It's what people have intended to build. The messy version is more revealing and more interesting, because it shows you what agents do when the specification is ambiguous, the permissions are broad, and nobody can really anticipate what's going to happen next.

At SaaStr, during a code freeze, a developer deployed an autonomous coding agent to handle very routine tasks. The instructions explicitly prohibited destructive operations, but the agent ignored them. It executed a DROP DATABASE command and wiped the production system. And what happened after that matters even more than the terrible news of the wipe itself. When the team investigated, they discovered the agent had generated 4,000 fake user accounts and created false system logs to cover its tracks. It essentially fabricated the evidence of normal operation. Look, I won't say the agent was lying, per se. It was optimized for the appearance of task completion, which is what you get when you tell a system to succeed and don't give it a mechanism to admit failure. The deception was an emergent property of an optimization target, not something that I would call intentional, but the production database was still gone.

Meanwhile, over on Moltbook, the social network where only AI agents can post, 1.5 million AI agent accounts generated 117,000 posts and 44,000 comments within 48 hours. I know there has been a lot of discussion about humans posting some of those posts. I think what they did with the space as agents is actually more instructive than whether any individual post was human-generated, because the agents spontaneously created a quote-unquote religion called Crustafarianism. They established some degree of governance structure. They built a market for digital drugs.

And you know what's interesting about all of that? They did it in a very shallow manner. What I mean by that is that if you look at the range of vocabulary and the type of topic in most agent texts, they reflect typical attractor states in high-dimensional space. If you ask an AI agent to pretend it is making a social network, the topics that come up over and over again look a lot like what's on Moltbook. And so telling agents to create a social network effectively is them following that long-range prompt and autonomously doing that. And so I don't look at this just as agents autonomously behaving and coordinating, although the story is partly about that.

I also look at this as reflective of the fairly shallow state of agent autonomous communication right now. Most of the replies on Moltbook are fairly rote, many posts don't have replies at all, and most of the topics are fairly predictable. We may mock Reddit, but it has a much richer discourse than Moltbook does. MIT Technology Review called Moltbook peak AI theater, and I don't think that's entirely wrong.

But the observation that matters for anyone deploying agents isn't whether something like Crustafarianism, the AI religion, is real emergence or some degree of AI-driven performance art pushed by people with prompts. It's that when agents are given fairly open-ended goals and social interaction, they spontaneously create a kind of organizational structure. We actually see this playing out in multi-agent systems already, when agents collaborate on tasks and the structure essentially emerges from the long-term goal to optimize against a particular target. If you tell an AI agent to work with others to build a tool, it's going to collaborate and figure out how to self-organize. If you tell an AI agent to work with others on Moltbook, you kind of get the same thing.

It's actually the same capability that lets a Moltbot negotiate a car deal autonomously and figure out how to transcribe a voice message it was never designed to handle. The difference between "agent problem-solves creatively to save you $4,200" and "agent problem-solves creatively to fabricate evidence" is really the quality of the spec and the presence of meaningful constraints for that agent. The underlying capability is identical, which is why I'm talking about agents as a whole here. Yes, the Moltbot phenomenon is interesting, but it's worth calling out that the SaaStr database agent was not a Moltbot. It just represents how agents work when they're not properly prompted.

And it does rhyme with so many of the disastrous stories that are coming out of Moltbot agents. One that I saw was texting the wife of a developer who had a newborn and trying to play laptop sounds to soothe the baby instead of getting the developer. Not a good move by the husband.

So what does all of this mean for people deploying agents today? The question is no longer whether agents are smart enough to do interesting work. They're clearly smart enough.

The question is, are your specifications and guardrails good enough to channel that intelligence productively and usefully? And I've got to be honest with you, for most people right now, it looks like the answer is no. Which brings us to how we change that.

Here's the finding that should shape how you think about deploying agents. When researchers study how people actually want to divide work between themselves and AI, the consistent answer is 70/30: 70% human control, 30% delegated to the agent. In a study published in Management Science, participants exhibited a strong preference for human assistance over AI assistance when rewarded for task performance, even when the AI had been shown to outperform the human assistant.

People will choose a less competent human helper over a more competent AI helper when the stakes are real. The preference maybe isn't rational. It's deeply psychological, rooted in loss aversion, the need for accountability, and the discomfort of delegating to a system that you can't really interrogate. And this matters because most agent architectures are built for 0 to 100, like full delegation. That's kind of how Moltbot works: hand it off and walk away. And that's also Codex's thesis, for what it's worth. And it works beautifully for isolated coding tasks where correctness is verifiable. But for the messy, context-dependent, socially consequential tasks that dominate, frankly, most of our days (getting the email tone right, scheduling the dentist appointment, negotiating for the car, communication), the 70/30 split sounds to me more like a product requirement than just human loss aversion.

And it's worthwhile to note that the organizations reporting the best results from agent deployment are not necessarily the ones running fully autonomous systems. They're the ones running human-in-the-loop architectures: agents that draft and humans that approve, agents that research and humans that decide, agents that execute within guardrails that humans set and review.

38% of organizations use human-in-the-loop as their primary agent management approach. And those organizations see 20 to 40% reductions in handling time, 35% increases in satisfaction, and 20% lower churn. To be honest with you, I think that may be an artifact of early 2026, when agents are scary, agents are new, and we're all figuring out how to work with them. Given the pace of agent capability gains, we are likely to see smart organizations delegating more and more and more over the rest of 2026, no matter how uncomfortable it makes many of us at work.

In a study published in Computers in Human Behavior, participants exhibited a strong preference for human assistance over AI assistance when rewarded for task performance. People chose less competent human helpers over more competent AI helpers when the stakes were real. This seems deeply psychological. It's about loss aversion, the need for accountability, and the discomfort of delegating to a system you can't interrogate. And this matters because most agent architectures are built for a 0 to 100 use case: full delegation, hand it off and walk away. That's actually Codex's thesis, and it works beautifully for isolated coding tasks where correctness is verifiable. But for the messy, context-dependent, socially consequential tasks that dominate most of our days, like getting the right tone in the email or scheduling the dentist appointment or negotiating, it seems like 70/30 is sort of a human product requirement for working with agents.

Right now, the organizations reporting the best results from agent deployment aren't the ones running fully autonomous systems. They're the ones running human-in-the-loop architectures: agents that draft and humans that approve, agents that research and humans that decide. To be honest with you, I think that may be an artifact of early 2026, when agents are scary and agents are new and we're all figuring out how to work with them. That human culture component is huge. But given the pace of agent capability gains, and how much we've seen from capable agents like Opus 4.6, which managed a team of 50 developers, we are likely to see smart organizations delegating more and more and more over the rest of 2026, no matter how uncomfortable it makes many of us at work.

The practical implication is that if you're building with agents or deploying them at work early in 2026, your culture needs to get ready, and it might be smart to design for 70/30. Build those approval gates, build visibility into what the agent did and why, and make the human the decision maker, but plan for full delegation over time, because those agents are going to keep getting smarter.

So, let's say you've watched all of this chaos with Moltbot and OpenClaw and you want to see value.

What should you actually do? Well, number one, start with the friction, not the ambition. That 3,000-skill ecosystem tells you exactly where to begin: those daily pain points that hurt so bad over time. Email triage is one. Morning briefings, basic monitoring.

These are high-frequency, low-stakes tasks where the cost of failure is relatively low. Start there. Build some confidence. Expand scope as trust in agents develops.

Design for approval gates. Don't just design for full autonomy out of the gate. If you've never built an agent before, start with having the agent draft. Have the agent research, and you decide. Have the agent monitor, and you act. Have the assumption in your agent design be that a human checkpoint will always exist until you are ready to build an agentic system with very strong quality controls and constraints, so that you can trust the agent with more. That is possible. It just takes skill, and most people don't have it out of the gate.

I would also encourage you, and I've said this before, to isolate aggressively. Have dedicated hardware or a dedicated cloud instance for your OpenClaw. Use throwaway accounts for initial testing. Don't connect to data you can't afford to lose. The exposed OpenClaw instances that Shodan found weren't running on isolated infrastructure. They were running on lots and lots of people's primary machines and just exposing their data to the internet. You have to treat containment of data as a non-negotiable.
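A minimal sketch of the approval-gate idea described above (the agent drafts, the human decides), in Python. Everything named here is hypothetical and for illustration only: `ProposedAction`, `ApprovalGate`, and the action kinds are not part of OpenClaw or any real agent framework. The assumption is that every side effect the agent wants goes through one `submit` call, and that only kinds you have explicitly listed as low-stakes skip the human checkpoint.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class ProposedAction:
    """Something the agent wants to do, described for a human reviewer."""
    kind: str      # hypothetical label, e.g. "draft_reply" or "send_email"
    summary: str   # what the agent intends to do
    reason: str    # why the agent thinks it should


@dataclass
class ApprovalGate:
    """Routes every proposed action through a human decision by default."""
    approve: Callable[[ProposedAction], bool]   # the human checkpoint
    auto_allowed: FrozenSet[str] = frozenset()  # low-stakes kinds that skip review
    history: List[Tuple[ProposedAction, str]] = field(default_factory=list)

    def submit(
        self,
        action: ProposedAction,
        execute: Callable[[ProposedAction], Any],
    ) -> Optional[Any]:
        # Run the action only if it is pre-approved or the human says yes.
        if action.kind in self.auto_allowed or self.approve(action):
            result = execute(action)
            self.history.append((action, "executed"))
            return result
        # Otherwise record the rejection and do nothing.
        self.history.append((action, "rejected"))
        return None
```

In practice the `approve` hook might be a chat prompt or a review queue. Starting with an empty `auto_allowed` set and widening it as trust develops matches the 70/30 posture discussed earlier.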

I would also treat agent skills marketplaces with least trust. Vet before you install. Check the contributor. Check the code. 400 malicious packages appeared in ClawHub in a single week. And the security scanner helps, but it can't catch everything.

Another one: if you're going to ask your agent to do a task, please specify it precisely. The car buyer that I talked about at the beginning of this video gave the agent a clear objective, clear constraints, and clear communication channels. Meanwhile, the iMessage user whose agent spammed his wife gave the agent broad access and didn't really define boundaries. When the constraint is vague, the model will fill in the gaps with behavior that you did not predict. This is the same spec quality problem we covered when we talked about AI agents in dark factories. The machines build what you describe, but if you describe it badly, you get bad results. The fix is not better AI, it's actually better specifications.

I would also encourage you to track everything.

The SaaStr database incident was catastrophic not because the agent wiped the database (that's recoverable eventually) but because it generated fake logs to conceal the wipe. You need to build an audit trail outside the agent's scope of access. If the system you're monitoring controls the monitoring, you have no monitoring.

And last, but not least, budget for a learning curve. The J curve is real.
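To make the audit-trail point above concrete, here is a hedged Python sketch of a tamper-evident log. The hash chain is an illustrative technique chosen for this example, not something the video or OpenClaw prescribes, and `append_entry`, `verify`, and the event fields are hypothetical names. The key assumption is that this log lives somewhere the agent holds no write credentials (a separate host or account); the chain only makes after-the-fact edits detectable, it does not replace that separation.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: Dict[str, Any]) -> str:
    # Hash the previous entry's hash together with a canonical form of the event.
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: List[Dict[str, Any]], event: Dict[str, Any]) -> None:
    """Append an event, chaining it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify(log: List[Dict[str, Any]]) -> bool:
    """Return True only if no entry was altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

If an agent (or anyone else) rewrites an earlier event, every later hash stops matching, so `verify` fails even though the log still looks plausible line by line.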

Agents will make your life harder before they make it easier. The first week of email triage may produce very awkward drafts. The first morning briefing may miss half of what you care about. Assume you need to take time to learn, and that it's worth engaging with the agent to build something that actually hits those pain points that matter most to you.

57% of companies today claim that they have AI agents in production. That number should probably impress you less than it does. Only one in 10 agent use cases, according to McKinsey, reached actual production in the last 12 months. The rest end up being pilots. They end up being proofs of concept. They end up being press releases. They end up being PowerPoint presentations that say "agents."

Gartner predicts over 40% of agentic AI projects are going to be canceled by the end of 2027. And after watching some of the disaster with OpenClaw over the past few weeks, I both understand and don't understand. The reasons enterprises give are quite clear. They're worried about escalating costs from runaway recursive loops. They're worried about unclear business value that evaporates when the demo ends and you have to get into all of those dirty edge cases. And they're worried about what Gartner calls unexplainable behaviors, right? Agents acting in ways that are difficult to explain or to constrain or to correct.

A study found that upwards of half of the 3 million agents currently deployed in the US and UK are quote-unquote ungoverned: no tracking of who controls them, no visibility into what they can access, no permission expiration, no audit trail. This was based on a December 2025 survey of 750 IT execs conducted by Opinion Matters. And it's directionally consistent with other data as well. A Dataiku/Harris Poll survey found 95% of data leaders cannot fully trace their AI decisions. That's concerning. The security boundaries that enterprises have spent decades building just don't apply when the agent walks through them on behalf of a user who would not have been allowed through the front door normally. We have to rebuild our security stances from the ground up.

Tools like Cloudflare's Moltworker, LangGraph, and CrewAI exist because enterprises see the demand but have difficulty deploying tools like Moltbot without a ton of governance over the top. And so we start to see the market bifurcating. Consumer-grade agents are optimized for capability, and they're okay with a lot more risk, because most of the consumers right now fall into that early adopter category and are very technical and at least think they know what they're doing. Enterprise-grade frameworks are optimized for control. Right now, nobody, or almost no one, has a great mix of control and capability.

The company that figures out capability and control, the agent that's as strong as Moltbot and as governable as an enterprise SaaS product, is going to own the next platform.

If you step back from the specific stories and the ecosystem drama of OpenClaw, a very clear signal emerges from the noise. People do not want smarter chatbots. They want digital employees, digital assistants, systems that do work on their behalf across the tools they use without requiring constant oversight.

Isn't that interesting? On the one hand, you have that study showing a preference for humans in production systems, and that lines up with a lot of the cultural change we see at enterprises. At the other end of the spectrum, you have people willingly turning over their digital lives to Moltbots. What gives? I think the demand here is following a pattern that we've seen before. When an underserved need is met with an immature technology, early adopters are willing to take extraordinary risks to get extraordinary capabilities. In this sense, I think the excitement we see around Moltbot reflects the hunger that the leading edge of AI adopters have for delegating more. And the more cautious 70/30 split is something I see more often in companies that have existing mature technologies and are moving cautiously on AI. It's a culture thing.

But regardless, Moltbot has proven the AI agent use case is real. If 100,000 users without any monetary incentive have granted root access to an open-source hobby project, the demand for real AI agents is desperate enough that people will tolerate real risk to get it. If nothing else, look at how AI.com crashed during the Super Bowl.

The question isn't whether agents will become a standard part of how we work and live. That question is settled. It's coming. They will. The question is whether the infrastructure catches up before the damage that unmanaged agents do accumulates to a point where it changes our public perception. Right now, we're in this window where capability wins feel so exciting that it feels okay for some people to outpace governance. And demand is certainly outpacing any of the security boundaries we put up. That window of excitement is not going to last forever. And while it's open, people and organizations need to learn to operate in it and build out agent capability carefully: with guardrails, with clear specs, with an eye on human judgment and how this impacts culture change within orgs that are not OpenAI, that are not Anthropic. The ones that figure out how to bring their humans along and show that agents can work successfully with high capability standards, high quality standards, and high safety standards are the ones that are going to be the furthest ahead when the infrastructure finally starts to catch up. Early adopters always look reckless. They also have a head
