State of the Claw — Peter Steinberger
By AI Engineer
Summary
Topics Covered
- Security Hype vs. Real Risk: How OpenClaw's Chaos Reveals AI's True Vulnerabilities
- The Legal Trifecta: Why Every Agentic System Carries Inherent Risk
- Running 10 Agents at Once: The Wild Reality of AI-First Software Development
- Taste Is the Human Moat: Why AI Can't Replace Good Judgment
- Saying No Is the Superpower: System Design Thrives When You Resist Scope Creep
Full Transcript
Our next presenter is the creator of Open Claw, the world's fastest growing open-source AI. He recently joined
open-source AI. He recently joined OpenAI to work on bringing agents to everyone. Please join me in welcoming to
everyone. Please join me in welcoming to the stage Peter Steinberger.
Good morning everyone.
So, Swiss asked me to do a state of the claw. Who here is running open claw?
claw. Who here is running open claw?
Give me some hands.
Ah, it's like 30 40%. Very good. Um,
yeah, it's been quite a few months. Um, the
project is now five months old.
I think it's fair to say by now that we are the fastest growing project in GitHub's history. Um, if you've seen the
GitHub's history. Um, if you've seen the the graph, usually it's some some projects look like a hockey stick, but ours was just like a straight line and a
friend called it stripper pole gross and that comes with its own challenges.
So, we have I think now we are the the largest number on GitHub stars. There's
a few that are bigger but they're basically educational target. No other
software project is that big. It's
around 30,000 commits. It we're closing in 2,000 contributors soon to be 30,000 PRs. Um,
see, and we're not slowing down. So, you
see that it's a ramp, but you know, it's we only have April 9. So, um,
velocity keeps keeps being good.
And at the same time, it hasn't been easy. You know, I I had two roads when I when I decided what I want to do and I I did the whole company thing. I was like, I don't want to do
thing. I was like, I don't want to do this again. And then I joined OpenI, but
this again. And then I joined OpenI, but then we also created the Open Cloud Foundation. And now I kind of have two
Foundation. And now I kind of have two jobs.
And running the foundation is like a running a company on hard mode because you have like all the all the things that you need to take care of but also you have a lot of volunteers that you
can't really direct.
So one of my goals has been working on the on the bus factor like who does comets.
Um and you see that it's slowly improving.
Vincent's actually talking after me but we're still not we're still not there.
Um, in the last months I I talked to a lot of companies.
So we now have people from Nvidia on board. We have someone from Microsoft on
board. We have someone from Microsoft on board to like help with MS Teams with like a Windows app. Uh, we have someone from Red Hat who's really helping us um
with security and dockerization. We work
with a lot of Chinese companies. We have
people from from Tencent and Biteance.
um they're actually much larger users than any other continent and yeah people from pretty much around the world but like the main thing I I
want to like talk a little bit about is about open claw is so insecure you know you've you've seen the you've seen the memes like open claw
invites the bad guys and you probably also seen companies like Nvidia
doing Nemo claw and like everyone has little lobsters.
So you also notice that like in the last two three months there's been a lot of releases where things broke.
I've basically been been dodoed by security advisories. So that's what I
security advisories. So that's what I did um and what I focused on. So far we got 1,142
advisories. That's around 16.6 a day. 99
advisories. That's around 16.6 a day. 99
are critical. Um we published around 469 and we closed 60% of them. So these
numbers sound like absolutely terrifying.
If you compare it for example to like other large projects like the Linux kernel gets like eight or nine a day. we
get like twice as much and curl so far has 600 reports we have like twice as much as curl.
So every time I I get a security incident, the rule is the higher the higher they screaming how critical they are, the more likely it's
slop. Like we we I mean you've probably
slop. Like we we I mean you've probably also seen the news like we we we are very fast moving into a world where we have to change how we build software
because all these AI tools are getting so good at identifying even the most weird multi-chained exploits and like we're gonna going to
break all the software that exists. I
give you an example like uh Nvidia they they launched Neimoclaw and Neimoclaw is a a plug-in and a security layer for
open claw. You can put it in a sandbox.
open claw. You can put it in a sandbox.
I the keynote was on Monday. They
invited me on Sunday to like work with them. I hooked it up to Codex security.
them. I hooked it up to Codex security.
It found like five different ways how to break out of the of their secure sandbox within half an hour.
That's because like if you use that product, you get access to the unnerved model that is quite a bit smarter in terms of cyber than what the public has
access. Exactly. Because it's dangerous.
access. Exactly. Because it's dangerous.
But yeah um also this whole industry those people for them it's like credits right the more the more issues they find the more
they seen so like openclaw was like the insecure product that everybody tried to break so literally like hundreds of people firing up their clankers trying
to break open claw um the typical attack surface is like remote code execution,
bypass approval, code injection, pass traversal. Uh again sounds all very
traversal. Uh again sounds all very dangerous and I give you I give you one one concrete example. Um
concrete example. Um Gshjp.
This is about a this is a CVSS of 10. So
it's like the scariest thing that you can possibly do.
It is an issue where if you uh sync for example the iPhone app that we haven't even shipped yet but is in progress and you give it only read
permission then you could like break the system to also get write permission.
So this this one was so critical that the I know this one's actually different one in all in all practical ways
it is not even an incident because the the the typical use cases you install it on your machine either in a cloud or if you have to on a
Mac mini I stopped fighting this I'm just letting people have fun now but in 99% 99% cases you'll either have
access to your gateway or you have not access to the gateway. In in in my defense, this was my mistake that I tried to create a a more permissive model. For example, if you have devices
model. For example, if you have devices that would target speech and then would only like read certain things. So
there's like some use case where you could like have a a reduced permission system would make sense.
Um but nobody's even using that. But
this doesn't matter because the rules of the of those how you create the CVSS numbers don't contribute to that at all.
And I try to play by the rules. So it is a 10 out of 10. And the world is going crazy over incidents that in all practical ways will not affect people.
There's some other stuff that does affect people. Uh we have nation states
affect people. Uh we have nation states trying to like hack people. There was
like ghost claw which is like from likely from North Korea which is basically confusing people with a different NBN package and if you if you go to a wrong
website and you try to download it you get like a a root kit. Um that's outside of our control. This happens for other people as well. Um,
also there's the Axios thing which funny enough we are not using Axios but we are using MS teams or Slack as a
dependency and they're using XIS and they didn't pin us and of course uh because that's how supply chain attacks work we were also affected.
Yeah. How do you survive 1,142? I'm sure
it's 1,150.
Uh for a while I I I tried to handle a loop by myself and which is absolutely impossible.
So So the fastest way to get help was like getting getting help from companies um and Nvidia has been really amazing to
like give us some people that basically work full-time going through the slop and hardening the code base.
Oh, there's also one that is okay.
That um this is one of the anglers. The other
angle is like there's a lot of companies that do fearongering and it's not just companies, it's also universities. I
don't know if you've seen it. There was
like this um paper who made the rounds agents of chaos and they say oh it's it's about agents in general but then there's four pages that explain the open claw
architecture in utmost detail but you know which page they didn't even mention a security page where we explain how you should install it because then it
wouldn't be fun then it wouldn't be it would be hard to make a good story. So
what they instead did is they ignored all of the recommendations we do on security. Recommendation is it's your
security. Recommendation is it's your personal agent. Don't put it in a group
personal agent. Don't put it in a group chat. If you put it in a group chat,
chat. If you put it in a group chat, turn on sandboxing because if anyone can talk to your agent, they can excfiltrate anything that the agent can do, right?
So if it's a team agent, it should only know what the team can know and not any secret data. And you probably want to
secret data. And you probably want to like have it restricted. If it's your personal agent, you should be the only one being able to talk to you. But if
you don't play by these rules, you can get some really fun interactions like, "Hey, I can talk to your agent and it can break your system." And then because I I was I was grilling them a little bit because I had some questions how to do
things. They told me, "Oh yeah, no, we
things. They told me, "Oh yeah, no, we run it in pseudo mode because we wanted the agent to be like maximum powerful."
So they actually fought the setup. It's
actually not easy to run it in pudo mode. You have to change code. um
mode. You have to change code. um
but they didn't mention it in the report because again that wouldn't give them cloud.
So yeah um my current frustration is like there's like a whole industry that try to put the project in negative light. It's a nightmare.
light. It's a nightmare.
It's insecure by default. It's
unacceptable.
Um and meanwhile a lot of people love it and people who actually read the security docs understand it can use it just fine. One example that I
found particularly great is u we had one remote one rce that panicked Belgium.
So the Belgium cyber security did a release uh about a remote execution environment and the whole bug was
a feature where a malicious website could create a link that would trigger the gateway and then forward
your gateway token. Now if you use the setup that is the default and that is recommended the gateway token is local only or if you have to it's in your
private network no external website can actually access it. If you
actively fight the setup and for example use cloud code to set it up without reading, you might be able to get this setup working.
But again, that's not anything what's said on the website.
So to be very honest, yes, there's absolutely uh risk. the the the big risk is the the
uh risk. the the the big risk is the the basically the legal trifecta. You know,
any any agentic system that has access to your data, has access to untrusted content and the
ability to communicate is something that's potentially at risk. That's not
anything special to OpenClaw. It's like
any any agent any power agent system has a problem. The more the more powerful
a problem. The more the more powerful you make it, the more it can do for you, but the more you also have to understand what it does. So this is like the the
main issue but people not talk about this. Yeah.
And then also um some part about maintaining.
So the problem is like if you get all those security advisories, you know that most of them are created with agents, but you still have to use
your brain to actually read it because we're not at the point where you can fully trust or I'm not at the point where I I can just fully trust that the agent will figure it out. So it is a huge burden on on time and you never
know. I mean sometimes you can you can
know. I mean sometimes you can you can often guess you know anytime the reput is too nice or like someone apologizes that's very likely AI because usually
people in security don't apologize. Um
but it is a huge problem and it's something that I see more and more open source projects complaining about or like breaking. Um,
like breaking. Um, some are very public about it like ffmpeg.
Usually you get the report. It's very
rare that you actually get a report and a fix. If you get a report and a fix,
a fix. If you get a report and a fix, it's usually a very bad fix. If you rush it, as I sometimes did in the beginning because I was overload, you will very
certainly break your product.
Yeah. So this is something that's just very difficult to pull up only with volunteers. So we so
volunteers. So we so what are we working on?
Number one is I people say like open AI bought open claw that's not the truth. they might bought my soul.md
my soul.md um but they very much understand that in order for what the world needs is like more people that play with AI to like
understand what AI can do to both understand the risk and also the possibilities they understand that if you or like someone who never played
with never used AI suddenly is at home and uses openclaw they'll come to work and they will ask why don't we have AI at work so they very much understand that like supporting this project is
very useful and in order for that project to be successful cannot be under one company. Therefore, I'm kind of
one company. Therefore, I'm kind of building Switzerland with the open glove foundation and I have Dave was helping me with it. Um, it's almost done. The
last thing that's keeping us going is like the American bank system which is a little bit slow and very confused when you're not American.
Um, it's inspired by what Ghosti did.
And this will actually then help us to hire full-time people to both keep up the pace, improve the quality, and free up some of my time that I can work on on
cool stuff again.
And that's my little update on State of the Claw. I'll be around later for like
the Claw. I'll be around later for like a Q&A. Thank you for listening.
a Q&A. Thank you for listening.
Okay, great. Thank you for the whoop. Love the
great. Thank you for the whoop. Love the
whoop. Um, so excellent. Okay, you've
chosen the claw uh track to get started on for our our breakouts and uh uh it's going to be great. I think it's going to be it's going to be a good session. Um
we are going to be hearing about a bunch of different things uh related to uh openclaw and just personal AI assistance in general. There's some open claw
in general. There's some open claw contributors, openclaw maintainers, uh um uh open claw competitors, uh and open claw creators, uh going to be here on
the stage. Um we're actually going to uh
the stage. Um we're actually going to uh be taking this through until the lunch break. Um oh, there we go. We can see up
break. Um oh, there we go. We can see up there. So, it's about an hour and a half
there. So, it's about an hour and a half of uh of sessions, slightly shorter sessions than uh than earlier, I think.
Um but we're going to be starting with uh an AMA. came in. You saw Peter earlier on, but you're going to get a chance to ask questions and there's going to be a bit of a conversation uh with Peter and Swix. So, I think to get
us started, I will simply invite Swix up who will kick things off. So, uh please welcome him to the stage. Swix, come on up. Swix.
up. Swix.
All right.
Actually, you can just go together.
You can come out together. There's no
secret. Peter, welcome. Everybody there
is Okay, so the deal for this is meant to be an AMA. Uh the the main idea is that I've run six of these AI engineers and whenever we have some big maintainer,
big VIP, we only give them a talk, but actually you guys have questions that you want to ask. Uh so uh we wanted to sort of create that opportunity. So you
can you can submit there. I'm going to moderate uh and and all that. Uh the
spicy one I'm just going to start off with. Pete just quote uh quote tweeted
with. Pete just quote uh quote tweeted uh me and saying send all your questions about closed claw right uh I think uh people have a lot of
questions about um the future of openclaw at openai uh and uh I wanted to give you the space what what is the what are people saying about closed claw and then what is your response
I didn't even think about it was like it came up when when I decided to go to to openi And
I think I think people have a point that open air wasn't always amazing with open source. And I I think a lot changed like Codex is open source now. They released Symfony which is a
now. They released Symfony which is a really cool orchestration layer. So like
like they're really leaning in and understanding open source now. They
understand that open cloud needs to stay open work with any model be it be it one of the the big companies or being a
local model um everybody in the industry wins if more people spend time with AI you know if if I if I think AI is
something scary and then suddenly I I I play with open claw and suddenly it's like fun and weird and then I come to work and there's no like I don't have AI tools at work. I'm going to get to my
boss and say why the f do we not have AI at work and and then like those companies would probably not run open claw but we want something that's like
hosted and managed and and then somebody can can make a sale. So they they're like very much on board. They provide me with resources. Um, actually it's me
with resources. Um, actually it's me like I could get a lot more people from OpenAI to help with the project, but that would just make a picture that they
could have taken over the project and I don't want that. So I I I brought in people from Nvidia, we have someone from Microsoft, from Telegram, someone from Salesforce of all the companies. So So
shout out actually there's cool people at Slack. So we have someone that
at Slack. So we have someone that maintains the Slack plugin. Now I
brought Tensent on board, Bite Dance. We
talked to Alibaba, Miniax, Kimi, like all the all the model providers. They're
like very much on board. Um, Nvidia has been immensely helpful. They
I think I one of the coolest companies in terms of here's some engineers who actually like just hire agency and just do things.
Yeah. Uh and now that I have all the other companies, I'm also bringing a few people in from OpenAI to to help maintain the project because it's I mean software is just like changing that the
the pace at which this project operates is is insane. You kind of like you need an army. Um and I'm working on that.
an army. Um and I'm working on that.
You have an army. Uh and but but you know even the contributor chart that you showed uh shows that it's hard to get quality contributors to stick around.
people keep hiring your maintainers and then you have to find new ones. Um so
there's a lot of questions about local models and open models. Uh you know like not every part of the stack is open.
There's many models where you don't have access to the models and and you know there's sort of weird restrictions. Um
how important is open and local models to the future openclaw? I mean part of part of what what motivated me to build open claw is you see all these large
companies and then they have connectors to my Gmail and then my my email is hosted somewhere then this company has full access to my email and then I can get a little bit down there like it's
much more exciting to me if I have all my data actually under my control and I and like a little bit of it goes up there if I need the top tier token.
Yeah. and like a second kind of hierarchy of uh fallback models.
Yeah, you want to I mean I'm I'm European at heart. You want to own your data, you know. So so so and nobody built it. So for me that was very
built it. So for me that was very attractive and also the the fact that you know if if you're a startup you want to connect to Gmail, it takes like half
a year and it's like a very very difficult process. But if I'm a consumer
difficult process. But if I'm a consumer my clanker can click on any website and it happily clicks on I'm not a bot. If
you have to give me the data somehow, if you can if you give me the data, my my agent is able to get the data. So you
can work around a lot of those those silos those big companies are building and ultimately you can do much cooler automation use cases that large companies can never do.
So it's it's like it's a little bit the the hacker way.
Yeah. And um any indications from the open team on GBTOSS? Is that continu continuing to be a stream of work that
uh will be aligned with open claw or or is that like separate?
I'm not I'm not in a position to give yeah give you insights on that just that um part of what opencloud triggered is that like more people in the company are
getting excited about open source. Um,
and I I love that that OpenAI is moving more into the open direction. Again, if
you compare it to some other top tier labs that start with an A, uh, that very much will sue you if you if you leak any
of their source um, or block you if you are too successful. I I I think Open is on a good direction.
Yeah. Okay. I want to highlight this question. Um, people love hearing about
question. Um, people love hearing about your coding workflow. I think right by now your idea of um uh the prompt request rather than the pull request is
is very well socialized and also you've been shocking people with just how you're spending tokens at OpenAI.
Uh so basically uh the people want to know how you ship and what do you do about agent waiting times like why is you know you're spinning out so many agents. I
agents. I I know like I I never imagined that this one picture of me would blow up so much.
Yeah. actually
uh give give some numbers just just to align people. I I think and there's
align people. I I think and there's times where I was running almost 10 sessions at the same time especially
when I used codeex with 50 51 it was quite slow I think now I have to say we it's still weird we made improvements they both make it faster and then
there's also fast mode so by now my typical workflow is maybe half of that maybe five six windows instead of double just because each loop is faster and like the
area of work I sync in workers is pretty much the same. So I I don't have to use split screen so much anymore and I think we're going to move into a future where
um token will be will be faster and faster.
So at some point like this is not natural that you work on on six things at the same time. Um
but it's basically a workaround until until faster. Yeah. Uh, one of my, uh,
until faster. Yeah. Uh, one of my, uh, interesting things of putting you next to Ryan was to see how the two of you kind of approach uh, token maxing.
Basically, I'm curious what you think about the the complete dark factory approach, right? That uh, you don't even
approach, right? That uh, you don't even review code that goes in.
I think that's more and more doable.
But also, you know, when I when I dark factory in a way also means I come up with everything I want to build in the beginning and I just don't think you can build good software in that way.
Like the way to the mountain is usually never a straight line. It is it is it is very curved. Sometimes you go a little bit
curved. Sometimes you go a little bit off track and then you you see something new that inspires you. You find like shortcuts. Um
shortcuts. Um once you're at the top you you you can find the optimal path but you never walk like this. So at the same time you will
like this. So at the same time you will the first idea that you have about your project is very unlikely going to be the final project. But if I if I suddenly
final project. But if I if I suddenly use the waterfall model again that will be the final project. For me that doesn't work for me. Like I I build steps I play with it. I see how it
feels. I get new ideas, my prompts
feels. I get new ideas, my prompts change. So to me, it's a very iterative
change. So to me, it's a very iterative approach. So I don't see how you could
approach. So I don't see how you could fully automate that. You can definitely build pipelines for certain things.
Yeah.
But even even for PRs, you don't just want to build a pipeline that just merges PRs because a lot of them just don't make sense, you know, like people people will pull your product into all
kind of directions. But if you automate that, the AI will very unlikely know what's the right direction. You can
guide it. I have like a vision document that I tried some of that but the bottleneck is still sinking and like
having taste.
Yeah, taste is very important. Uh how do you define taste? This is something that in my conversations with people everyone understands taste is the moat but nobody agrees on what taste good taste is. So
I'm just curious to hear yours. I think
in this day and age is like the very low level of taste if if it doesn't stink like AI and you know exactly what I mean you know if if something is just so writing style
personality also also also UI by now you've seen so many so much aentic built UI that you immediately know if it's AI yeah if it has the the color border on
the left right yeah I mean for a while it was like the purple gradient but much more so I I feel It's it's like a feeling the same
as you can identify AI written slop right away.
Yeah.
Um that's why I say it's a smell. Like
even if you can pinpoint this, you will know. So So that's probably the lowest
know. So So that's probably the lowest the lowest characterization of taste.
And and then going higher up because now so much of software is is automatable.
There's actually much more time you can spend on like the little details. I
don't know, you know, like like just when you when you when you when you run open claw, you get like a little message uh that sometimes roasts people.
Those are like the delightful details I think that you'll just not get if you prompt in a high level.
Yeah. One one of my favorite tastes of yours is how you you uh really put a lot of work into your soul soulm and you uh you know open source your approach and I don't think people worked on enough soul
until until you came along. So I think that's really interesting. Uh my I I I have a podcast I haven't done yet. I
haven't released yet with uh Mikuel Parakin, who was the CTO of Shopify now, but he was the uh guy leading Bing where Sydney was uh the original sort of
unaligned chatbot that emerged. Uh but I think people really have fun when when your soul your chatbot has personality.
Your clanker uh you know has different obsessions.
Well, it wasn't because it the world changed, right? We had we had chat GBD
changed, right? We had we had chat GBD in 2023 and 4 and it was basically us having AI without understanding what
AI can do. So we rebuilt a Google so you have like a search field and like you get a response and you you don't expect Google to have a personality.
Yeah. But now that we moved more towards agents, like if if I I didn't think about in the beginning WhatsApp relay and I just hooked it up to cloud code.
Um and then I when I was on WhatsApp, I noticed that it doesn't feel quite right. Like even even though like cloud
right. Like even even though like cloud code already has some personality, it didn't really fit how people would write to you on WhatsApp. So that that's how my whole iteration started was like uh
this again it's about taste, right? It
doesn't feel quite right. It's like too wordy. It uses too many dots. It it it
wordy. It uses too many dots. It it it my friends text different. And then
that's how I started working. They say,
"No, this isn't like try to write more like a human."
Uh yeah, I I actually run a writing like a lobster.
Uh like a lobster. Yes. Um
uh you know the one of my favorite quotes of yours is uh madness with a touch of sci science fiction. Yeah.
Right. Like that this is how you run um uh AI projects. And I think not all the art projects, but specifically something like OpenClaw would have never
been able, it would not have come out of an American company just because it would have been killed in legal long before it would have been released because it just has some problems that we haven't really solved as an industry yet.
Yeah.
But now we have some mitigations and it's getting better. The models are getting a lot better. But I don't see how any of the big labs could have
released that. You know, it would be too
released that. You know, it would be too much push back. Oh, and like not enough market proof that this is what people want.
Yeah.
So like it had to be done by someone like outside. Yeah. That that that
outside. Yeah. That that that sitting like literally like when I when I built it in the very beginning, I was like, "Oh, what's the worst that can happen?"
like it could exfiltrate my token, my emails. Yeah, nothing is nothing
my emails. Yeah, nothing is nothing nothing's in there that would like completely kill me. You could like upload some of my pictures. I was like, yeah, I guess the worst are already
online if you use Grinder. Um,
so it was like it was like, okay, I can live with that risk. It will
be uncomfortable, but it's like it's manageable.
Yeah.
Uh, if your company is a different it requires a little different approach.
Yeah. By the way, uh his Instagram account, good follow under underfollowed.
It's also it's also has some good stuff.
Um okay. Uh you were talking about WhatsApp, talking about Telegram. A lot
of these text apps. Um uh text apps are good. People are also looking for like
good. People are also looking for like the next form factor. People want like the maybe the the glasses, the earbuds.
What What is your sort of wish list in terms of having agents in your life?
I started on that actually already, but then I was just getting bogged down by all the people using it and just like
the daily grind.
But if you're at home, I want to be in any room and you know at Star Trek when you can when you say computer I I I want to like talk to my agent
wherever I am and it should just be able to like respond to me. It should know where I am. I have like little iPads in every room and and my agent can use the canvas feature and project stuff on
those iPads. So like if I ask a question
those iPads. So like if I ask a question that that is like easier to be to be answered by also showing me something like it could use like the nearest display because it's aware of where I
am. So the phone is just a very
am. So the phone is just a very convenient input point but I kind of want to like talk to it from anywhere.
Yeah.
Like yeah if I'm around and I have glasses I should just like be able to like listen in and like project something on me.
Um but just ubicular follow you I think yeah once we have really smart home. Yeah,
like agents on your phone, but really you want ubiquitous agents and then you want maybe you will have your your your
uppercase open claw your private agent at work. You might have your I don't
at work. You might have your I don't know lowerase openi claw and then that claw should be able to like talk to
your personal claw uh in a way that both your company and you are comfortable with. So that's kind of like the future
with. So that's kind of like the future where we need to work out.
Yeah. Uh one of uh I just did a podcast with Maran Dre who's a huge fan uh and and also uh have conversations with Andre Karpathy. Both of these guys are
Andre Karpathy. Both of these guys are running OpenCloud to run their house.
And I think OpenClaw for homes is like a kind of underrated, but like people are really discovering it. And my funniest sort of irony is that is it's only possible because the internet of
means that most smart devices are terrible in security, which means Open Core can run them.
Oh, it's going to be able to work so much better in in a few months when the models are getting really bad.
Yeah, they're very good. Um, okay. One
security question. uh about prompt injection. How do you want to solve
injection. How do you want to solve prompt injection or what what uh ways in which uh have you been thinking about the prompt injection problem?
Probably not enough yet. On the other hand, like the the the front end models are really quite good at detecting all the
all the cases where like just stuff randomly comes in from a website or an email is usually not a problem anymore.
You mark as untrusted content, very hard to excfiltrate you from that. If if I have unlimited access to your claw and can bombard it with stuff, then there's
still a chance.
Then then there's still a chance. But
like for one of things, it's no longer the biggest problem. If
you use that's also why why you know that this is probably the angle where like some people say, "Oh, Peter doesn't like local models." But then I see like people running like a 20 uh billion
parameter model that just does whatever you tell it and and it's not trained to have any defenses at all. That's still
problematic. If you run that and then you use a web browser or email um would worry me. That's why that's why OpenClow
worry me. That's why that's why OpenClow warns you if you use a small model. And
I know people spin the whole thing like we hate model. I I love I love I love that it we support everything, but like you have to steer
the regular user a little bit into a direction to make it harder for them to shoot themselves in the foot.
Um yeah, there there is some ideas for problem injection. It's
problem injection. It's just a little bit away. I haven't
announced that.
I think Simon Willis has been working a lot on on this. is I mean he coined the term prompt injection and the sort of dual LLM approach seems smart uh and I'm I'm not smart enough to figure out all
the ways that which it can be attacked like at at some point trust just has to be a thing right um and uh and I pro something interesting I found out from talking with Vincent who's speaking next
is that you guys had to implement the same trust system that Toby Luca had to implement which is uh you build reputation over time and things with more trust uh gets more privileged
access, right? And I think that that
access, right? And I think that that makes sense.
That's part of the story.
Yeah. Yeah. Yeah. Um okay, so uh some more broader questions. What cool
projects would you like to work on once you have more free time?
I mean, I wanted to work on dreaming and know like my maintenance worked on dreaming while I I'm there like while you were dreaming.
Uh so shift it, right?
Yes. What what is dreaming? Uh it's like a way to reconcile memories and like kind of create a little bit like like a dream log go through like your session logs. Um
logs. Um we we found out from the enthropic source code leak that they also working on dreaming, right?
Oh yeah. Yeah. I mean there's I'm pretty sure there's like more companies working on that. But think a little bit like how do we learn as humans? You you experience a lot of
humans? You you experience a lot of things during the day and then you sleep and and in sleep your your brain does like a garbage collect
converts some me some local locally stored memories into long-term storage and like drops others and that that's similar ideas that I think could also be very useful for
agents. Um and then like what we shipped
agents. Um and then like what we shipped on dreaming is like the first little step in that direction.
Yeah. It's related to the wiki uh thing that Andre has been talking about where you sort of collect everything into a wiki is is more memory but like everything kind of blends a little bit
together. Um that the beauty the beauty
together. Um that the beauty the beauty of open claw is that we can just try stuff you know like like everything what we worked on for the last months or so is that
in the beginning it was a big spaghetti codebased mess and now like everything everything is an extension a plug-in. So
you can replace memory, you can add the wicki, you can add dreaming, you can add I don't know your your your whatever crazy idea you have and just make it your own. You don't have to send
your own. You don't have to send everything to a pull request because we're still completely overloaded on those. But it's it's more like Linux
those. But it's it's more like Linux where you just can install your own parts.
Yeah. Yeah. And uh you are building what a lot of people think uh is the most consequential open source since Linux which I don't know how do you deal with that? How do you deal with the the the
that? How do you deal with the the the fame what is a day in your life uh as as the BDFL effectively of something like this?
What's my Well, there's still a lot of coding. There's also a lot of
coding. There's also a lot of by the way in in between sessions he was coding back there.
Yeah. They get tokenized. You have to like something has to be right.
You have to push the agents, right?
Yeah. Um
where it shifted a little bit now it's a lot more a lot more talking and steering people in the right direction like because there's a lot of things that we already learned at Open Claw. So
like part of my role at OpenI is like to like help them not make the same mistakes again. Um
mistakes again. Um and then and then open claw is like try out new things that seem exciting and some might work and some might not work.
Enable enable companies to like build their own claw without having to fork away but like making everything more more customizable. Um yeah and sometimes
more customizable. Um yeah and sometimes I sleep sometimes you sleep. Okay great.
Uh I think that maybe this is the last good closing questions. Uh, what skills do you want humans and engineers in particular to focus on developing in the age of AI?
Taste was a big one, but I already mentioned that system design is still very important.
Yes, you we talked about this in San Francisco. Yeah,
Francisco. Yeah, if you don't think about that, you will eventually swipe yourself into a corner, right? Just by defining the boundaries
right? Just by defining the boundaries like the funny thing is like everything is in the clanker but you still need to ask the right questions otherwise
that makes the difference of like good code that comes out or like really bad code that comes out and that's still where like all the knowledge you have like how you build software you can
apply to steer the agent into into something that is not slop.
Yeah. And then I think I think a skill that is becoming more and more important is saying no.
And and and that's something I had to learn as well because even the wildest idea is just just a prompt away.
And usually this one idea is never the problem but like this idea and this idea and this idea and this idea and then how all of that fits together that's the problem.
Yes. So like
I think we're still bottlenecked on syncing and about like big picture syncing because imagine the world from your clanker like you're being thrown into a code base. You might have an
outdated agent.md file, but you
outdated agent.md file, but you basically don't know what DF this is and you like then like you tell me, hey, add user profiles and you like somehow add
user profiles and connect it to the two things you see, but you didn't see the whole system, right? And then that's where a lot of those localized solutions comes where like your project has like
vS and and it's our job to like help the agent do its best work by like providing them with like hints. Hey, you want to consider this? You want to look there?
consider this? You want to look there?
How would this interplay with this? And
then and then ultimately you get like a much a system that actually is maintainable.
Yeah. Um well, thank you for maintaining one of the most important software of all time and thank you for spending time with us.
Thanks for having me.
Hopefully you stick around and answer questions. Thank you.
questions. Thank you.
All right.
Loading video analysis...