I've spent 5 BILLION tokens perfecting OpenClaw...
By Matthew Berman
Summary
## Key takeaways

- **OpenClaw sponsorship email scorer**: OpenClaw identifies sponsorship emails using a sophisticated rubric scoring fit, clarity, budget seriousness, company trust, and close likelihood on a low/medium/high/exceptional scale; exceptional ones escalate to the team, high ones escalate non-urgently, medium gets qualification questions, low gets a polite decline, and spam is ignored. [01:17], [03:11]
- **Multi-model prompt stacks**: Maintain dual prompt stacks optimized for Claude Opus 4.6 and GPT 5.2 with model-specific best practices (e.g., no all-caps for Opus, but welcome for GPT); a nightly sync review detects drift and keeps core information identical across stacks. [09:10], [11:13]
- **Three-layer prompt injection defense**: First, a deterministic sanitizer scans for "ignore previous instructions"; then a frontier scanner checks the quarantined email in a sandbox with the best model; elevated risk markers score threats, with outbound redaction for secrets and PII. [07:26], [23:12]
- **CRM ties business context together**: OpenClaw scans Gmail, calendar, and Slack for contacts, does proactive company research, stores it in a SQL-plus-vector database for natural language queries, references the knowledge base and past conversations for sponsor emails, and detects HubSpot stage changes. [15:10], [16:15]
- **Agent SDK bypasses OAuth ban**: An Anthropic employee confirmed the Agent SDK works with a Claude subscription post-OAuth ban; route all calls through the shared Anthropic Agent SDK (JS) with auto-retry, logging, and prompt caching; no auth issues since switching. [32:15], [32:42]
- **Nightly councils self-improve**: A nightly platform council checks cron health, code quality, and prompt quality; a security council scans for attack vectors; an innovation scout searches the web for new OpenClaw use cases, like team calendar fusion plus travel intel. [34:58], [35:19]
Topics Covered
- AI Agent Becomes Full-Time Employee
- Rubric Scores Sponsorships Automatically
- Dual Prompts Sync Across Models
- Three-Layer Prompt Injection Defense
- Log Everything Enables Self-Healing
Full Transcript
I've used OpenClaw every day, all day for the past month, and I've gotten extremely good at it. I've released a number of OpenClaw use case videos, but in this one, I am taking it to another level. OpenClaw is now a full-time employee on my team. So let's get into it. So I get a lot of sponsorship requests: companies who are emailing me asking me to sponsor my videos, which is fantastic, but I do get a lot of them. So I gave my OpenClaw its own identity with a first name and last name, its own email address, basically an entire workspace account, so it looks completely legit. It is now basically a full-time employee for me.
I have a public-facing sponsorship email address that everybody can see. It's a group email address, and I have added OpenClaw's email to it. So anybody who's emailing that public-facing email address, those emails will now get routed to my OpenClaw.
And what I do with it is absolutely wild. So here's an example email. This is not an actual sponsor email; I didn't want to share that publicly. This is just one that I created and sent to my OpenClaw's email address. "Hi, I'm Sarah, head of partnerships at NovaBridge. We build workflow automation tools for teams." Okay. Then we sign off. NovaBridge.io is not a real email address. And so what happened? Well, as you can see right up here, my OpenClaw identified that it is a sponsorship email and labeled it as such, and it actually used a very sophisticated rubric to score the email. And so it scores it low, medium, high, and exceptional.
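To make the tiered scoring concrete, here is a minimal sketch of how a weighted rubric like this might work in code. The dimension names and the 80-or-higher exceptional threshold come from the video; the weights, the lower tier thresholds, and the function names are illustrative assumptions, not the actual system:

```python
# Hypothetical weighted rubric scorer. Weights and the low/medium/high
# cutoffs are illustrative assumptions; only the 80+ "exceptional"
# threshold is mentioned in the video.
WEIGHTS = {
    "fit": 0.25,
    "clarity": 0.15,
    "budget_seriousness": 0.25,
    "company_trust": 0.20,
    "close_likelihood": 0.15,
}

def score_email(dimension_scores: dict) -> int:
    """Combine per-dimension scores (0-100) into one weighted score."""
    return round(sum(WEIGHTS[d] * dimension_scores[d] for d in WEIGHTS))

def tier_for(score: int) -> str:
    """Map a numeric score to an action tier."""
    if score >= 80:
        return "exceptional"   # escalate to the team, no automated actions
    if score >= 60:
        return "high"          # escalate to the team
    if score >= 40:
        return "medium"        # reply with qualification questions
    return "low"               # politely decline
```

The lower thresholds would be tuned over time with feedback, which is exactly the rubric-iteration loop described in the video.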
If it's exceptional, which is I believe 80 or higher, it doesn't do anything itself. It simply escalates it to my team. And so it scored this one a 38. And in fact, it wasn't actually sure how to score it, because it had some weird signals. Obviously it's coming from my personal email address, so it didn't really understand what to do with that. Also, I forgot to remove my signature from the email, so it signs off as Sarah Chen and then also signs off as Matthew Berman.
And when it doesn't have high confidence about how to score something, something special happens: it pings me in Telegram. So this is what that looks like: low-confidence classification for review. The sender is me. The confidence score is 45, very low.
And it's guessing it should be a 38 score. And so here are the reasons.
Sender uses a public Gmail inbox. Email appears to be sent from Matt's own Gmail.
NovaBridge is an unknown company with no verifiable web presence or social proof. And so
yes, it actually goes out. It looks at the website, checks if it's legit, finds reviews of the company, and looks up the people at the company. It does this entire research in a couple of minutes and then applies the score. So that's what we see here: claims a company email and signature, but sent from Gmail; no budget or deliverables; Series A claim is unverifiable. And so what I can do from here is just reply back saying, "approve, you got it right," or I can give it feedback about how to score the email. And I built the rubric over a few days.
It is not plug and play. I would let it assign a score, figure out how I felt about the score, and then give it feedback about that rubric. So
here's a quick overview of what that rubric looks like. We have five main dimensions: fit, clarity, budget seriousness, company trust, and close likelihood. They are all weighted with a certain score, and when they are scored, we can take different actions. So if it is an exceptional company, it escalates to our team and notifies us in Slack, and doesn't do anything else. No automated actions, just tell us about it. If it's a high sponsor, it escalates to the team, but not as urgent, so we can get to it when we can. For medium, we reply with our qualification questions. For low, we politely decline. And for spam, we just ignore it. After it scores it, it will actually draft a custom email back to this person. And so that's what you're seeing here. So it says, "Hey Matthew," because I signed off as Matthew accidentally. "Thanks for reaching out. Check out our sponsorship options here. Let us know if you have any questions at all. Best," and then
its name. Then I simply come in here when I'm ready, I hit send, and it does everything else for me. It's so easy. And if you want to recreate this yourself, you absolutely can. Here's the prompt for it: build a sponsor inbox pipeline, multi-account email monitoring, per-account config JSON, cron every 10 minutes, GOG CLI for Gmail access, lazy backfill, fetch historical threads for new sender domains, quarantine and security. And again, I'm going to get to my overall security best practices in a moment. Then we score it with an editable rubric. This means I am constantly giving it feedback, and it is constantly getting better at scoring inbound emails. We apply Gmail labels. We have stage tracking both locally and in HubSpot. We have context-aware reply drafting, so not just sending a template email. Of course, we use one of the best models for this, Opus 4.6, and we also use the humanizer skill. We do not want it to smell like AI writing. Then we do sender research. We actually go out, we look up who the company is, who the sponsor is. We find out if it's relevant, we find out if it's legitimate. It does that all for me and then pulls it into our CRM. And of course, we have escalation at the very end. So I'm going to drop all the prompts down below in the description. Feel free to get them. And as I mentioned, my entire team uses HubSpot to track our sales, and OpenClaw has access to HubSpot and can move deals around as it sees fit. So it'll automatically detect, based on our email conversations, when a deal has moved stages, let's say from qualified to negotiations. It will send me a message about it, update the team, and then move the deal in HubSpot.
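Since the stage-sync step boils down to comparing two records, the drift check can be sketched in a few lines. The function name and data shapes are my own assumptions for illustration; a real version would pull deal stages from the HubSpot API and post the mismatches to the team:

```python
# Sketch of stage-drift detection between a local record and HubSpot.
# Both inputs are plain {deal_id: stage_name} dicts — hypothetical
# stand-ins for the local database and a HubSpot API response.
def detect_stage_drift(local_deals: dict, hubspot_deals: dict) -> list:
    """Return (deal_id, local_stage, hubspot_stage) for every mismatch."""
    drift = []
    for deal_id, local_stage in local_deals.items():
        remote_stage = hubspot_deals.get(deal_id)
        if remote_stage is not None and remote_stage != local_stage:
            drift.append((deal_id, local_stage, remote_stage))
    return drift
```

For example, a deal recorded locally as "qualified" but sitting in "negotiations" in HubSpot would show up in the drift list and trigger a notification.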
All of this happens automatically, and HubSpot has been phenomenal. And they're also the partner of this video. So I know a lot of you want to recreate a lot of what I've done. And not only that, you probably want to come up with use cases that are very specific to your life or your business. A lot of you are entrepreneurs and tinkerers trying to figure out how to do that. And as AI gets better and better, you need to stay ahead. So if you want to start actually building with AI, not just seeing what other people are doing, I suggest you check out this free guide, 20 AI apps that you can vibe code in a weekend. This is the guide for turning scrappy AI ideas into real working prototypes fast. It breaks down practical app concepts, how to scope them correctly, and how to move from idea to MVP very quickly. And all of this applies to OpenClaw. OpenClaw is basically just vibe coding. Personally, I found the section about business and productivity AI apps especially relevant to OpenClaw, because they show you concrete examples of real-world automations that you can use right now, and they walk through how to go from a blank page to a fully functioning app. This guide was made by HubSpot; they're sponsoring this video. Huge shout out to HubSpot, they've been a fantastic partner. Go download that ebook because it helps us out and allows us to make more cool videos like this. So
big shout out to them. All right, so finally, you have the prompt, you have the explanation of what it does. Let me show you the workflow from a high level. So first we ingest from three different email accounts, one of which is active.
That is the one that I assigned to my OpenClaw agent. Then we refresh every 10 minutes. We quarantine and frontier scan it. What does that mean? I'm gonna go over more about the security later, but let me just briefly tell you what that means. The first thing it does is scan the email with deterministic code to figure out if there's any prompt injections or SQL injections. Obviously it's not perfect, but that is just the first layer. Then it downloads the email and we quarantine it: we put it in its own isolated place, so it doesn't have access to anything else going on. Then we do something called Frontier Scan. We take the best possible model, after we've already stripped the email of potentially malicious prompt injections, and we have it do another scan, again in quarantine. Once all of that is done and it has high confidence that there is nothing malicious in that email, then we score and classify it. It reads the email, looks at previous emails in our database from that same person, looks at the rubric, puts it all together, and comes up with a single score. Then if necessary, we update the HubSpot stage and we sync it. We look for drift detection. So let's say the deal in HubSpot moved, but we don't know about it. We look, and we'll notify our team if we have a different record of the stage than HubSpot does. Then we apply Gmail labels, so that is a score or a stage. Then we look if it's high signal; if so, we escalate to Telegram. We store and embed the email locally. Then we write a draft that is context-aware. It is so useful. And I did this progressively. I didn't create this in one go. I've slowly given it more and more authority, more and more permission to be automated from end to end. And there's still so much more to do. I actually have a vision where my OpenClaw can handle the sales pipeline all the way up until the point that a sponsor wants to get on a call with us, or I'm ready to make a video about them. All
right, next, I want to talk about a bunch of best practices. Okay. So the
first thing I want to tell you about is multiple prompt stacks. This is actually more important than probably most of you realize, and something that I've not seen many people talk about. It is very complex to manage, but it is critical, so I highly recommend you do it. The problem is that when you're using multiple models, or you switch models, let's say you're using Opus 4.6 and your auth token gets banned and you want to move to GPT 5.2, those different models have different prompting standards for how you should prompt them. So I actually downloaded Claude's best practices, especially for Opus 4.6, which is quite different from previous versions, and I have documented everything locally. So here it is: core principles. For example, Opus does not like you using all caps, like "CRITICAL." Just tell it what to do. It really is over-indexing on following your instructions, so you don't need to yell at it or do anything like that. So I downloaded this whole thing, and anytime I write a prompt, I follow this guide. But GPT 5.2 is different. It uses very different prompting techniques. So of course I downloaded the GPT 5.2 prompting guide, and it is completely different; it has all caps all over the guide. So it's literally the opposite: all caps is very welcome. But that causes some problems. How do you manage two sets of prompts for everything? Well, I basically just do it.
I told my OpenClaw that I want to have two versions of the prompts. I have what's in the root. So that's what you're seeing here. This is optimized for Claude; that is still my go-to model, and I will explain how I got around the OAuth ban issue. So I have all of my markdown files, not just these, all of them optimized for Claude. Then I have a separate folder of Codex-optimized prompts, and this is all the same files. And every single night I have a nightly sync review, and it goes through all of the markdowns. It looks at both of the prompting best practices and makes sure not only that each of them independently is using the best practices, but that they are not seeing drift, that they don't say different things. The core information in all of these markdown files stays the same. And if there is drift, I get a Telegram alert in the morning, and all I have to do is say, "fix it." And it does, and that is how it all stays in sync. So the longest these files would ever be out of sync is about 24 hours. So I highly recommend you do this. You will get such better results just by prompting it better. This is prompt engineering 101. So here's the prompt for that: set up dual prompt stacks, root MD files, Claude-optimized, natural language, explain the why behind the rules. And you can also say to reference this specific guide that you've downloaded. Then Codex prompts, or whatever other models you want; basically have all of the same files across them. Give it some of those best practices, but again, probably just point to the prompt guide that you downloaded. Both stacks must contain identical operational facts, nightly sync review, swap commands for switching the active model. That is the last important part. I have instructions about exactly how to swap the model. So if I want to swap the models, it swaps the name everywhere. It automatically takes whatever folder the secondary model is in and promotes that to being in the root. It saves the other one in a folder, and it just has all the instructions. So I don't really have to think about how to swap the models. I just say, "swap the model," and it does it. All right.
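The swap itself is mostly file shuffling: archive the active stack, promote the secondary one. Here is a rough sketch of that folder-promotion step. The directory names ("codex-prompts", "claude-prompts") and the function are my own placeholders, not necessarily how the real setup is laid out:

```python
import shutil
from pathlib import Path

# Sketch of the "swap the model" folder promotion described above.
# Folder names are illustrative assumptions.
def swap_prompt_stack(root: Path, incoming: str, outgoing: str) -> None:
    """Archive the root *.md stack into `outgoing/`, promote `incoming/` to root."""
    outgoing_dir = root / outgoing
    outgoing_dir.mkdir(exist_ok=True)
    for md in list(root.glob("*.md")):          # archive the active stack
        shutil.move(str(md), outgoing_dir / md.name)
    for md in list((root / incoming).glob("*.md")):  # promote the secondary stack
        shutil.move(str(md), root / md.name)
```

After running it, the Codex-optimized files sit in the root as the active stack, and the Claude-optimized files are archived in their own folder, ready to be promoted back.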
This is going to be more of a basic section. I want to tell you about what all the files in OpenClaw do and how you should be using them. So for agents.md, that is operational rules, security, safety, task execution, message patterns, cron standards, error reporting. And there's a reason I am telling you about all this: you should create a document so that your OpenClaw always knows where to put things, and you're not going to see prompt drift. You don't want certain information ending up in the wrong prompt file. And so we have both what belongs here and what doesn't belong here. Then we have the soul file. That's the personal philosophy, who the agent is. We have identity: name, creature type, emoji, five to 10 lines max. This is all from the best practices of OpenClaw's documents. We have user.md, basically telling OpenClaw about myself. We have tools; these are environment-specific values, so channel IDs, Slack IDs, Asana project IDs, et cetera. We have the heartbeat, which is a periodic cron that runs automatically. We have the memory.md file, which should only be loaded with me. It's not going to be shared publicly at all, including with my team. We have the sub-agent policy: how to spin up sub-agents, what model to use, et cetera. And then there are the documents that get loaded only when necessary, not in every single call. So we have the PRD. This is incredibly important. This defines all of the functionality that I have across the entire app, really useful to give your OpenClaw a head start to know where to look for certain pieces of code. We have our use cases, we have our workspace files, security best practices, coding standards, memory files, skill files, and reference files.
So we try to aim for no duplication across files. There should be one place to house every piece of information. And that is defined here. I'm not going to read this prompt. I will drop it down below. Feel free to grab it. All
right, next, I've already talked about this, so I'm just going to talk about it briefly, but if you're not already using Telegram groups with topics, you really should be. It is the best way to optimize your context, to optimize your memory, and to just make it easier on your side. Here's what it looks like for me. I have general, I have a CRM topic, a knowledge base topic, cron updates, self-improvement, daily brief, self-update, earnings, forward, future analysis, food journal, video research. Basically everything that I'm doing with it has its own channel so that it always has its own context. And you don't need to reset the context of each of these channels so frequently, so it'll remember more effectively. Okay. Now I have expanded the CRM functionality, and it is incredible. I talked about it in the last video, but now it is just such a killer feature, and it directly ties to all of the work I did to give OpenClaw its own email address and to behave as a full employee. So here's what the CRM system looks like.
Currently, it scans Gmail for me and looks for all the important contacts that I'm talking to. It scans my calendar. It does contact discovery, so it filters out spam, filters out marketing, filters out event invites, everything, and it classifies it all. It rejects most of them; I think I only have 250 contacts in my database right now, most are just rejected. Once it classifies a contact, it puts it in my CRM database, so now I have a record of not only who that person is, but everything that I've talked to them about. It does proactive research about their company. So if there's a new article or a new piece of news that comes out about that company, it automatically finds it, downloads it, and saves it all in a local database.
And of course, it backs up that database, and all databases. Then from there, and this is really the magic of OpenClaw, and I don't think anybody is using it to this level: when you have all of this information, when you have a CRM and you scan your emails and calendar and Slack messages and a knowledge base, once you have all of that, OpenClaw will start to make connections that you didn't even think were possible. So for example, if it sees a new email from a potential sponsor, it can reference previous conversations that I've had about similar companies. It can look for any knowledge base articles about that company, and it puts it all together for me automatically. So once it's in the CRM, I can do natural language queries against it: who have I talked to in the last week? Who haven't I talked to in the last four months? Then it can provide automatic follow-ups for me, nudges, summaries. It is incredibly valuable. But like I said, the real value is tying all the different pieces together. The full-time employee sales agent, the CRM, the knowledge base, everything is tied together now. And it is so incredibly smart. It has context of my entire business at all times and really allows me and it to make better decisions. And here's the full prompt: contact discovery pipeline, database, natural language interface, relationship intelligence, daily cron, and email draft system. And for the database, we use the same pattern for everything we store locally: a traditional SQL database with a vector column. So we can do SQL queries, and we can also do natural language queries like you would with any RAG system. All right.
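The SQL-plus-vector pattern can be sketched in a few lines: normal columns for SQL queries, plus an embedding column for semantic search. In this toy version the embedding lives in SQLite as JSON and `fake_embed` is a stand-in character-count "model" (the real setup uses a local Nomic embedding model); the table, function names, and data are illustrative:

```python
import json
import math
import sqlite3

# Toy stand-in for a real embedding model (the video uses local Nomic embeddings).
def fake_embed(text: str) -> list:
    return [text.lower().count(c) for c in "abcdefghij"]  # 10-dim letter counts

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Rows keep ordinary columns for SQL plus a JSON-encoded embedding column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, notes TEXT, embedding TEXT)")
for name, notes in [("Sarah", "sponsorship outreach"), ("Dev", "api integration")]:
    conn.execute("INSERT INTO contacts VALUES (?, ?, ?)",
                 (name, notes, json.dumps(fake_embed(notes))))

def semantic_search(query: str, top_k: int = 1):
    """Rank rows by cosine similarity between query and stored embeddings."""
    qv = fake_embed(query)
    rows = conn.execute("SELECT name, notes, embedding FROM contacts").fetchall()
    scored = [(cosine(qv, json.loads(emb)), name, notes) for name, notes, emb in rows]
    return sorted(scored, reverse=True)[:top_k]
```

The same table still answers plain SQL questions ("who did I add this week?") while `semantic_search` handles the natural-language side, which is the dual-query behavior described above.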
Next is meeting intelligence. I talked about this briefly in the previous video, but it has gotten so much better now. And again, it just ties everything together. Everything gets
stored in the CRM. HubSpot gets updated when necessary. It's so cool. Check this out.
We have a meeting. I use a Fathom note taker in my meeting. So it
is automatically transcribing every single meeting, both internal and external. Then we check the calendar.
And we pull the Fathom API after a meeting ends and we download the transcript.
We match the attendees to the CRM. We extract the insights, and not just insights, we also look for action items. Then we generate the context and embeddings. By the way, I am now doing embeddings completely locally using the Nomic embedding model. Then we decide if there are action items. If none, we're done. If yes, we send all the action items to Telegram for my approval, because I don't want my to-do list getting cluttered up. If I approve, it goes to my to-do list, but now it also goes to HubSpot. It doesn't say that here, but it does. And so for any meeting, say we have an internal sales meeting with three of us, it will automatically know who is responsible for each action item, will associate it to the correct deal in HubSpot, and assign it automatically to the right person. And it does so flawlessly; it is so good. All right, next is the knowledge base. This
is basically where I throw everything that I want saved for later, articles, videos, X posts, anything that I come across that I find interesting, I throw it into Telegram.
So here it is: here's a Gokul Rajaram post, and it's saved. It also gives me a little summary and a little opinion on it. Here's a reply post to that Citrini research article that went viral yesterday. It also gave me some information, found the Citrini article, and linked to it automatically. And every time it does this, it also shares it with my team. So here's what that looks like: it shares it in the AI Trends channel. And it just says, "Matt wants you to see this," because I don't want them to think that I'm not reading these. I absolutely am. So
it shares it there. Then if somebody on my team shares an article, I just comment on it saying, "@Claude, put this in the knowledge base," and it downloads it and does the same thing. And it knows not to cross-post it back, because somebody else shared it. Then of course, tying it all together, whenever I have a contact in the CRM, it looks for any articles about them or their company. It also proactively does that. So every single day, we're looking for new articles about any of the companies we work with and automatically saving them to the knowledge base. Here's
the prompt; you can build it yourself. This is one of the most valuable things I have, not just because of the cross-pollination, but because I can simply query against it. One of the use cases I also use OpenClaw for is coming up with video ideas and actually writing outlines for me. And of course, it references the knowledge base and pulls any relevant articles to include in my video topic. And so here's the architecture of it. Either the Telegram knowledge base topic or a Slack save command does a pre-flight check and fetches the content. And of course, I have standardized the security practices: anytime I'm ingesting any text from the internet, we sanitize it. We put it in a sandbox. We do the frontier scan on it. And if it's not safe, we block it and log the reason. If it is safe, we continue.
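The sanitize-then-scan flow can be sketched like this. The regex patterns are illustrative examples only (the video deliberately doesn't reveal the real list), and `frontier_scan` is a stand-in callable for the sandboxed frontier-model check:

```python
import re

# Layer 1: deterministic sanitizer — cheap pattern checks for obvious
# injection markers. These patterns are illustrative, not the real list.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.I),
    re.compile(r"disregard\s+your\s+system\s+prompt", re.I),
]

def deterministic_scan(text: str) -> bool:
    """Return True if the text trips any known injection pattern."""
    return any(p.search(text) for p in INJECTION_PATTERNS)

def ingest(text: str, frontier_scan) -> dict:
    """Layer 1 first, then a quarantined model scan (passed in as a callable)."""
    if deterministic_scan(text):
        return {"safe": False, "reason": "deterministic sanitizer hit"}
    if not frontier_scan(text):  # stand-in for the sandboxed frontier-model check
        return {"safe": False, "reason": "frontier scan flagged content"}
    return {"safe": True, "reason": None}
```

The cheap deterministic layer runs first so the expensive model scan only sees content that already passed, mirroring the ordering described above.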
We chunk and embed it. We store it in SQLite, cross-post to AI Trends, and then we can do semantic-search querying and get the results plus the sources. It is so awesome. Speaking of the content pipeline, let me briefly talk about that. So anytime I mention that something is a potential video idea, Claude will automatically get to work. It reads the full Slack thread for the context to understand what we were talking about. So for example, if my team and I are discussing a specific AI topic, we'll go back and forth, and then I will simply tag Claude on the thread and say, "potential video idea." Then it reads the full Slack thread context, queries the knowledge base for related content, searches X and Twitter for supplementary discourse; basically it searches the web, searches X, and looks for viral posts about this topic. It creates a structured Asana card; of course it's plugged into Asana, and it creates the card in the video pipeline project. Then it writes an outline for me and gives me reference material for it. It's actually really good. It also comes up with packaging ideas: what's the hook? What's the thumbnail? What's the title? It does all of that, gives me a bunch of suggestions, and then finally posts back to Slack or Telegram letting me know it's done. All right. Let's talk about security. I've already
touched on it, but it's such an important topic, I really wanna go a little bit deeper into it. There are multiple layers of security that I've implemented at this point. So first we have layer one, network gateway hardening. This is all stuff that either OpenClaw has built in directly, or that my security council, which runs every night scanning everything in OpenClaw looking for potential attack vectors, recommended: token-based authentication, never exposed directly to the internet, weekly verification via heartbeat. But we also, as I mentioned, have a nightly security council that I will go over in a moment. We have channel access control. So if it's a DM with me, it can basically tell me any of the information. But if it's in a Slack group channel, it cannot. It redacts information, and it has a very strict policy about what can be shared and what can't. And of course, if it's writing emails for me, it has an even stricter policy. Then
we have a three-layer prompt injection defense. This is probably the thing that I am most concerned about: somebody trying to prompt inject into one of the places where we ingest data from the internet. So we have a deterministic sanitizer first. It is looking for things like "ignore previous instructions" and other things that I'm not going to list here, because I don't want to give away what we're looking for. Then we have something called a frontier scanner. It takes the best frontier model, puts the data from the internet into a sandbox, and has that frontier model scan it. That scanner cannot do anything; the worst that could happen is the frontier model reveals information that it already knows that it's not supposed to say, like how to hack a computer. Then we also have elevated risk markers, and it gives the content a score along the way. Then we also have secret protection: outbound redaction on all message paths. We redact secrets, and we redact PII as well, personally identifiable information. This is all deterministic. We really don't want to take any chances with our sensitive info. We have a pre-commit hook that blocks common key patterns from Git, and we have our file permissions locked down. We have
multiple layers of automated reviews of our security. We have a nightly security council that looks at file permissions, gateway config, secrets, et cetera. We have a security council covering offensive, defensive, data privacy, and operational realism. We have cron health checks. We have system health checks. Then we have data permissions last. We have encrypted databases only, and when we back up those databases, we have passwords, so you cannot get to them even if you know where they are. We have data classification tiers enforced on the per-conversation context.
Of course, we have SSRF prevention, we have SQL injection protection, and all of this can be done with this prompt right here, which I will share. And so here is what that security architecture looks like. I'm not gonna go over it, because that's what we just did, but this is generally what that architecture looks like. All right,
next: cron jobs. Cron jobs are just scheduled tasks, and they're a huge part of what I do with OpenClaw. I have a ton of scheduled jobs, and it's not just what you schedule, it's when. I'm sure a lot of you have limited quota, as I do, in terms of how many tokens you get, so you really want to spread those heavy cron jobs out through the night. For example, at 1 a.m. we do our Instagram analytics collection, at 1:15 the X/Twitter analytics collection, 1:30 is YouTube, 2 a.m. is the CRM, 3:20 the next one, and so on all through the night. This is all the stuff that can run asynchronously; I don't have to keep an eye on it.
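That overnight stagger can be sketched as plain data plus a check. The cron expressions mirror the times mentioned; the helper function is my own illustration:

```javascript
// Heavy jobs staggered through the night so they never pile up in one window.
const NIGHTLY_JOBS = [
  { cron: "0 1 * * *",  job: "instagram-analytics" },
  { cron: "15 1 * * *", job: "x-twitter-analytics" },
  { cron: "30 1 * * *", job: "youtube-analytics" },
  { cron: "0 2 * * *",  job: "crm-sync" },
];

// Convert a "min hour * * *" cron expression to minutes-since-midnight.
function toMinutes(cron) {
  const [min, hour] = cron.split(" ").map(Number);
  return hour * 60 + min;
}

// Verify no two jobs start within `gapMin` minutes of each other.
function isStaggered(jobs, gapMin = 15) {
  const times = jobs.map((j) => toMinutes(j.cron)).sort((a, b) => a - b);
  return times.every((t, i) => i === 0 || t - times[i - 1] >= gapMin);
}
```

A check like `isStaggered` could even run inside a nightly council to catch jobs that drift too close together.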
I do want it updated daily and it runs overnight when I'm not using OpenClaw.
So if I'm heavily using OpenClaw during the day and I run out of my quota, I don't also have cron jobs burning even more of it at the same time; it's all spread out. This matters when you're on a five-hour window, which you are if you're using your Claude subscription. And if you're wondering how I'm still using my Claude subscription even though OAuth got banned by Anthropic, I'll show you how to convert to the Agent SDK in a moment.
So here are all of the different crons that I use, plus a prompt, which I'll drop down below. All right, next is memory. Now here's the thing: I basically haven't touched memory, and it has worked great. Many of you have talked about how bad the memory system in OpenClaw is, how it constantly forgets things, and I think it can actually be solved. I've really never had a problem, and I'm just using the default memory system: no QMD, no external services, just what's built into OpenClaw already, and it works fine for me. The key is really two things.
One, if you're using Telegram group topics, you will instantly have better luck with OpenClaw's memory, because it has to remember less; it's only remembering what's relevant to that thread. And here's a command I suggest you look at frequently: /status. There we can see the OpenClaw version, the model, the number of tokens in and out, and the cache hit rate, so a 100% cache hit, which is great and saves us some money. And here's the important one: the context. I'm at 89%, which is actually quite full, and I might start running into memory issues soon.
So I have two options here: I can increase the rate at which messages expire as part of this Telegram context buildup, or I can simply clear it out. Just keep a close eye on it. If you notice your OpenClaw starts forgetting things, come here and look at how full your context is; if it's full, you know to clear it out. The second thing is constantly pruning the files that get loaded into every call. I have an automated cron that looks at those files for duplicate information, prompt drift, and any other opportunities to trim them down. On average I'm trimming about 10% every other day, but the files are also growing; it's one of those battles you're just going to have to keep fighting. All right, next we have notification batching. What I noticed is that Telegram became incredibly noisy and distracting, and I wanted a way to reduce the noise and the distraction. The problem is, if you just let OpenClaw
notify you about everything going on in the moment, it gets very distracting. So what we do now is batch notifications so they only come through every so often.
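A minimal sketch of such batching: queue messages by priority and flush each queue on its own interval. The tier names, intervals, and queue mechanics here are my own illustration:

```javascript
// Notifications queue by priority; critical goes out immediately, the rest
// flush on a timer as one summarized message. All values illustrative.
const BATCH_INTERVAL_MS = { high: 60 * 60 * 1000, medium: 3 * 60 * 60 * 1000 };
const queues = { high: [], medium: [] };

function notify(priority, message, send) {
  if (priority === "critical") { send([message]); return; }
  queues[priority].push(message); // held until the next flush
}

// Called on each tier's interval; empties the queue into one summary send.
function flush(priority, send) {
  const batch = queues[priority].splice(0);
  if (batch.length > 0) send(batch);
  return batch;
}
```

In practice `flush` would be wired to `setInterval` with `BATCH_INTERVAL_MS`, and `send` would post the summarized batch to Telegram.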
For critical notifications, it delivers immediately; for high importance, it batches hourly. Those are things like CRM updates, council digests, and cron failures. Then
for medium, it batches every three hours: routine updates and non-urgent notifications. It batches everything up and reads it all out to me in Telegram in a really nice summary. So here's what that looked like: all of the Telegram sends get classified, split into three priority batches, and then it notifies me. It also stores everything in the notification database. And by the way, if you're not storing absolutely everything, which I'll get to in a moment with logging, you really should be storing everything. All right, a brand-new use case I started implementing is financial tracking. I export all financial transactions from QuickBooks and import them into the database, and then I can ask it questions about
my business's finances: simple, natural-language queries. What did I spend the most money on? Which sponsors represented the most revenue? By the way, the fact that QuickBooks doesn't do this is absolutely insane to me. So I simply export it as CSV, import it, and then it's all there and I can always query against it.
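A sketch of the idea: parse the CSV export and answer a "biggest spend" question with plain code. The column names are my assumption, not QuickBooks' actual export schema:

```javascript
// Parse a simple CSV export into row objects (no quoted-field handling;
// a real QuickBooks export would want a proper CSV parser).
function parseCsv(text) {
  const [header, ...rows] = text.trim().split("\n").map((l) => l.split(","));
  return rows.map((r) => Object.fromEntries(header.map((h, i) => [h, r[i]])));
}

// "What did I spend the most money on?" as a plain aggregation.
function topSpendCategory(rows) {
  const totals = {};
  for (const { category, amount } of rows) {
    totals[category] = (totals[category] || 0) + Number(amount);
  }
  return Object.entries(totals).sort((a, b) => b[1] - a[1])[0][0];
}
```

Once the rows live in a database, the natural-language layer just translates questions into aggregations like this one.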
And again, we have confidentiality rules right here: it only shares this in DMs or dedicated financial topic channels. Very easy to use. All right, next: LLM usage and cost tracking. Very, very important. You can go crazy with how much you use OpenClaw, like I do, and you want a way to see which models are being used and how often; because I have such a complex tiering system of model usage, I just want to know, and maybe approximate the cost. OpenClaw comes with this natively, but I wrote my own. Here's what it looks like: an LLM call comes in, goes through a singleton-pattern LLM router, and then gets sent to the provider. We log it all in two ways: a JSONL backup, plus a database. We have a usage dashboard, a cost estimator, system health with API failure rates, and gateway usage sync, all of it very easy to query against. So I can say something like, "show me my LLM usage for the last 24 hours," and here's what that looks like.
We have Opus 4.6, 715; we have Opus 4.6 from the Agent SDK since I switched over. You can see a ton of different input tokens, output tokens, and estimated costs. Obviously we're using the subscription across the board, so the estimated cost is kind of wonky. I can even see which parts of the app are using the most tokens: is it cron jobs? Is it coding? It's all tracked in one central place now, and I can query against it. Next, and this is incredibly important: you
want to log everything: every error, every LLM call, every time you hit an external service. Log everything, because it makes it so easy for your OpenClaw to self-heal. Every morning, the first thing I do is say: look at the errors, look at the logs from overnight, and fix any issues. And that's it. It automatically looks at the logs, has full information about what went wrong and why, and then goes and fixes it. And so
I have a singleton-pattern eventLog.js shared across the entire app. It ingests everything, looks for any opportunity to log something, logs it as JSONL, and also stores it in a database. We rotate the logs so we don't end up with a constantly bloated database, and of course I can just query against it. All
right, here's the big one. Anthropic banned using your Claude subscription over OAuth outside of an Anthropic product. But an Anthropic employee said the Agent SDK is still allowed, and it turns out the Agent SDK works perfectly well with OpenClaw. I suspect OpenClaw will add Agent SDK support natively, but for now I don't believe they have it, and it's very easy to convert. So now everything goes through the Agent SDK, and I've not had a single auth problem since. Here's what it looks like: create a shared Anthropic Agent SDK JS module, resolve the OAuth token from where it lives, do a smoke test, wrap all Anthropic calls with auto-retry and logging, and support prompt caching. All of it runs through the
Agent SDK now. Then create an LLM router: not only are we routing to the Agent SDK, but if we want to use the Codex SDK from OpenAI, or any other model, we can do that too. Centralize it all in LLMRouter.js.
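A minimal sketch of what a centralized router buys you: every call goes through one choke point where logging and fallback live. The provider stubs and names here are my own illustration; the real LLMRouter.js presumably wraps the actual SDKs:

```javascript
// One choke point for all LLM traffic: pick a provider, make the call,
// and record usage in one place. Providers are stubs for illustration.
const providers = {
  anthropic: (req) => ({ provider: "anthropic", text: `[claude] ${req.prompt}` }),
  openai: (req) => ({ provider: "openai", text: `[codex] ${req.prompt}` }),
};

const usageLog = []; // stand-in for the JSONL-plus-database logging described earlier

function routeLLM(req) {
  const call = providers[req.provider] || providers.anthropic; // default provider
  const res = call(req);
  usageLog.push({ at: Date.now(), provider: res.provider, model: req.model });
  return res;
}
```

Every feature (crons, coding, chat) calls the router instead of a provider SDK directly, which is exactly what makes a single usage dashboard possible.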
Okay, next, here's something I'm still experimenting with. I have one OpenClaw instance, and it handles everything I do for my personal life and everything I do for my work life. But how do you keep them separated? So far I've been pretty lucky and there's been no leakage across use cases, but it is possible; we're dealing with non-deterministic systems. So we have a lot of rules for it, plus deterministic systems that look for redaction opportunities. We tier everything: confidential,
internal, and restricted. Confidential means DM, only me: it will only give confidential information to me. That covers financial figures, CRM contact details, deal values, daily notes, and personal emails. Internal means my team and my team only, nobody external; group chats are okay. That covers strategic notes, council recommendations, and tool outputs in Asana tasks. And restricted means external only with explicit approval: general knowledge and anything else. We also
define what each of the email accounts is for: this is my personal Gmail, this is my work email, this is OpenClaw's own Gmail account. We define all of it; we say what can go where. And of course we add some deterministic layers to really prevent data leakage. So here's what that looks like: again, confidential; internal, group chat okay; and restricted, external; with the different types of information for each. We look at the context type, decide what should go where, and it has worked quite well so far. But again, it's not going to be perfect unless you use deterministic code, and even then, of course, you're susceptible to bugs. All right, I actually want to go back to logging. After you log everything, you get up in the morning and you say: look at the log, see what happened, and fix
it. And that's really all you need to do. But you also want to save all of those learnings; you don't want to make the same mistakes again and again.
So what we do is keep a learnings.md file, an errors.md file, and a feature-request.md file, and as we go, we tell OpenClaw to store things in those files so we don't make the same mistakes again. We
also have councils that run every night looking for issues. Our platform council looks at cron health, code quality, test coverage, prompt quality, dependencies, storage, skill integrity, config consistency, CRM data integrity, and more. We have our security council. And we have our innovation scout looking for new use cases; this is a cool one. Every single day it looks at everything we're doing, goes out on the web, searches for what other people are doing with OpenClaw, compares notes, and comes up with new ideas. Here's an example: team calendar fusion plus travel intel.
And it tells me all about it, and why it could work. Same thing here: an AI-trends auto-triage, and a meeting follow-through autopilot. So it's always giving me new ideas I can implement. All of this is offloaded to the Cursor Agent CLI, which is what I use, but you can simply use a subagent with whatever model you're using; and of course, if you have an OpenAI subscription, you can use the Codex CLI. All right, next let's talk about cost savings, because I know this can get very expensive. So let
me show you some of the tips and tricks I've used to reduce cost. One: local embeddings. My MacBook Air is more than capable of computing the embeddings itself. Yes, embeddings are usually extremely inexpensive, but you know what's better than inexpensive? Free. So we run nomic-embed-text on device, at zero cost.
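Retrieval over local embeddings is then just vector math; for example, cosine similarity (the embedding call itself is omitted here and assumed to come from the on-device model):

```javascript
// Cosine similarity between two embedding vectors: the core of
// "find the most relevant chunk" once embeddings are computed locally.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```

Rank stored vectors by `cosine` against the query vector and take the top few; no API call, no cost.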
Then we have model tiering. We use a bunch of different models, the right model at the right time. Sonnet 4.6 tends to be my primary model, and we have plenty of quota for it from the Anthropic subscription; it offloads to other models like Opus 4.6 if needed. Then we also spread usage throughout the day, so we're not burning all of our quota in a short window. We do prompt caching, which is already built into everything you do; you don't really have to think about it. We do calendar-aware polling and other kinds of context-aware polling: you don't want to just poll constantly, you want to look for signals that tell you when the best time to poll is. We do notification batching, and we use cheaper, faster models for anything that doesn't require a frontier model. All right, let's talk about backup, because if my computer suddenly caught fire, got bricked, or got stolen, I can easily restore all of this. So here's how we do it:
we automatically discover the database files, encrypt them, upload them to Google Drive, document it, and of course rotate as necessary. We also have a Git sync: every hour it checks for updates to any of the files, auto-commits the changes, pushes them to GitHub, and sends a Telegram alert. And if we ever need to restore, we have a whole markdown file that documents the restoration process.
Download from Drive, decrypt, read the manifest, download the actual code, put it all together, and it's done. Easily done. All right, two last use cases from Jonah on my team. These are ones he uses and loves, and that I've seen a lot of other people do as well. He does a lot of health tracking: he has an Oura Ring, Apple Health, and a Withings scale (which I'm not actually sure what that is). He ingests all of these into a JSONL file, runs Claude analysis on it, and gets a daily summary plus trend flags and coaching. So it's very easy
to set up a very personalized health coach using this. If you have any type of health tracking device, that's what you do: just ingest it and ask Claude to review it and flag any issues. All right, and last, this is a really cool one that you can do with pretty much any device. I'm
not doing it yet, but I'm definitely gonna implement this. He has a Bee Pendant.
It's a little Amazon device you wear on your wrist, though you could use the Pendant product, a Rabbit, your iPhone, anything: you basically record notes all day long. The thing with the Bee Pendant is that it's real time and always on. It basically polls: you take voice notes, you can say, "oh, remind me of this thing later," or it can record a conversation with somebody and remind you of things you talked about. It saves it all in memory, uses Claude Opus 4.6 for search, and it's confidential, DM only. You can query against all of it and get contextual answers. So it's kind of like having your OpenClaw with you at all times, but it's only one-way. What I really want is a two-way synchronous voice conversation with my OpenClaw, which I haven't figured out the best way to do yet; if you have any recommendations, drop them down below. So that is it. That
is everything I've learned so far with over four and a half billion tokens used.
I am using OpenClaw all day and all night. I'm absolutely obsessed; it is really changing the way I work. The number one thing I've done with it is make it a full-time employee on my team, and it just gets better every single day. I'm still testing things, and I will certainly record another video, so stick around for that. If
you enjoyed this video, please consider giving a like and subscribe and I'll see you in the next one.