A Deepdive on my Personal AI Infrastructure (PAI v2.0, December 2025)

By Unsupervised Learning

Summary

## Key takeaways

- **Avoid AI Surprises via Augmentation**: The main reason for building Kai is to get better at everything by avoiding surprises, which means understanding how things work at every level, using AI as an augmentation system. [02:31], [02:43]
- **Five-Level AI Job Impact Model**: AI progresses through five levels: pre-2022 no AI (all human work), 2023-25 chatbots with manual work, current agentic layer (level two) where Kai sits, and future less human-focused levels. [03:31], [04:51]
- **Scaffolding Trumps Model Power**: Scaffolding is more important than the model; choose excellent scaffolding with an older model over the latest model with poor scaffolding, as good scaffolding magnifies models. [08:25], [09:12]
- **Code Before Prompts for Determinism**: Be as deterministic as possible by doing anything possible in code first before using AI prompts, treating Kai more as a tech orchestration system than pure AI. [10:25], [10:58]
- **Self-Upgrading via External Sources**: The upgrade skill pulls latest content from Anthropic blogs, GitHub releases, YouTube, and security research, reviews Kai's documentation, and implements improvements like adding 'use when' keywords in minutes. [21:27], [23:17]
- **Custom Skills for Precise Routing**: Custom skill management adds explicit routing on top of Claude Code, achieving 95-98% accuracy by mapping natural language to workflows and deterministic code tools. [23:46], [24:49]

Topics Covered

AI Jobs Model Levels
Scaffolding Trumps Models
Code Before Prompts
Self-Upgrading AI System

Full Transcript

So today we're going to cover basically two things. First, at a high level, how to think about augmenting yourself with AI, as well as some tactical examples and demos from some really cool stuff Daniel has built. And then the second part is basically open Q&A, so ask Daniel and me anything you want. Just sort of open discussion. A little bit about Daniel.

He's a longtime great friend of mine, so I'm a huge fan of him as a person as well as his work. He runs Unsupervised Learning, which is one of the biggest and best security newsletters, in my opinion. He recently keynoted OWASP Global AppSec USA, which is pretty cool. He was previously a security leader at Apple, Robinhood, and other places, but for the past couple of years he's been focusing on AI: both applying it to security, where he does some consulting work with companies, and using AI for human flourishing, which is what we're going to be focusing on a bit today.

And just a fun anecdote: basically every week, sometime between 10 PM and midnight, I get a text from Daniel that's basically like, "Dude, I just built the sickest thing ever." And then he sends me a screenshot of some cool new dashboard or something he's built. So today we're going to be covering a number of those things that he's been building, because he's constantly tweaking and improving his stack, with a focus on using AI to augment yourself. So that's a little bit of context, but yeah, let's jump right into it. Daniel, I pass it to you.

>> Awesome. Thanks, Clint. Appreciate the intro, and thanks for having me. I appreciate the time here.

>> Appreciate it.

>> So I want to talk about why I built Kai, and some stuff you need to know before you can build a similar system, if you want to do that. Then the core principles and engineering that I put into it, the design concepts and everything, many of which are actually different from Claude Code natively. The Kai system is built on Claude Code, but it's designed to be agnostic and not just have the base stuff that's in Claude Code. I've extended it quite a bit.

We're going to do a deep dive on an actual skill. We're going to do a quick demo, and I'm going to show you how to get started on your own. We've got an FAQ, which you could take a picture of; I don't think I'm going to read through that one. And then, as Clint said, we have time for discussion and Q&A afterwards. Like Clint also said, I want to apologize beforehand, because this could be an hour for each section. In fact, when I do it other places, it is actually much longer. So we will have to go pretty fast. But I'm going to go into more detail about a lot of these things than I've done anywhere else, so that should be fun to get into. The goal is ultimately to give you enough detail and direction to build your own system like this.

So why did I make it in the first place? The main reason is that I just wanted to get better overall at everything that I do. I don't like being surprised by things. I like understanding how things work, and when I'm surprised, that means I didn't know how it worked at whatever level. So an AI augmentation system was the first thing I thought about at the end of 2022, when this all went crazy. I also think regular jobs are going away. I talk about this post-corporate world, and I'm really worried about that for humans. I think the best way to get ready for that is to get really, really good at being a human, and that means understanding yourself and using AI to magnify yourself.

So this is my model for thinking about how AI is going to impact the job market, and how much it's going to get inside the workflow of how we do work and eventually take over more and more of that. I use this to keep pressure on myself, thinking about where it's all going: where am I, how ready am I, and how ready are other people as well.

So it's five levels. Before 2022, we didn't have any AI, right? We basically did all this work ourselves. 2023 to 2025, roughly (this is all general, right?), is the first level of AI. It's chatbots. Everyone understands this: you ask a question, you get a thing back, and then you can do manual work with that thing, which is still really valuable. And you'll notice at the bottom here, we're inside the section for human-centered work. So the first three sections here are largely human-focused.

And what we're entering into now is this whole agentic thing, and by the way, I hate the word agentic. I think it's a good word, I think it's a cool word, but everything is agentic now. We've got agentic toaster ovens or whatever. It's being overused, but this has been going on for a couple of years. I think it's starting now. Yeah, I think it sort of started at the end of 2024 a little bit, and then they were like, "Oh, 2025 is going to be the year of the agent." Turns out that was kind of true. And that's this layer here, right? That's level two. The next two we could talk about later, if you want to hit me up afterwards, but they're less human-focused in terms of how much of the work is being done. And the system that I'm going to show you is firmly right here in level two.

So what can you do to get started? The most important thing to understand is that you can't just start building, right? A lot of people try to just start building random stuff inside a system like this. They try a few things, those don't really work that well, and they kind of abandon them. I've taught a course on this four times so far, and I always start with the same thing, which is: who are you? What do you care about? What do you actually want to get good at? And how can technology save you time so you can do more of the stuff you care about and less of the stuff that's just busy work?

For me, this is roughly what mine looks like. One through four are definitely the most important to me: reading, thinking, writing, and discussing things. Five is what I've been spending most of my time on, which is building. This used to be known as coding, by the way. When I talk to Clint or whatever and he asks what I've been doing, I've stopped saying, "Uh, coding," right? Because that's what I was doing before, and now I'm actually thinking and writing, which is producing the code. It's a weird abstraction. I also do a lot of consulting for customers and build a bunch of products. For physical stuff, I play table tennis, the crazy kind where you're really far away from the table. I play drums, and I'm getting into kickboxing. And perhaps most importantly, I orient a lot of my life around trying to help other people do the same thing that I'm doing here, which is self-discovery and self-magnification.

So, the system itself and its design. These are the principles that I built Kai on top of, and what sets it apart from Claude Code. Claude Code has some of this naturally built in, but I've augmented it so the Kai system leans a lot more heavily on these principles. What we're going to do is go one by one through these, talk about each as a concept, and show you what it looks like inside the system.

They go roughly in order of importance, and this is definitely the most important one, even though it's kind of invisible and has become less talked about. If you remember back to early 2024, or maybe 2023 as well (it's hard to remember because it all flows together), a lot of companies were talking about prompting. "Oh, you've got to learn how to prompt." Prompt engineering was the big term, right? I put out the Fabric project, which is a whole bunch of crowdsourced prompts, and that was just kind of the thing that you did. AI was heavily associated with prompting.

In my opinion, prompting never became less important. In fact, I think it's more important than ever now. I think it's just more hidden, because there are lots of other shiny things that people are talking about. The other reason is that AI is doing a lot of that prompting for us. The way I think about this is that clear thinking is basically the center of everything. Clear thinking becomes clear writing, clear writing is essentially what prompting is, and that becomes really good AI. A good heuristic for this is: can you explain this to yourself, especially to yourself six months later, when you might not remember what you actually built? And can you explain it to others? If you can't, then AI really can't understand it either, right? When the AI is confused, that's when everything goes sideways. So I think prompting is still the most important thing in AI, because at the end of the day it's all language, it's all instructions, so it's all prompting.

What this actually means in practice is that I've spent many thousands of hours at this point working on my whole structure. I don't know why this thing is 7 gigs; I actually took it down from 10 gigs. That's a lot of text, and it's not all actually the scaffolding, because that would be a nightmare and nothing would be able to parse it. So there are a lot of outputs and other stuff, but the system is definitely growing. The most important directories inside of here are things like skills, hooks, and history, which we're going to talk about.

So this leads really well into the next one, which is that the scaffolding is more important, in my opinion, than the model. Now, there are some exceptions to this, and all the news, of course, is about the new models when they get released (Gemini 3 recently, Opus 4.5 recently). But I've always been team scaffolding, and I continue to be team scaffolding. Obviously, it's best to have both, right? If you have really good models, that magnifies the scaffolding, and if you have really good scaffolding, it magnifies the models. But if I had to choose between the latest model with not very good scaffolding, or excellent scaffolding with a model from 6 months ago, or a year ago, or even 18 months ago, honestly, I would definitely pick the latter.

>> And a quick note on this. I totally agree with Daniel about the scaffolding. I've spent a lot of time reading and thinking about people who are using AI for vulnerability detection, for example analyzing source code or scanning running systems, and I think it's difficult to say a given model can or can't do this, because, to Daniel's point, the scaffolding around it makes such a huge impact. If you look at OpenAI's Aardvark, or Big Sleep or CodeMender from Google, basically they're giving the models all these sorts of tools and orchestration layers, which allow them to perform orders of magnitude better on the same task. So when someone says the current model can or can't do something, it's actually hard to know that for sure, given that better orchestration, context management, and things like that can cause meaningfully different outcomes. So I think scaffolding is huge, and I just wanted to emphasize that. Okay, back to you.

>> Yeah, I think those are all good points. Another one is the AIxCC competition. Trail of Bits really crushed it there; they did a lot of scaffolding. Team Atlanta as well. So, good examples of that.

My skills directory is probably the most important center of the scaffolding. I currently have about 65 skills in here. A lot of these are pointed at my own stuff and not really useful to other people, but a number of them are core and essential to everything I do.

The next one is the last of my three central philosophies for the system, and this one is to be as deterministic as possible. So what does that mean? In practice, it basically turns into code before prompts. If I have anything that I can do in code, I do it in code first. I don't even use AI at all. If you think about it this way, Kai is more like a tech orchestration system, not really an AI system, because I had something kind of similar before AI, though obviously it wasn't as good. When you add AI on top of it, it really magnifies it, but it's more of an orchestration framework.

This is my art skill, for example, which we're going to talk about more, and the tools directory within that skill. This is just regular deterministic code, right? Anything that my art skill does is actually running code at the end of the day, and that provides as much consistency and control as possible. It also has the upside of not involving AI at all for that step, which saves a bunch of tokens and usage and everything.

This one I've broken out just because it's so important, but really it's the same thing we just talked about: code before prompts. I'm not really sure what my current percentage is, like 80 versus 20. I don't know. I'm curious what you think, Clint, about what the ideal balance should be, but I feel like it should be mostly deterministic with AI wrapping it, 80/20. I don't know. I'll have to do more thinking on that and see how it plays out.

>> Yeah, I was just going to say, I think if there's something that can be done programmatically and deterministically, code is the right solution for it, because it's cheaper and you know you're going to get the answer you expect. In the past, creating that code may have been time- or cost-prohibitive, but now, with Claude Code and other similar coding agents, creating the deterministic code can be done in a fraction of the time. So I think it depends on the domain. When you need a fuzzy answer that's difficult to solve generically and completely deterministically, then maybe some sort of prompt-and-code system is better. But the core intuition here is that if you can solve it deterministically, that's probably the better solution, and maybe you vibe your way to code that does it. Trying to solve everything with prompts, at least as of today, is going to be inefficient and costly. If you say, "Find all the routes in this repo," you're going to find a lot of them, but Claude Code or whatever system is going to miss some of them. So if you can do that with another tool that you know is going to work, it's just better. But yeah, you were saying about Anthropic.
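To make the "find all the routes in this repo" example concrete, here is a minimal sketch of the deterministic version. It assumes a Python codebase that declares routes with Flask-style decorators such as `@app.route("/path")`; the talk does not name a framework, so the pattern and the function names `find_routes` and `find_routes_in_repo` are illustrative assumptions, not part of Kai.

```python
import re
from pathlib import Path

# Assumption: routes are declared with Flask-style decorators, e.g.
# @app.route("/users") or @app.get('/health'). Adjust for your framework.
ROUTE_PATTERN = re.compile(
    r"""@\w+\.(?:route|get|post|put|delete)\(\s*["']([^"']+)["']"""
)


def find_routes(source: str) -> list[str]:
    """Return every route path declared in the source text, in file order."""
    return ROUTE_PATTERN.findall(source)


def find_routes_in_repo(root: str) -> dict[str, list[str]]:
    """Scan every .py file under root and map file path -> declared routes."""
    results: dict[str, list[str]] = {}
    for path in sorted(Path(root).rglob("*.py")):
        routes = find_routes(path.read_text(encoding="utf-8", errors="ignore"))
        if routes:
            results[str(path)] = routes
    return results
```

The same input always yields the same list, and no tokens are spent on the scan. The output can then be handed to a model for the fuzzy part, for example deciding which routes look risky.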

>> Yeah, Anthropic came out with a thing about this, actually kind of throwing shade at their own MCP. They invented MCP, and they're like, "Hey, you might want to actually just do this in TypeScript: use the MCP to get the service that you want to use, turn that into TypeScript, and actually run that instead, because then you're not spending all these tokens before and after. You're just getting the results, and then you can give the results to AI." I thought it was really cool, and within a little while of them releasing that, I upgraded a couple of my MCPs to not use MCP anymore.

So this next one is specifications, tests, and evals, and this also plays into the whole concept of determinism and consistency. There's a big tendency in AI to use vibes. Just like with agentic, everything is vibes: vibe hacking, now vibe marketing, and of course vibe coding. A big thing Clint and I talk about a lot is: how do we know any of this is working, right? How do you actually test any of this stuff? How do we actually get consistency from what we build?

This is the SKILL.md for my development skill, and you can see I'm starting with spec-driven development. This is roughly based on GitHub's really excellent project called Spec Kit. I basically simplified it because it was a little bit too involved. First you create specs, then you create plans, then you write tests, then you write code. That's the flow that system uses. I'm always optimizing this, but this is the general flow that I follow.

So, the next one I've been obsessed with since being in college in the 1990s. Shout out to my buddy Kundi, who showed me the command line for the first time. Honestly, it might have been one of the best days of my life when I found out I could pipe the output of one command into the input of the next. I mean, it truly tripped me out, and I feel like I've been building systems based on that concept ever since. The way it materializes inside this system is that I try to have each container do one thing well, and I build different skills to call each other instead of replicating that functionality inside each one.

I've got a number of examples of this within Kai. This is a red team skill, and this red team can actually hit a network architecture, an application architecture, threat modeling; it can do all sorts of different stuff. But I often use it to attack ideas that I have, to find blind spots and see if I'm missing something. The red team skill calls a first-principles skill and breaks that down further into other pieces, right? So it works in a flow to really break open ideas and attack them.

This is another really cool example, which is called lifelog. It pulls off of my necklace pendant, which I'm wearing right now; I can show it when I turn on the camera. It's basically a thing that I can turn on when I'm walking, and I can say, "Hey, new idea," or "new blog," or "I should do a piece of content on this," or whatever. Then I just talk and it captures it, and when I get back I can say, "Okay, that thing I just said when I was walking, take this and do that with it." So it goes and pulls the content from the transcript, pulls out the section, and summarizes it from there. I can do research on it. I can blog on it. I can red team the idea, for example. It's all done using natural language, just me talking to Kai, and it cross-calls all those different skills.

I also want to mention custom slash commands real quick. These are commands you can just type with a forward slash, and they include code to run. These also call one or multiple skills. The cse one calls create story explanation, which is a skill, so it's just another way to call into skills, which is worth mentioning. So if I take a piece of content I want to get information from, like this great article on tl;dr sec from Jason Chan, who built the security program at Netflix, I can take that and run /cse 5, which will give me five levels of explanation of what this thing is about.

And this is what it does when it's working. It actually uses Fabric here, which I forgot about. It uses fabric -u, which uses Jina AI to pull down the markdown for the page, and then it runs the cse skill and returns the results in five levels. A cool little thing for Kai is that Kai also updates the tab name inside the Kitty terminal to be the result and reads it in Kai's voice, which we'll see later.

The next one is engineering, or SWE, principles, which is also part of this determinism story. I feel like I might move that up to be the most important one. My background is hacking, in the security sense but also in the more pure sense of building and creating and breaking things, right? I've been a crappy programmer since the early 2000s, but I've never been an actual software engineer, and as anyone who's been both knows, there's a huge difference between an SWE and someone who programs. So what I'm doing now is learning a lot of engineering stuff that I should have learned in college and trying to build that into the DNA of the system, which usually manifests as tests and evals and stuff like that. It mostly shows up in the development skill, where I'm going through tried-and-tested engineering practices like building plans, test-driven development, and all that sort of thing. This is a thing I talk to Clint about a lot. Most people don't know this because he never talks about it, but Clint low-key has a PhD in computer science, and it definitely shows when he starts talking about tests and evals. You will see him light up like a Christmas tree. So that's always fun.

>> For context, a lot of Daniel's skills and prompts and things like that have this detailed backstory about the persona and their goals in life. And I'm like, "Daniel, does that help?" And he's like, "You know, vibes, baby." But then also, test them rigorously in practice to make sure that they work consistently.

>> Yeah, and I've got some examples of that when we talk about the voice system. I put all this effort into it, and I'm always wondering myself, and especially when I talk to Clint, I'm like, "Okay, what if I had this same exact prompt without all this personality stuff?" So we're doing a bunch of eval stuff so we can test this at scale.

This one is super cool. This is a relatively new addition to the Kai system. Not only am I trying to write code for as much as possible in the system, but I'm trying to have that code be executed via a CLI, instead of just calling the code and having the model try to figure out how to actually run it. I love the command line so much. The terminal is my favorite place to live. And I love the fact that there's documentation, there are flags, there are switches, there are options, right? It means you know how to use it. You know how to use a command-line tool by running the help command. And you know who else loves that? AI loves that. AI absolutely loves when it doesn't have ambiguity in what it's supposed to do. So, going all the way back to the concept of clarity and AI not being confused, there's nothing more clear than how to use a command-line tool, assuming it's well documented.

So I've got a command-line tool for launching Kai. When I type k, that used to just be a zsh alias, but now I've got an actual command-line tool for it. The most useful switch, I think, is the -m switch to dynamically load MCPs. Shout out to IndyDevDan for this one. He's a developer who's doing AI stuff on YouTube; you should definitely check him out. He also did this, and I was like, "Oh, that's a great idea," so I built the command line to be able to do that. And here's the actual command for generating images using my art skill. I can pass in a model, but the default is actually Nano Banana Pro, for obvious reasons: it's just incredibly good. But I have all these different options, and this is what Kai actually uses to generate images.
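As a rough sketch of what a well-documented, CLI-first image tool could look like, here is a small Python `argparse` example. Everything in it is an assumption made for illustration: the program name `generate-image`, the `--model`, `--size`, and `--output` flags, and the `nano-banana-pro` identifier are hypothetical, and the real call to an image backend is left out. It is not Kai's actual command.

```python
import argparse

# Hypothetical default model identifier, used only for illustration.
DEFAULT_MODEL = "nano-banana-pro"


def build_parser() -> argparse.ArgumentParser:
    """Define the flags once; argparse generates the --help text from them."""
    parser = argparse.ArgumentParser(
        prog="generate-image",
        description="Generate an image from a text prompt (illustrative sketch).",
    )
    parser.add_argument("prompt", help="text description of the image to create")
    parser.add_argument(
        "--model",
        default=DEFAULT_MODEL,
        help="image model to use (default: %(default)s)",
    )
    parser.add_argument(
        "--size",
        choices=["1K", "2K", "4K"],
        default="2K",
        help="output resolution (default: %(default)s)",
    )
    parser.add_argument(
        "--output",
        default="out.png",
        help="path to write the image to (default: %(default)s)",
    )
    return parser


def plan_request(argv: list[str]) -> dict[str, str]:
    """Parse arguments into the request a real backend call would receive."""
    args = build_parser().parse_args(argv)
    return {
        "prompt": args.prompt,
        "model": args.model,
        "size": args.size,
        "output": args.output,
    }
```

Because every option is declared in one place, the generated help output documents the tool for a human and for a model alike, which is the lack of ambiguity described above.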

So this next one is a high-level flow for a concept that solidifies a lot of what we've been talking about. It's just a way of thinking about how to organize the entire system. Basically, you figure out what you want to do. You figure out if you can do it in code. Then, if you can, you build a command-line tool around that. Then you use prompting to run the command-line tool. And then you use skills or agents to call it or to run it in parallel. That's kind of the flow and the structure, and this is basically how all these skills work: going from the top-level goal all the way down to the code-based implementation.

This next one is super fun. It is also super useful. Basically, I have a whole bunch of capabilities within Kai that are used to update Kai himself, right? So it's like self-update, self-healing, self-improvement, and not just of a little component or module, but of the system overall. You've seen all the different components of the scaffolding. We have skills. We have workflows within the skills that execute things. We have code in the command-line tools. Then we have the models, right? We also have different services that can be called via MCP or API or whatever.

The best example of this is that I have an upgrade skill. I guess that's a good name for it if it's doing upgrades. The upgrade skill is a universal skill that hits multiple sources on the internet. I'm looking at those sources because they're constantly releasing stuff that I cannot keep up with manually, and I don't want Kai to get behind. So when I say, "Run the upgrade skill," it will go and find all these different sources. It'll parse the latest content. It will review all of Kai's documentation. There's a single file that documents all of Kai, so Kai will read that, understand how he works, then understand all the updates that it just pulled from the different sources, and then look for opportunities to improve.

One of the sources it looks at is the Anthropic engineering blog, plus all of their releases on GitHub. I mean, they're releasing stuff constantly, like every day, multiple times during the day. There's no way I could possibly follow it all. I also do this for YouTube channels: somebody talks about a new technique, and it automatically parses it and brings it in. And security-wise, I do it for security research as well. So when Clint puts out, "Oh, here are the latest videos from so-and-so conference," or whatever, I parse those talks and update my testing methodology if there's a new technique.

new technique. So here's an actual example of me running this a little while ago. So Anthropic had a release

while ago. So Anthropic had a release saying how they could improve uh routing within skills using this keyword use when. and they basically emphasized,

when. and they basically emphasized, hey, look, you need to be using this because if you're not getting your skills to function the way that you want, they're not being triggered properly, you need to make sure you have

this in the front matter. So Kai ran the upgrade skill, found this, and came back with this as the top recommendation. I

said, "Okay, do it." And within like 5 minutes, the entire Kai system was upgraded with this uh piece of functionality. And after that, skills

functionality. And after that, skills worked way better. So, as I was making this like prepping for the conversation here, I said look for in our history for learnings because everything that I do

for an upgrade actually across the system, it's all captured in the history system and it's broken down into structures which we'll actually see a little bit later. But this basically looked up our archive of things that

we've done to learn over time. And

that's exactly what it found. It found

this use win thing. All right. So, this

one is absolutely crucial. I should

probably raise it in priority as well.
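As a rough sketch of the kind of check that upgrade implies, the snippet below reads a skill file's front matter and reports whether its description contains a "use when" trigger phrase. The `description` field name, the sample skill text, and both helper functions are assumptions for illustration; they are not Kai's actual code or a documented schema.

```typescript
// Hypothetical lint: does a skill's front matter description say when to use it?
// Field names and the sample skills are illustrative assumptions.

function parseFrontMatter(markdown: string): Record<string, string> {
  const lines = markdown.split("\n");
  const fields: Record<string, string> = {};
  if ((lines[0] ?? "").trim() !== "---") return fields; // no front matter block
  for (let i = 1; i < lines.length; i++) {
    const line = lines[i] ?? "";
    if (line.trim() === "---") break; // end of front matter
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip lines that are not "key: value"
    fields[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return fields;
}

function hasUseWhenTrigger(markdown: string): boolean {
  const description = parseFrontMatter(markdown)["description"] ?? "";
  return /\buse when\b/i.test(description);
}

const artSkill = [
  "---",
  "name: art",
  "description: Generates images and diagrams. USE WHEN the user asks to visualize, draw, or illustrate something.",
  "---",
  "# Art skill",
].join("\n");

const vagueSkill = ["---", "name: misc", "description: Does assorted things.", "---"].join("\n");

console.log(hasUseWhenTrigger(artSkill));   // true
console.log(hasUseWhenTrigger(vagueSkill)); // false
```

A check like this could run over every skill file so that descriptions without a trigger phrase get flagged before they cause routing misses.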

Custom skill management. This is probably the most significant thing that I did on top of Claude Code. It's just a completely different structure. Skills are already really good inside of Claude Code. It's pretty good at routing by itself, but what I did was add a supplemental system that's more explicit about the routing. And I'm getting, I don't know, 95 to 98% accuracy. I don't know the actual number. Clint and I will have to figure that out; we need some rigor around this. But it's basically routing in the system prompt, which goes to a routing table of workflows that map to specific prompts. Within the skill's directory there's also a tools directory that the workflows actually call, and like we talked about, those are ideally deterministic code as opposed to prompts.
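A minimal sketch of that explicit routing, assuming a hypothetical routing table: trigger phrases select a workflow prompt file, and each workflow names the deterministic tool it calls. The paths, trigger phrases, and the `routeRequest` helper are invented for illustration; the talk only describes the overall shape (routing in the system prompt, a table of workflows, and a tools directory).

```typescript
// Hypothetical explicit routing table for one skill ("art").
// Workflow names echo the talk; paths, triggers, and tool names are assumptions.

interface Route {
  workflow: string;   // prompt file the request is routed to
  triggers: string[]; // plain-language phrases that select this workflow
  tool: string;       // deterministic code the workflow calls
}

const artRoutes: Route[] = [
  { workflow: "workflows/technical-diagrams.md", triggers: ["technical diagram", "architecture diagram"], tool: "tools/generate.ts" },
  { workflow: "workflows/timelines.md", triggers: ["timeline"], tool: "tools/generate.ts" },
  { workflow: "workflows/comics.md", triggers: ["comic"], tool: "tools/generate.ts" },
];

// Fallback when nothing matches explicitly: a general workflow where the model picks.
const fallback: Route = { workflow: "workflows/visualize.md", triggers: [], tool: "tools/generate.ts" };

function routeRequest(request: string, routes: Route[]): Route {
  const text = request.toLowerCase();
  // Deterministic first: take the first route with a matching trigger phrase.
  const match = routes.find((route) => route.triggers.some((t) => text.includes(t)));
  return match ?? fallback;
}

console.log(routeRequest("Make a technical diagram of the history system", artRoutes).workflow);
console.log(routeRequest("Visualize a human story arc", artRoutes).workflow);
```

Doing the match in code keeps routing deterministic and leaves the model to handle only the ambiguous requests, which fits the code-before-prompts principle.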

So what this does is produce way better results for doing multiple things inside of a category such as art. I'm going to show the art skill later, but it allows me to just speak in plain language and get pretty much exactly what I asked for.

Last couple here. Custom history system: we touched on this one a little bit, but basically I have sessions, learnings, research, decisions, all sorts of different categories here. And when we get done doing anything, whether I do it, Kai does it, or any sub-agent does it, Kai thinks about what we did, turns that into a summary, and writes it into this history system.
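To make the write step concrete, here is a small sketch of how a summary might be filed by category on disk. The category names come from the talk; the directory layout, file naming scheme, and every function name are assumptions rather than the real history system.

```typescript
// Hypothetical history entry writer: builds the file path and body for a summary.
// The layout and naming scheme are illustrative assumptions.

type Category = "sessions" | "learnings" | "research" | "decisions";

interface HistoryEntry {
  category: Category;
  title: string;
  summary: string;
  agent: string; // who did the work, e.g. "kai" or a sub-agent name
  timestamp: Date;
}

function slugify(title: string): string {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

function entryPath(entry: HistoryEntry): string {
  const iso = entry.timestamp.toISOString(); // e.g. 2025-12-01T10:30:00.000Z
  const month = iso.slice(0, 7);             // 2025-12
  const day = iso.slice(0, 10);              // 2025-12-01
  return `history/${entry.category}/${month}/${day}-${slugify(entry.title)}.md`;
}

function entryBody(entry: HistoryEntry): string {
  return [`# ${entry.title}`, `agent: ${entry.agent}`, "", entry.summary].join("\n");
}

const entry: HistoryEntry = {
  category: "learnings",
  title: "Add USE WHEN to skill descriptions",
  summary: "Skills triggered more reliably after adding explicit trigger phrases.",
  agent: "kai",
  timestamp: new Date("2025-12-01T10:30:00Z"),
};

console.log(entryPath(entry)); // history/learnings/2025-12/2025-12-01-add-use-when-to-skill-descriptions.md
console.log(entryBody(entry));
```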

That's part of the reason I've got like six gigs of stuff that's grown over time here. But the file system is cheap and the file system is fast, so I like this way better than RAG for most things. This is what the directory actually looks like. It basically lets me understand where we've been, where we are, and where we're going, right? Because I hate making the same mistake over and over. I especially have a system like this for bugs, calling out bugs that keep coming up when I'm building web applications. So I can say, "Capture a learning on this, capture a learning on this," and have that crystallized into something Kai can use to upgrade the development skill, so we don't do that anymore.

And this last one is kind of a flourish, to be honest, but it can be useful as well. I basically have a fully customized voice system for Kai and all the different agents that we use within it. So we've got architects, engineers, researchers, QA testers, interns. They all have different personalities. This is the part where I'm not sure how much it's actually helping, but it's fun. And they all have different approaches to their work, too, right? Some are like library scientists, some are super curious, and they have different voice characteristics based on their personalities. What this means is that while the agents are talking to me or to each other, you can actually hear emotion in what they're talking about. If they come back with a finding and they're excited about it, you can actually hear that in the voice rendition. I still need to do more evals on this, as I mentioned.

The whole thing goes through a voice server, which is part of the PAI system. It goes through the ElevenLabs API, and then it gets read out on the local system in the particular voice of that particular agent. It's mostly just pretty cool, like I said, but I do get use out of it, because if I'm building like 20 different things off on the side in their own sessions, they can report back with what they did, and I can tell by their voices who they are, plus they give me a summary of what they've done. And these are all the different agents that we have. Like I said, we've got engineers, pen testers, all kinds of different folks in here.
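One way to picture the per-agent voices is a lookup table that the voice server consults before speaking. This is only a sketch: the voice IDs, the personality strings, and the request shape are placeholders, not the real PAI voice server or the ElevenLabs schema.

```typescript
// Hypothetical per-agent voice configuration and the request handed to a local voice server.
// Voice IDs, field names, and the request shape are placeholders.

interface VoiceProfile {
  voiceId: string;     // placeholder ID for a text-to-speech voice
  personality: string; // short description used to shape the agent's tone
}

const voices: Record<string, VoiceProfile> = {
  engineer: { voiceId: "voice-engineer-placeholder", personality: "precise and calm" },
  researcher: { voiceId: "voice-researcher-placeholder", personality: "curious and energetic" },
  pentester: { voiceId: "voice-pentester-placeholder", personality: "skeptical and direct" },
};

const defaultVoice: VoiceProfile = { voiceId: "voice-default-placeholder", personality: "neutral" };

interface SpeakRequest {
  agent: string;
  voiceId: string;
  text: string;
}

function buildSpeakRequest(agent: string, summary: string): SpeakRequest {
  const profile = voices[agent] ?? defaultVoice; // unknown agents fall back to a default voice
  return { agent, voiceId: profile.voiceId, text: `${agent} reporting: ${summary}` };
}

console.log(buildSpeakRequest("researcher", "found three new routing techniques"));
```

Keeping the mapping in data rather than in prompts means each background session announces itself in a consistent voice.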

So that is the overview of the system, and I just want to reiterate that the whole point of all of this is to go from the left to the right here. A human on the left, who has human interests and human goals, uses a system that does all this different stuff with tech and AI or whatever, with, again, the output being human outcomes that help humans. So it's humans, then tech in the middle, which is not the important part, and then humans on the right side as well.

So I want to do a deep dive on a real skill, or at least show it live in a quick demo. All right. So first we're going to invoke Kai.

>> Kai here, ready to go.

>> All right. So "ks" is to go to the skills directory: Kai skills. See the art. And inside of here we have that level, that structure: we have workflows and we have tools. So if we go inside of workflows, this is what that looks like.

We've got annotated screenshots. We've got aphorisms; I haven't even tried that one. Comics: this is like an XKCD-type thing, although I didn't use the Randall Munroe style because I thought that was kind of rude. Comparisons. Essay: essay I use all the time; it's actually what produces art for my site now. Frameworks.md is for talking about things like architecture frameworks. Maps. Mermaid diagrams: this one makes it look like Mermaid. Stats, taxonomies. Technical diagrams is another one I use all the time. Timelines. And visualize is just for when I say "visualize such-and-such" and give Kai input; he will decide which of these to use, or which combination of these to use. And then if we go into the tools directory, we can see these here. I think I showed one of these already, so no need to show that. But what I want to do...

>> Kai here, ready to go.

>> Let's have somebody from chat give a concept.

>> Okay, we have a couple: purple dogs skiing down the mountain; ghost in the machine style; timeline horizon for the AI revolution. Trees are younger than sharks. Adopting UTCP. Pig with a jetpack. Longing for my lost youth. Do any of those... AI and humans living together?

>> A human story arc. I like that one. And I love the "trees are younger than sharks." That blew me away. Sadly, I learned that only like two years ago. I had to Google multiple places because I thought it was fake news. Okay: "I want you to visualize a human story arc. So basically all the way from, I guess, coming out of the ocean, assuming you believe in that sort of thing, and then moving through different stages of development: agriculture, I don't know, we could talk about warfare, we could talk about science, we could talk about whatever you want to put in here. But basically just show an arc, and actually show where we're currently at, and then move beyond that as well and show other parts, like the AI transformation or what comes after that." All right. So, that was captured using dictation, which is how I do 99% of everything now. And we're going to see what Kai comes up with.

>> Yeah. Shout out Wispr Flow.

>> Absolutely. I think I'm about to cross 600,000 words on Wispr Flow.

>> Does Wispr Flow have different tiers, like "you are a certified yapper" or...

>> Yes, it does. You can actually set whether you want it formal. I always have mine set to formal. I like to capitalize the first word and use a period, whereas the casual one is all lowercase all the time.

>> While this is working, Daniel, a couple of questions that have popped up a few times. People are curious about the overall cost of this. So maybe the Claude Code costs for running Kai. Obviously, you're using Wispr Flow and various other services, like Nano Banana from Google for generating images. I'm curious if you could concisely say, here's the main tech stack and the approximate monthly costs.

>> Yeah, it's a great question. So first of all, Kai uses lots of different AIs inside. He's often calling Google stuff; this art stuff is going to use Google. I use OpenAI stuff sometimes. But these are call-outs from the base, which is Claude Code. On Claude Code I'm using the maximum Max plan, which is $200 a month. So I would say that my usual cost for using Kai is less than $250 per month.

>> Cool. That makes sense. Thanks for sharing.

>> All right. So look at that: bun run. By the way, Anthropic just bought Bun, which I'm super excited about. But you see, now he's using the art tool, generate.ts, which is the command-line tool to generate using Nano Banana Pro.

Any other questions? Yeah, $250 absolutely is cheap. Hey, what's up, Rowan? Oh, yeah, with ElevenLabs, that's a good point. I didn't add that on. Maybe that's like another $20. I think that's also pretty cheap. I have only hit my Claude limit once, and it was actually a couple of days ago, when I was going absolutely nuts for the entire weekend. Then it switches over to using an API key instead of the subscription, and that will hurt you very quickly if you're not inside the subscription. I think I worked for maybe an hour and it cost me, I think, like $70.
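The numbers quoted so far can be tallied in a few lines. This is back-of-the-envelope arithmetic using the approximate figures from the talk; the "other model call-outs" line is an assumed upper bound derived from the "less than $250" total, not a quoted price.

```typescript
// Rough monthly cost tally using the approximate figures quoted in the talk.
// The call-outs line is an assumption (250 minus the 200 subscription).

interface CostLine {
  item: string;
  usdPerMonth: number;
}

const subscriptionUsd = 200;

const costs: CostLine[] = [
  { item: "Claude Code Max subscription", usdPerMonth: subscriptionUsd },
  { item: "Other model call-outs (assumed upper bound)", usdPerMonth: 50 },
  { item: "ElevenLabs voice (estimate)", usdPerMonth: 20 },
];

const monthlyTotal = costs.reduce((sum, line) => sum + line.usdPerMonth, 0);

// About one hour on a pay-as-you-go API key after hitting the subscription limit.
const overflowUsdPerHour = 70;
// Hours of overflow usage that would cost as much as the subscription itself.
const overflowHoursEqualToSubscription = subscriptionUsd / overflowUsdPerHour;

console.log(`Approximate monthly total: $${monthlyTotal}`); // $270
console.log(`Overflow hours equal to one subscription: ${overflowHoursEqualToSubscription.toFixed(1)}`); // 2.9
```

On these figures, roughly three hours of heavy API-key usage costs as much as a full month of the subscription, which is why staying inside the plan matters.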

>> Yeah, something Aaron mentioned in the chat: thinking as a person, you're like, "Yeah, 250 to 300 per month is a lot." But if you're thinking of it from a business cost point of view, you ask: how much more output do I have? How many more things do I accomplish? Am I doing things that might lead to additional revenue? So, for example, you do a lot of security consulting. If you're like, "Yes, this costs me X amount of money, but then I deliver work which I charge this much for," you're still making more money than it is costing. So I think that's also a useful frame.

>> Yeah, completely agree. This is pretty interesting. It's kind of bunched up on the side over here. This is definitely not using my technical diagram. Are we making multiple? Cool. Oh, when I said "visualize": I also have a visualization skill, and that produces D3 graphs. So I think that's what he's trying to do there. Yeah, this one's pretty interesting. I'm actually curious. I want to do a different one.

>> Kai here, ready to go.

>> Use the technical diagram workflow to make this one, inside of the art skill. I think that would be a pretty cool visual. I put a lot of effort into that workflow recently, so I want to see how he does there. Oh, I forgot to show something. This is how I see what Kai is actually working on. So this Daniel here: you can see my prompt, and you can see that Kai is working on a thing. We've got the stages that it's doing: AI transformation, current, post-AGI, cosmic unknown, future. So that's how it broke down the different pieces to visualize. And then, yeah, this is my observability system for watching what the different agents are doing. This is a custom interface that I built. Obviously, I wouldn't release it because it's too cool, except it's already released. It's already in the project.

>> Yeah. Is that in the PAI repo?

>> It is.

>> Oh, I actually didn't know that till just now. Awesome.

>> Yeah. And I just updated it, so it has all this new UI stuff. And check this out, also a UI workflow. It shows the activities of what it's working on. It shows the different skills and the different tools that are being used. It actually obfuscates keys: if there are keys being used, it won't show them in the flow, just in case I'm doing a YouTube video or a webinar.
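Key obfuscation of this kind is usually a small redaction pass applied to each event before it is displayed. The sketch below shows one way that could work; the two patterns are illustrative examples of key-like strings and are not the dashboard's actual rules or a complete secret detector.

```typescript
// Hypothetical redaction step for an observability feed.
// The patterns are illustrative, not a complete secret detector.

const secretPatterns: RegExp[] = [
  /sk-[A-Za-z0-9_-]{16,}/g,                 // strings shaped like "sk-..." API keys
  /(api[_-]?key\s*[=:]\s*)["']?[^\s"']+/gi, // "api_key=value" style assignments
];

function redact(event: string): string {
  return secretPatterns.reduce<string>((text, pattern) => {
    // Keep a captured "api_key=" prefix, when there is one, so the log stays readable.
    return text.replace(pattern, (_match: string, prefix: unknown) =>
      typeof prefix === "string" ? `${prefix}[REDACTED]` : "[REDACTED]"
    );
  }, event);
}

console.log(redact("calling image API with api_key=abc123 and token sk-abcdefghijklmnop1234"));
```

Redacting at display time protects a screen share or recording, but the raw values still exist wherever the events are stored, so it complements rather than replaces key rotation.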

>> So, a bunch of people, both in the normal chat and in the Q&A, are asking about costs. You already talked about that a little bit, but I just wanted to convey that if you wanted to do a blog post with a deeper dive into example monthly costs for specific services and workflows and stuff like that, it seems like there were five to 10 questions about that. Maybe more.

>> Yeah, that makes sense. I could definitely do that.

>> And then there were broadly some questions. Emmy asked about owning your own AI in terms of investing in GPUs to run local models. You know, should you buy your own hardware and run local models versus relying on third-party providers? Other related questions, I would say, are: could you do the same approach using Codex or Gemini CLI? And what made you choose Claude Code versus, say, OpenCode? There are a couple of questions around which models and tooling to use and why.

>> Yeah, great question. So, Claude Code in my opinion has the best scaffolding. We already talked about the scaffolding and how important it is, but to me it's just vastly superior to any of the other options in terms of an overall infrastructure to work within. I saw a number of people ask how I call other models: the way that I call them is via the command line. So, Gemini is a command-line tool. Codex is a command-line tool. For example, I have a Gemini researcher. The Gemini researcher uses Gemini, and it uses my Google account to do deep research using Google Deep Research, because I want that to be a whole separate ecosystem of basically how Google thinks about search, with its own structure and biases and stuff like that. I have a Grok one. Oh, Nano Banana is down. That would explain it.

That's why you pre-record demos.

>> While you're waiting, a question from John Roberts: if you're not CLI-savvy or a programmer, how far can you get? Clear thinking and writing doesn't require a CLI, but the next steps do, "code before prompts," for instance. So, broadly, advice for people who are not CLI-savvy or programmers.

>> Yeah, I would say, I mean, I'm not writing these CLI tools. Kai is writing most of this code. I have the ability to go into the code and fix it or change it or modify it. But more and more I'm using my interaction with Kai to change the code, and I'm using my scaffolding to control what code gets made in what way. I'm not over here spending time writing code. That would massively slow me down. Honestly, it would be worse code compared to Kai writing code that is controlled by me and my scaffolding. So I would say you should not be intimidated by "I'm not a coder" or whatever. The one thing I'm trying to do with PAI is make it clear that you can be completely non-technical, because, again, I come in here and I can basically say anything that doesn't involve a model that's currently down.

So: "Let's create a human arc story using one of the other models that you have access to, using the technical diagram." I can't remember; I think I have Replicate in there with a few models. So, look at this. Over here it shows that the art skill is running, and over here it says the art skill is running. And then in the tab up here, it says "human arc story creation," which is also in the visual. But just to complete that point: do not be intimidated. The whole point of PAI is that you should be able to be non-technical and come in here and do all this stuff. Yeah, so you can see he can pivot to using an OpenAI model. I haven't used that one. I don't know if the key is good.

>> One thing I was curious about, though: Vashall asked, what controls or guardrails do you have in place to protect Kai from performing malicious activities? Especially because you're, you know, scraping websites, which could have arbitrary content. You have a lot of surface area for potential prompt injection, for example.

>> Yep. So, that one I'm not going to talk too much about, just because I don't want to talk about the controls that much, but I've basically got four to five layers of different defenses in there to block that kind of stuff. A lot of it involves...

>> Human story arc diagram generated with GPT Image 1. Open for review.

>> Yeah, so this one is not bad. I mean, I like this. It's nowhere near as good as Nano Banana Pro, in my opinion, but it still looks pretty cool. To finish answering that question: a lot of it is Kai understanding what our purpose is, what Kai's purpose is for us in the ecosystem, and then recognizing things that are attempting to hijack that purpose. So, a lot of things around prompt injection. I'm also using the Anthropic controls for different types of tool control, what it can and cannot call, especially in conjunction with a request that is out of character. So I've got a bunch of that stuff built in. I guarantee that I could probably break it, and a lot of other AI-oriented pen testers could probably break it as well. I would guess it's probably an 85 to 95% decent defense, and that's the type of thing I just keep improving as well.

>> For some additional thoughts on that: I think Anthropic shipped a Claude Code sandbox that tries to sandbox things by default. And then, to your point, you have different sub-agents for different tasks. So one may be a researcher or something that is talking to the internet, whether that's a YouTube transcript or scraping from a web page or things like that, and the capabilities that you give that agent could be mostly read-only-ish type things. Then maybe it just writes a summary of the plan of what you're going to implement, or a distillation of whatever it read. And then what actually implements or makes file system changes or has access to execution on your computer could be a separate agent that just reads whatever was written by the previous step, which you as a human could read first before just sort of blindly executing. So basically similar to Simon Willison's blog post about the, I forget what he exactly called it, the dangerous or the critical three. Basically, having separation between external sources that you don't control and executing things locally. You could have a human review in the middle.
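The separation described here can be expressed as a simple capability check: flag any agent that combines exposure to untrusted content, access to private data, and the ability to communicate externally. The agent names and capability flags below are hypothetical and only illustrate the idea; they do not describe Kai's actual defenses.

```typescript
// Sketch of a capability check: no single agent should combine untrusted input,
// private data access, and outbound communication.
// Agent names and flags are hypothetical.

interface AgentCapabilities {
  name: string;
  readsUntrustedContent: boolean;    // web pages, transcripts, scraped text
  accessesPrivateData: boolean;      // local files, credentials, history
  canCommunicateExternally: boolean; // send requests, push to repos
}

function holdsAllThree(agent: AgentCapabilities): boolean {
  return agent.readsUntrustedContent && agent.accessesPrivateData && agent.canCommunicateExternally;
}

const agents: AgentCapabilities[] = [
  // Reads the internet but can only write a summary for review.
  { name: "researcher", readsUntrustedContent: true, accessesPrivateData: false, canCommunicateExternally: false },
  // Acts locally, but only on a plan a human has already reviewed.
  { name: "implementer", readsUntrustedContent: false, accessesPrivateData: true, canCommunicateExternally: true },
];

const risky = agents.filter(holdsAllThree).map((agent) => agent.name);
console.log(risky.length === 0 ? "No agent holds all three capabilities" : `Review: ${risky.join(", ")}`);
```

The human review step sits between the two agents: the implementer only ever sees the researcher's written summary, and only after a person has read it.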

>> Yeah, I think that's all very smart.

>> Yeah, the lethal trifecta. Thank you, everyone in the chat.

>> So, I'm sure everyone, or mostly everyone, probably already guessed this, but this was the actual prompt that was used to generate all the art for the entire presentation. And this is using the technical diagram workflow. Yeah, all that stuff that you've been seeing, Kai made all that stuff.

So, getting started on your own. When I say that the purpose of all this, for me, is to really enable people to be the best versions of themselves and magnify themselves, I'm actually quite serious. So the whole thing is open source. There's a whole bunch of skills that aren't migrated over. Most of them are just complete garbage to anyone else. There are a few that are good for other people that I just have to be careful about how I move over. Talk about stress: every time I push from Kai to PAI, it's one of the most stressful things, because I've got, you know, sensitive stuff in here, stuff that should not be shared in a public repo. So I've got pre-commit hooks; I've got all sorts of defenses to make sure I don't share anything private, hopefully. But I know for a fact that that's going to happen, so I've also got a key rotation routine I can execute if necessary. So, the project is called PAI, which unfortunately rhymes with Kai. Kai was just my first role-playing game, and it just happens to rhyme with PAI. It's unfortunate, but it is all out there for people to use and download and play with. So, these

are some of the most common questions that I get, and I'm not going to go through them. I think we touched on most of them in the content or in the questions, but if you want to take a picture of this or a screenshot or whatever, those might be helpful. And with that, I want to say thanks for your time. You can hit this QR code to connect with me and the different projects I'm working on. I've got a whole bunch of stuff coming out in January to get a lot more of this on demand in videos. I also help companies do this and train their people to do this at work, so you can hit me up if you're interested in that. And with that, I'm happy to take questions.

>> Maybe here's a quick one from Henry: can you speak to the role of Git and GitOps in your workflows?

>> Yeah, I would say everything is heavy, heavy Git usage. I've got a bunch of security controls there as well. The context management also matters a lot there. Kind of the big thing you need to worry about is you've got four tabs open, five tabs open, and they're all separate sessions. They're all doing really cool stuff. And maybe you go into the wrong one and you say, "Yeah, great work. Go ahead and push this." But maybe it got confused about something that happened earlier, and it thinks this content should be pushed to that repo. So I've got a whole bunch of defenses around that to make sure: okay, what is your current working directory? What is the history of our conversation? Therefore, what directory do I probably mean when I say that? So, a lot of scaffolding around that. But in general, I'm huge on the file system. Git worktrees are also really powerful: you have separate branches in separate directories doing different experiments, and the one you like is the one that gets brought over and merged. I would say I use Git very, very heavily. Good question.
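The "what is your current working directory?" defense can be approximated with a small check that runs before any push. This is a sketch under assumptions: the paths are made up, the helper names are hypothetical, and a real guard would also weigh the conversation history as described above.

```typescript
// Hypothetical pre-push guard: only allow a push when the session's working
// directory sits inside the repository the request is meant for.
// Paths and function names are illustrative.

function normalize(path: string): string {
  // Collapse duplicate slashes and drop a trailing slash (POSIX-style paths only).
  const collapsed = path.replace(/\/+/g, "/");
  return collapsed.length > 1 ? collapsed.replace(/\/$/, "") : collapsed;
}

function isInsideRepo(cwd: string, repoRoot: string): boolean {
  const dir = normalize(cwd);
  const root = normalize(repoRoot);
  // Require an exact match or a path-separator boundary, so that
  // "/projects/pai-private" does not count as inside "/projects/pai".
  return dir === root || dir.startsWith(root + "/");
}

function checkPush(cwd: string, intendedRepo: string): string {
  return isInsideRepo(cwd, intendedRepo)
    ? `OK to push from ${normalize(cwd)}`
    : `Blocked: ${normalize(cwd)} is outside ${normalize(intendedRepo)}`;
}

console.log(checkPush("/home/me/projects/pai/skills/", "/home/me/projects/pai"));
console.log(checkPush("/home/me/projects/pai-private", "/home/me/projects/pai"));
```

The second call shows why a plain prefix comparison is not enough: a private repo whose name merely starts with the public repo's name must still be blocked.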

>> There's a question: are you going to do a deep-dive course or training or something on this at some point?

>> Yeah, absolutely. So I've been doing Augmented since 2023. In early 2023 I did an AI course, and I'm converting that over to be an online version instead of live. The previous four were all live, and it just doesn't scale, and you have to be there, and it's not great. So that's going to be what I release in January as part of this Human 3.0 program that I'm doing. It's going to be more modular, too, where you could just watch the history system video, or you could watch the skill-building video, and stuff like that. But it also talks about the overall arc of how to get into this if you're not technical, how to do this for work, lots of different stuff like that.

>> Awesome. Question from Lad: how do you decide which workflows to automate, that is, add to Kai, to get the highest ROI? So, you know, how do you avoid wasting time on some complicated skill or flow that's going to take too many iterations to make it work, right? And I think there was another question in the chat about how you avoid spending all your time yak shaving versus actually building new things and making progress.

>> Yeah, that's a challenge. It's a challenge because I love the tooling. You're also talking to someone who spent hundreds of hours on their Neovim config. So I'm a tech nerd, but I also think humans matter more than tech, right? So I'm 80% human, but also 20% tech nerd. So the answer is, sometimes I overfocus on the tech and I have to remind myself: hey, what are you doing? That's why I have a bunch of sticky notes in front of me. I have a whole TELOS system for staying on track with goals and stuff like that. I think it's fine to nerd out and go crazy with new tech and learning stuff, as long as you can pull yourself back and get back on track to the goal.

>> Yeah. And just to be open about it, Daniel and I have this conversation often. We were chatting about evals recently, and I was like, "Oh, here are some services that are these built-out eval platforms." And he's like, "Well, what if I built my own eval platform from scratch?" And then, like, one day later, he sends me a screenshot of a working version. So, yeah, it's easy to rabbit-hole with this stuff. In terms of priorities, I think if you do a time audit, like, where am I spending my time today, and what is low value versus high value? Things that are taking a lot of time that are not uniquely suited to you or your role, so lots of time, low value, are good places to try to automate and streamline. Or things that happen often that seem easy to streamline; that's also useful.

>> Totally. It's TELOS. It's also a GitHub repo. A lot of the stuff that I've talked about is on GitHub. It's basically a system for managing goals, priorities, and challenges to your goals. It's about alignment toward a particular goal. So you could use it at work. You could use it to run a family. You could use it to run a United Federation of Planets. It really scales to any size.

>> Oh yeah. I think some other broad questions are, they see your setup and then they see the public PAI. Let me describe it, and feel free to correct me. My understanding is you have Kai, and this is your personal thing, what you're working on. And then you are taking the generalizable, public, safe-to-share pieces of it and putting them into PAI, which is kind of a subset of Kai, or at least the public version of it that's not the rapidly changing stuff that you're working on day to day.

>> Yeah, that's right. And I'm even putting the super high-value stuff up there; the art skill, for example, is up there, and a lot of people told me, "Do not put that up there. That is a massive advantage for you as a content creator," because I can basically feed it a blog post and it makes the perfect art image. And if you modify your aesthetic file and say what you want it to look like, your PAI system will do that for you. But my answer to that person was, well, I want more people to be content creators. That's the whole point of this entire thing. So I am essentially trying to get everything that I can into the public version as fast as possible. I'm generally handling the issues and PRs inside of PAI within hours or a couple of days. I try to do it as fast as possible.

>> Related question from Brandt: do you have any good workflows for updating your personal PAI? So basically you have the public PAI, and then you have your local customizations. How do you manage those in terms of pulling the latest that you and other community members are sharing without overriding or complicating the customizations people have made locally?

>> So I haven't had many situations where people have added too much that I've pulled back in, just because it's so early. I think that'll start happening a lot more in the next months and year, but right now it's mostly me pushing out.

>> Yeah. But if you're a user of PAI and you're trying to balance those changes with the ones you've made locally?

>> Oh, right, right, right. Yeah, I haven't found a good solution for that yet. We are actively working on that inside of the PAI repo, in the community and the discussions. We're trying to figure out if that looks like a minor fork situation, or if that looks like a side-by-side type thing, where their PAI agent has all the new stuff sitting on the side and then migrates it over to the authoritative one. So that's one option for that.

>> Yeah, I guess it could also be standard Git and GitHub workflows: you fork PAI, you have your private fork in a private repo where you commit all of your personal stuff, the stuff specific to you, and then you're just periodically merging in from the remote that is the official public PAI, merging it into your private one.

>> That's right. I mean, and you can have your agent do that and decide what fits for you, right?

>> Keith had a great point. Maybe you have a local folder that gets gitignored, where people can create customizations and still get upstream stuff.

>> Yeah, I think that's a good idea. Also, just a shout-out to Keith. He and the Trail of Bits folks are doing awesome AI work. So check out their blog. I enjoy reading it very much.

>> Oh, yeah. They're doing amazing stuff. Crushed it at AIxCC as well.

>> Absolutely. So, there are still a lot of questions, but I wonder, for time's sake, if we might want to answer them asynchronously afterward. How are you feeling?

>> Yeah, I think we can get some of those together and, yeah, maybe we do another session or something, or send them out. Yeah, I think async is the way to go.

>> Yeah, people seem to have a bunch of excellent questions, and no shortage of them. Also, your workflow and how things work changes dramatically every few weeks, in that it gets better and better. Happy to do this again, maybe in a few months or so. Cool. Yeah, Daniel, thank you so much. This was awesome. Tons of stuff to dig into. So, really appreciate your time. Have an awesome rest of your day, and talk
