My Agentic Engineering Workflow (step by step workflow)

By Ras Mic

Summary

Topics Covered

Experience, not tools, steers agentic development
Grep Loop: autonomous review-and-fix until five stars
Plans exist to help the human, not the agent
Smaller PRs give AI reviewers a fighting chance
Sub-agents keep the main thread free for chatter

Full Transcript

My agentic engineering workflow has changed. It's better. The models have got better. Some of the tools have switched up. The main important thing that you need to understand, though, is that the experience I have from building applications is what steers me when I do agentic development, when I use agents to build my applications. Now, usually in the videos I've done, I'll show you the tools that I use and sort of explain my workflow at a high level. There's an app I've been working on called Pluto, and I'm going to build out one of the features that I had planned with you.

This video might be long. This video might be short. This video might get in the nitty-gritty. I don't really have a plan for this other than record me building a feature. And if that excites you, I hope you're ready. Sit back, relax. Let's get straight to it.

So, high level, my workflow. I'm using GPT 5.5 extra high fast, and I'm using it in Cursor. Now, I know a lot of people love using the Codex app and the Codex CLI. You can do that as well. I genuinely just prefer Cursor. Now, Cursor is on the more expensive side, but in my opinion, it is worth the cost, especially for the app that I'm building. Second thing that I'm introducing is I'm using Greptile for the code review. Now, there's other great code review tools as well, but the reason why I stick with Greptile, and I really like Greptile, is the /greploop skill that they have. I'll explain what that is in a second. And third, I use Wispr Flow. I've noticed that, man, when you use speech to text, you will say a lot more than you type, right? It's going to take me a second to type things out, but if I'm just to yap, I'm already a yapper. I'm a YouTuber. I yap for a living. Wispr Flow just makes so much sense. And I'll be honest with you, I haven't been on the paid plan. I've used it for the last couple months. I haven't paid for a single thing. I don't even know what paid users get. That's how much I've been using the app, and that's how generous the free tier is. So, we're going to use Cursor with GPT 5.5 extra high fast. We're going to use Greptile. I'm going to talk about the Grep Loop, and we're going to use Wispr Flow for all our prompting.

Let's build out this feature. So, what I want to build for Pluto is an artifacts feature like Claude, right? And if you're not familiar with artifacts, I have an example here. I prompted the agent here, "Show me financial projection of someone who invests $500 a month from age 18 and how much money they'll have by 40." Now, Claude built like an inline component, which is actually pretty cool, and maybe this is another feature we look at.

Right now, what I want is basically what's on the right: an HTML page, a React page, whatever it is, being generated so I can visually see it.

This is what artifacts was. This is what I think Anthropic, like, innovated on, and it was pretty cool, and I'd like that for Pluto. Now, Pluto is pretty awesome.

There's a lot of cool things that come with Pluto out of the box, right? I obviously have a chat interface. I can connect my iMessage, Telegram, Slack, but there's a couple cool form factors, right? I have a kanban called Tasks. I can also set up routines, which are repeated tasks. Every agent gets its own email, right? Thousand-plus connections, right, using Composio. And then we have a dedicated files workbench. And this is basically where you can upload, you know, invoices, contracts, spreadsheets, whatever. There's an OCR workflow, and there's a very specialized files workflow where the agent has precise knowledge and data, especially for very, very large files.

This is pretty awesome. And then cards.

I'm actually working on being able to give the agent its own credit card, a virtual card, so it can make payments. It has its own phone line. And then finance is something cool where you can connect your business's finances and it can read all the information, right? So this is basically an agent for businesses. And when we go to chat, every agent has its own dedicated computer, right? Right now we have Linux machines. Soon we might be able to give access to Mac machines or Windows machines, but right now we have Linux.

So this is all pretty cool. This is Pluto in a nutshell. If you want a more dedicated video on Pluto and how it works, let me know in the comments down below. Let's build out this feature. Now, the first thing I'm going to do is I'm obviously going to open up Cursor. Let me zoom in a little bit, and I'm going to start yapping. I want to build a Claude Artifacts-like feature. If you're not familiar with cloud artifacts, basically I can prompt the agent to do something, and if there's like a visual component, whether it's writing HTML or a markdown file or whatever the case, maybe a React file, it will preview it to the side.

And because you're a smart agent, you have access to a web fetch tool, why don't you search the web and learn what the cloud artifacts feature is and tell me about it, because this is what we're going to build. And Wispr Flow processes that. We hit enter.

Now, the way I'm working on this app, or at least the workflow that I have in terms of CI/CD and all that type of stuff: I'm using GitHub, of course, but the way I'm developing everything is I have a staging branch. I'm working on a feature locally. Once I like it, it moves on to the staging branch. I test it out on the staging branch for some time, and if I like it, I move it over to the main branch.

Now, I talked about Grep Loop for a second. I kind of want to explain to you how that works. So, one of the code reviewers I have here is Greptile. This is a pretty large PR, so we won't be able to review the entire thing, but there was a moment in time it did. Let me show you exactly where that could be. I think we just need to unload. And here you have it.

Now, what's cool with Greptile: you get the summary and you get this confidence score, right? Right now, this is a four out of five. Anything four out of five and higher, obviously including a five out of five, is good enough for me. As for Grep Loop, if you haven't had it set up already, you literally just go to Greptile's repo, find their skills, and the Grep Loop skill is what you want.

And essentially how Grep Loop works, I can diagram this for you. Let's say there is me right here. I actually have a great icon for this. Let's say there's me. I push a change to my app. What Greptile is going to do is review it, right? And then let's say I get a two out of five. Let's say it works, but there's some security features that I missed. There's some edge cases that I missed. I just missed a bunch of things. Now I can read the comments, give them to my agent, and get the agent to address the comments. Or I can just enter /greploop, assuming you have the skill installed.

Once I have Grep Loop loaded, what's going to happen is my agent is going to read from GitHub. It's going to read the comments. It's going to take in the comments. It's going to address the comments. And it's going to push a change. And then it's going to wait for a new review to be generated, right? Because every time you push to that same branch, Greptile files a review. Now, let's say a change was made and it gives it a three out of five. Meaning, yeah, you addressed some things, but there's still some more missing. What Grep Loop is going to do is wait till it gets the new review. When it sees the three out of five, it's going to realize, hey, this isn't a five out of five. Let's go back.

It addresses the changes and pushes again. It will keep going. I think there's a maximum of like five to six turns, but essentially, it will keep going until it gets a five out of five, right? And the reason why I like GPT 5.5 extra high fast is it's a really intelligent model. And especially when it comes to building complex features, it just writes a bunch of tests. And in this case, this is actually great because whenever I get feedback, it reviews the tests and realizes, okay, the test cases I originally wrote pass. I need to add some more additional things.
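To make the review-and-fix loop concrete, here is a minimal Python sketch of the control flow described above: wait for a review, stop at five out of five or at the turn cap, otherwise address the comments and push again. It is a simulation only. `Review`, `review_fix_loop`, `fake_review`, and `fake_fix` are made-up names for illustration and are not Greptile's API or the skill's real implementation, and the cap of five turns follows the rough "five to six turns" mentioned in the video.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    score: int            # confidence score out of 5
    comments: List[str]   # reviewer comments that are still open


def review_fix_loop(
    get_review: Callable[[], Review],
    apply_fixes: Callable[[List[str]], None],
    target_score: int = 5,
    max_turns: int = 5,
) -> List[int]:
    """Return the score seen on each turn, stopping at the target or the turn cap."""
    scores: List[int] = []
    for _ in range(max_turns):
        review = get_review()            # stand-in for waiting on a fresh review
        scores.append(review.score)
        if review.score >= target_score:
            break                        # good enough, stop looping
        apply_fixes(review.comments)     # stand-in for fix, commit, push
    return scores


# Simulated reviewer (hypothetical): every open comment costs one point.
open_comments = ["draft leak", "sandbox tightening", "identity helper cleanup"]


def fake_review() -> Review:
    return Review(score=5 - len(open_comments), comments=list(open_comments))


def fake_fix(comments: List[str]) -> None:
    # Pretend each push resolves exactly one comment.
    if comments:
        open_comments.pop(0)


scores = review_fix_loop(fake_review, fake_fix)
print(scores)
```

In this toy run the score climbs by one per push until it reaches five. A real reviewer is not that predictable, which is why the turn cap matters.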

And this has just been a great experience thus far. So, let's see what we got. It says, "I found the feature. I think you mean Claude Artifacts." Oh, did I write cloud? My bad. "Claude artifacts are standalone pieces of generated content that appear in a dedicated panel beside the chat. Claude creates one when the output is substantial, self-contained, and likely to be edited, reused, previewed, or referenced later. Anthropic's examples include markdown, plain text docs, code snippets, single-page sites, SVGs, diagrams, flowcharts, and interactive React components." So far so good. "Core behavior from the docs: the artifact opens in the right-side preview. Users can ask Claude to modify the artifact," et cetera, et cetera. So it's got the gist of what we're building.

"The important product shape is: an agent produces a normal chat most of the time, but when it creates a visual or reusable deliverable, we promote that output into a first-class artifact with its own lifecycle: type, title, source, preview renderer, versions, error state, and update path. The main architecture pieces we'll likely need: artifact detection and creation," makes sense, "artifact registry, preview runtime, code/preview toggle, iteration loop, sandboxing and security, error capture, and sharing/export later." Okay, so far we're good. Now this is what I'm going to do.

You know exactly the type of feature that I want. I now want you to create a plan on how we're going to build this. Make sure you review the entire codebase. Make sure you understand how things work. I don't want us to build this feature at the cost of breaking another one. So, make sure you do a great job. And yeah, give me your plan.

So, it's going to generate a plan. Let me go to plan mode. It's going to generate a plan. Now, there are other skills that I have, one in particular that I use a lot, and it's called /code-structure. Let me take you guys to the repo. And again, I'll link this in the description down below. This is my personal skill. This basically restructures a specific feature, the codebase, into a service layer. Therefore, it's very clean. It's very understandable if I need to dive in and look into the code, which, I'll be honest, for the most part I haven't really been doing since using this. But it also helps the agent read the code and understand what's going on, right? So, this is another skill we'll be using as well. Now, let's go back to Cursor. We see that multiple sub-agents using Composer 25 Fast have been deployed, and they're going to be working on this plan.

While the feature's working, I can open up Steam. I've been obsessed with Red Dead Redemption 2 again. I played it before. I finished it before, but for some reason, I don't know why, I just have this urge to play it again. So, while this feature's working, we can play. Right now, I don't know if you can see, but I'm taking Jack fishing. I think his name is Jack. He's John Marston's kid. And yeah, we're going to wait for Cursor to cook, and I'm going to play in the meantime.

While AI is generating code, let me show you how you can get better at agentic engineering, and that's with today's sponsor. Before I introduce today's sponsor, let's hear from everyone's favorite CEO, Dario. Let's see what he has to say. "I think, I don't know, we might be 6 to 12 months away from when the model is doing most, maybe all, of what SWEs do end to end." So we're 6 to 12 months away from all software engineering being done by agents. Yet if I go on Anthropic's careers page and I select engineering and design for product, I see 20 open roles. It's very important for us to understand that engineering is not dead. In fact, it's become more alive because generating code has become so much easier. That's why I highly recommend Scrimba, the sponsor of today's video, and their full stack developer path. If I was getting started today and I didn't want to spend four years in college, I would take this exact path, the full stack developer path. You're going to learn everything from HTML and CSS to responsive design to setting up backends using Node.js, databases, Express, UI design. You're going to learn React. You're going to learn advanced React. You're going to learn about Next.js, one of my favorite frameworks, although I find Svelte to be better. And here's what's interesting about Scrimba. Scrimba is the most unique platform, and I'm not saying that because they sponsored this video, but I want to show you just a random video.

So, I'm going to click play at the moment where we're making our fetch request.

So, you hear the instructor talking, right? And you see him screen sharing, but at the same time, this screen is an IDE in the browser. And then I could go back to watching the video.

This is by far the most unique learning platform I've ever used. If you want to take advantage of the times that we're in, then there's no better way than learning how to write and architect and engineer. And the best way to get started is using Scrimba. Make sure to check them out. The link is in the description. I highly recommend the full stack developer path. Now, let's get back to the video.

Okay, so plan complete. It says, "Chat artifacts preview plan. Add a right-side artifacts panel to /dashboard/chat that opens when the assistant creates substantial visual or reusable content. The first milestone should support markdown, HTML, SVG, and code artifacts safely. Framework component previews should be added through a follow-up sandbox compiler path rather than by running arbitrary React/Svelte code directly in the main app." Okay, cool.

What I found: "The main chat surface already handles list, direct gateway streaming, fallback, scheduled runs, active run preview recovery, message rendering, attachments, a desktop side sheet, and a sub-agent inspector." All right, so it's glazing my code. We love that. And this is the diagram we get: user prompt, chat run row, gateway stream, checks if it needs to preview the content, artifact block parser. Okay.

So, one thing I noticed: there's a rollout strategy here. It's planning on doing five PRs. Let me see. "Are you planning on doing five separate PRs? Is it because it's going to be a large code change?" Now, I will say, for Greptile to work, for any code review to work properly, you're going to want to keep the PRs minimal, right? If the PR is too large with too many files, it's going to be hard for any agent to review it. So, minimal PRs that are to the point, that fix a specific thing. Chef's kiss. It says, "Yes, that's my recommendation, because it's a fairly large cross-cutting feature." Okay. "But it doesn't have to be five PRs. A practical split would be PR 1: backend data model plus parser and tests; PR 2: chat integration plus side panel."

"Five-PR version is the most reviewable path." Honestly, I like the original plan of five PRs, right? I'd rather keep it minimal and simple. All right, it says, "Sounds good. We'll keep the original five-PR rollout. That's the safer path for this feature because it lets us verify chat streaming and persistence before layering on preview UI and executable content." That's fair. Again, I would rather my PRs be minimal. I can test. I can verify things look good, and then I can move on to the next thing, versus having this giant PR. And I know you're probably like, "But you have a large one for the staging." Again, staging is a place where I'm testing it, right?

I'm testing the feature. If something's not working, we're back to local, and then we'll merge back to staging, right?

But for stuff like this, I need to have multiple PRs, and that's what we're going to do, and we're going to let this agent cook. Highly recommend this album, mixtape, whatever this is. Fire. My favorite song, Gen 5 or took a break.

And yeah, this is kind of the life of agentic engineering. It's just going, and I'm just waiting, and I could maybe read a book, play a game, or work on another project. Also, another side topic: I can't believe Arsenal won the Premier League. I have been a proud Arsenal hater for basically all my life. I made it like a known thing that, as an avid soccer, football fan, I get great joy watching Arsenal lose, and the fact that they won the Premier League honestly breaks my heart. This is how you know Jesus is returning soon: Arsenal winning the league. We really are in the end times.

All right, so Cursor is done with the task. We see here that it's implemented the chat artifacts preview plan. I'm not even going to read all this. Let's just go test out the feature. Let's say, "Create an artifact that explains how World War II went." And let's just hit enter. So, let's see. This is a first try. Again, it probably might not work. It might work halfway. Let's see what we get from the agent. Oh, okay. So, it is streaming HTML off rip. Probably not something I wanted it to do. I probably wanted it to just, like, say it's, you know, cooking. I don't want to see the stream.

But so far so good. It's working. All right, let's see. Oh, and by the way, the underlying model that I'm using is GLM 5, simply from a cost perspective.

Like, the knowledge you get for the cost is pretty high. Obviously, it's no Opus or, you know, GPT 5.5, but it'll do the job. And there you have it. We have our preview. Now, again, it is ugly because I'm using GLM 5. I know if I used Opus it would probably be very beautiful and chic, but I mean, it did it. Now, there's a couple things from a product perspective. I would love to be able to slide this right here. So, I'm just going to take a screenshot real quick. Let's copy this. Let's go back to Cursor. Paste this. And I'm going to say, "So, you got it right. It works. But the one thing I'd love to be able to do is resize the panel, the window for the artifacts, just like I can do with desktop. Literally look at the desktop resizing and just implement the same thing," and hit enter. And basically what I mean by that is if I open the desktop. Oh, and that's probably something I should think about.

When I open the desktop, you can see here I can resize this to my liking. But with this right here, it's just a fixed thing, and I can download the HTML if I want to. Now, can I make changes? Let's see. "Can you change the theme from, like, the light mode that it's in to dark mode?" Let's see if it can do that.

Oh, okay. This is a known bug I have on the app. When I open the desktop and close it, there's like a routing issue.

So, assume that didn't happen.

Embarrassing, I know, but that's another bug for another day. I'm going to try again. "Can you change the theme from light mode to dark mode?" Let's see if it actually updates the existing artifact.

Would be interested to see if it actually worked out of the box. Okay, we can see the resizing has been added.

Great. But I asked it to change it to dark mode, and it said it already was in dark mode. I sent it a screenshot, and it says, "I see the issue. The artifact preview is still showing the light mode one. Let me emit..." I see, this is why the streaming is annoying. Okay, we're going to fix that. I don't like it streaming the HTML. Let's go back here and say, "When the HTML is being written, right now what we have is that it streams on the chat UI. Can we just have like an animation that says, like, you know, writing or building, or actually it should say something like writing HTML or crafting artifact. Actually, I like crafting artifact. Write 'crafting artifact.' Let it animate and pulsate nicely instead of the entire HTML streaming." So we'll have this queued up.
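Here is a small Python sketch of what that request boils down to: when rendering the partially streamed message, swap the artifact source for a placeholder instead of showing raw HTML. Everything in it is assumed for illustration, including the `agent-artifact` fence marker (a guess at the fence the agent generates), the `display_text` function, and the placeholder wording. The real app would animate a UI element rather than print text.

```python
TICKS = "`" * 3                          # three backticks, built to keep this listing readable
FENCE_OPEN = TICKS + "agent-artifact"    # hypothetical opening marker
FENCE_CLOSE = "\n" + TICKS               # closing fence on its own line
PLACEHOLDER = "[Crafting artifact...]"


def display_text(streamed: str) -> str:
    """Return the chat-visible text for a (possibly partial) streamed message."""
    visible = []
    pos = 0
    while True:
        start = streamed.find(FENCE_OPEN, pos)
        if start == -1:
            visible.append(streamed[pos:])   # no more artifacts: show the rest
            break
        visible.append(streamed[pos:start])  # prose before the fence
        visible.append(PLACEHOLDER)          # hide the artifact source
        header_end = streamed.find("\n", start)
        if header_end == -1:
            break                            # fence header still streaming
        end = streamed.find(FENCE_CLOSE, header_end)
        if end == -1:
            break                            # artifact body still streaming
        pos = end + len(FENCE_CLOSE)
    return "".join(visible)


partial = "Here you go:\n" + FENCE_OPEN + ' type="html"\n<h1>Hi'
done = "Intro\n" + FENCE_OPEN + ' type="html"\n<h1>Hi</h1>\n' + TICKS + "\nDone."
print(display_text(partial))
print(display_text(done))
```

This sketch treats the first line that starts with three backticks as the end of the artifact, so a markdown artifact that contains its own code fence would be cut short. A production parser would need a less ambiguous closing marker.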

Another thing that I noticed is... okay.

See, there you go. It worked. It says here, "I see the issue. The artifact preview is still showing the light mode. Let me emit an updated version with the same key to refresh it." "Another thing I noticed is I created an artifact and then I asked the agent to update the existing artifact, and it updated it, I believe, but it did not show it. So can you please review that process and make it so that I can see every update. I also want to see every older version."

Right. Yeah. Make that happen. And then we're going to hit next on this one. So we have these two queued up. We have this almost done. I believe GPT 5.5, like it always does, is writing a test. Tests are great. It's okay. We're going to be happy with this. Now, we're getting to a point where this looks pretty good. I like this feature. Now, in just a bit, once these two are done, I'm going to show you how I'm going to merge this into staging. And this is where Greptile is going to come into play.

Some interesting findings here. I can see the chat artifact instructions that it's generated. It says, "When creating substantial standalone visual or reusable content, emit it in an artifact fence. Use this exact opening fence shape: open agent artifact, type HTML, title short title, key stable kebab key."

Okay. "Supported artifact type values are markdown, HTML, SVG, and code. For code artifacts, include language TS or another short language ID when useful."

"When revising an existing artifact, reuse the same key, so the update becomes a new version of the artifact."
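To picture what an artifact fence and its parser might look like, here is a hedged Python sketch. The exact fence syntax is only read aloud in the video, so the `agent-artifact` marker and the `type="..." title="..." key="..."` attribute format below are assumptions, as are the names `Artifact` and `parse_artifacts`. The supported types and the optional language ID come from the instructions quoted above.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

TICKS = "`" * 3  # three backticks, built to keep this listing readable

# Assumed shape: <ticks>agent-artifact type="html" title="..." key="..."
ARTIFACT_RE = re.compile(
    TICKS + r"agent-artifact([^\n]*)\n(.*?)\n" + TICKS, re.DOTALL
)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')
SUPPORTED_TYPES = {"markdown", "html", "svg", "code"}


@dataclass
class Artifact:
    type: str
    title: str
    key: str
    source: str
    language: Optional[str] = None  # only meaningful for code artifacts


def parse_artifacts(message: str) -> List[Artifact]:
    """Extract well-formed artifacts from a complete assistant message."""
    found: List[Artifact] = []
    for header, body in ARTIFACT_RE.findall(message):
        attrs = dict(ATTR_RE.findall(header))
        if attrs.get("type") not in SUPPORTED_TYPES or "key" not in attrs:
            continue  # skip unsupported types and fences without a key
        found.append(
            Artifact(
                type=attrs["type"],
                title=attrs.get("title", "Untitled"),
                key=attrs["key"],
                source=body,
                language=attrs.get("language"),
            )
        )
    return found


message = (
    "Here is the page.\n"
    + TICKS + 'agent-artifact type="html" title="WW2 overview" key="ww2-overview"\n'
    + "<h1>World War II</h1>\n"
    + TICKS + "\nLet me know what to change."
)
artifacts = parse_artifacts(message)
print(artifacts)
```

The key is the piece that matters most for the rest of the feature, since reusing it is what turns a second fence into a new version instead of a second artifact.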

"Put only the artifact source inside the fence. Continue conversational explanation outside the fence." That's pretty interesting. It says, "Your artifact updates are returned with full version history. The side panel shows version history. Selecting an older version updates both preview and source views. New update defaults back to latest unless you explicitly select an older version." The agent prompt now tells the model to reuse the same artifact again.
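The versioning behavior the agent describes can be sketched as a tiny in-memory store in Python: reusing a key appends a version, and the panel follows the latest version unless the user has pinned an older one. This is an illustrative model only. `ArtifactStore`, `upsert`, `pin`, and `visible` are invented names, and the real app persists this in its backend rather than in a dict.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ArtifactHistory:
    versions: List[str] = field(default_factory=list)  # sources, oldest first
    pinned: Optional[int] = None  # 1-based version the user picked; None follows latest


class ArtifactStore:
    def __init__(self) -> None:
        self._by_key: Dict[str, ArtifactHistory] = {}

    def upsert(self, key: str, source: str) -> int:
        """Store a new version under `key` and return its 1-based version number."""
        history = self._by_key.setdefault(key, ArtifactHistory())
        history.versions.append(source)
        return len(history.versions)

    def pin(self, key: str, version: Optional[int]) -> None:
        """Pin the panel to an older version, or pass None to follow latest again."""
        history = self._by_key[key]
        if version is not None and not 1 <= version <= len(history.versions):
            raise ValueError(f"no version {version} for artifact {key!r}")
        history.pinned = version

    def visible(self, key: str) -> str:
        """Return the source the side panel should currently show."""
        history = self._by_key[key]
        index = history.pinned if history.pinned is not None else len(history.versions)
        return history.versions[index - 1]


store = ArtifactStore()
v1 = store.upsert("ww2-overview", "light theme")
v2 = store.upsert("ww2-overview", "dark theme")
latest_before_pin = store.visible("ww2-overview")
store.pin("ww2-overview", 1)
v3 = store.upsert("ww2-overview", "dark theme + WWI")
while_pinned = store.visible("ww2-overview")
store.pin("ww2-overview", None)
after_unpin = store.visible("ww2-overview")
print(v1, v2, v3, latest_before_pin, while_pinned, after_unpin)
```

Keeping every version in the list is what makes "I also want to see every older version" cheap to support, at the cost of storing full sources rather than diffs.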

We just read that. Let's see right here if we can see. Yep, we see V2, V1. And let's say, "Add World War I history in the same artifact as well. Let's make this history document HTML." "World War II was a global conflict that pitted the Allied powers against the Axis powers, that began with Germany's invasion of Poland on September 1st, 1939, and ended with Japan's surrender on September 2nd, 1945." And now we have that. Okay, so I don't need to see the streaming. I can just see that it's writing HTML. That's great. We have multiple different versions right here. I can close this. I can open this. I can resize this. I mean, I don't think there's much that we're missing. Now, what's interesting is I don't think we followed this plan right here, this rollout plan. We'll see. I'm going to ask it to push this to a branch and make a PR to staging. But here's one thing I do want to say. I don't necessarily create the plan for the agent, although I do think it helps.

There are times where I'll just build the feature going back and forth with it. The plan sometimes, and actually most of the time, is really for me, because I'll work on multiple features at a time and I need to remember what it is that I was working on, or what it is that me and the agent were working on. So, low-key, it actually helps me. I'm pro-plan for myself. I also use it with the agent, but more so for myself, to be honest with you. Now let's go back here. The update has been made. We can see version three active, and yeah, we see World War I and then World War II. So, this feature is pretty much done. I really like it. I thought we'd have more issues. Fam, GPT 5.5 extra high fast is amazing. So, let's clean up. I want you to push this to a new branch and from that branch create a PR to staging.

We're not going to merge to main, we're going to merge to staging. So, push the branch, create a PR, and give me the PR link. We're going to hit enter. So, now what's going to happen is, because I have GitHub connected, it's going to create a new branch, push that branch, create a PR. It's going to give me the PR link. And then we're going to review the PR, and we're going to see what score Greptile gives us.

Oh, and by the way, for the tech stack, I am using SvelteKit. This is a full Svelte app. You don't believe me? Let me open... there you go. There's a .svelte file. It's also not only just a web app. There's also a desktop app, using Electron for that. There's a web app, and then there's an admin dashboard to manage admin stuff, using Svelte to power everything. Convex, best backend in the world. Convex literally orchestrates everything. Deploying this on Daytona. Daytona is the best agent cloud provider. I used a bunch of them and fell in love with Daytona. And there's a couple other tools, like Supermemory for memory, AgentMail for mail, Plaid for the financial stuff, Twilio for the phone. So I'm really incorporating a lot of services, creating these very composable service-layer abstractions so that each service connects to a specific thing and I can find the code easily. So this project, in my opinion, is a very well-thought-out agentic engineered project. It's not perfect by any means, but it's pretty dang good.

So let's open this PR. I can view it in Cursor's PR viewer. But I'm gonna be honest, I am going to go on GitHub. Let's go on GitHub. But Cursor's is pretty nice too. It's just not real time, meaning when an update pushes, I have to click refresh here to make sure I see it.

But let's go back here. We can see Greptile has fired off. We do have a CI pipeline. I'll explain that maybe in another video if you're interested. But now we see 2,000 lines added, 13 removed. Great summary written by Cursor. We're just going to wait on the Greptile review, and we're going to see what we get.

All right, the review is here. And ladies and gents, we got a three out of five confidence score. Let's see why. "This PR adds a full chat-scoped artifact system: fence parser, Convex-persisted versioning, and resizable side panel with safe preview rendering for markdown..." Let's see. Okay, it's explaining. Let's see the issues. Okay, this is a security issue.

The artifact persistence and rendering pipeline is well structured, but the message card matching logic has a defect that can surface draft content under

past messages during active stream runs.

The artifact cards for message function contains a matching condition that can cause past message artifact cards to resolve to the current streaming draft

when draft shares on artifact key with a message already persisted. Because chart

ad effects removes the persistent copy in favor of the draft, past messages lose their correct historical reference and instead show a live incomplete content. This makes sense. Visible to

content. This makes sense. Visible to

any user. This makes sense. And then

there's some security stuff. Now,

usually you get these comments, right?

And these comments basically tell you where the issue is and you can copy the prompt to fix where the issue is.
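To make the flagged defect concrete, here is a minimal Python sketch of the kind of matching rule the review seems to be asking for: a card only shows the streaming draft when the draft belongs to that same message, and otherwise falls back to what was persisted for it. The function name, the dictionary shapes, and the field names are all hypothetical and not taken from the Pluto codebase.

```python
def resolve_card_content(message_id, artifact_key, persisted, draft=None):
    """Pick the content an artifact card under `message_id` should show.

    persisted: dict mapping (message_id, artifact_key) -> saved content.
    draft: optional dict with "message_id", "artifact_key", "content"
           describing the artifact that is currently being streamed.
    """
    draft_is_for_this_card = (
        draft is not None
        and draft["message_id"] == message_id
        and draft["artifact_key"] == artifact_key
    )
    if draft_is_for_this_card:
        return draft["content"]
    # Matching on artifact_key alone would leak the draft into past messages.
    return persisted.get((message_id, artifact_key))
```

With this rule, a past message keeps its saved version even while a newer message streams a draft under the same artifact key.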

Sometimes you'll get commit suggestions, where it will make the commit for you, but usually you can just copy the prompt to fix. Now, here is where Grep Loop comes in. I'm going to go to Cursor. I'm going to do slash Grep Loop.

And we're going to hit enter. Now, what's going to happen is what I explained to you earlier. I pushed the change. I got a three out of five. I fired Grep Loop. Grep Loop is going to read the feedback. The Cursor agent is going to make changes, push to GitHub, and trigger a re-review. If it's a four out of five, it goes back to Cursor, Cursor updates, and then when it's a five out of five or there have been five turns, it stops.

So this is, again, my process. Build the functionality, test it, you know, actually see if it works. It worked, but there are some edge cases we can't catch from an initial use. Then we fire that off to Greptile. Greptile gives us a review. There are some security things we missed as well. Slash Grep Loop starts cooking. So you can see here it says, "Greptile left three actionable comments: one real draft leak bug, one sandbox tightening, and one small cleanup around an identity helper. I'm going to patch those, update the affected source tests, run the focused artifact tests, commit, push, and then trigger the next Greptile iteration." This is where slash Grep Loop works and cooks. And now I'm probably going to go grab something to eat. I'll be right back. Got some pasta in cream sauce.

Let's see if our Greptile review has changed. Let's refresh. It actually pushed the change. And now you see, I didn't even write that. The Grep Loop did it. So it called, you know, GitHub's API and wrote the comment tagging Greptile for a review. And whenever Greptile drops this emoji, that means it's reviewing the code changes. And you can see a review started a minute ago. In a couple of minutes, we'll see if this is a five out of five or a four out of five. Sometimes, I'm not going to lie, especially if the PR is big, it might even degrade. So, let's see what we get.

All right, so we got an update and we got a four out of five. It says it's "safe to merge with the iframe error detection gap addressed before shipping the repair workflow to users." It tags this specific file and says the onerror event wiring needs a different approach, for example postMessage from inside the frame, to actually surface rendering errors for HTML and SVG artifacts. So again, it addressed it. See, this review is complete, got a thumbs up, and now we have this one comment. And again, I can copy this, paste it, and then tag Greptile for a new review after the push has been made, or I can just wait on Grep Loop to continue to cook. So notice this: we're literally following the same trajectory. We went from, in this case, three out of five to four out of five. Now hopefully next we go five out of five.

Again, there are times where it will get stuck at four out of five. If I notice it going in a continuous cycle, I'll probably stop. I'll review it myself and I'll just merge, right? Because you don't want the agent to keep editing, editing, editing, and then it starts hallucinating and making stuff up. You know, short, simple, concise, to the point, not too long. That's the sauce that I've seen success with. Now, it's fixing up that edge case. If I click here, I can see the changes it's making.

And I just got to wait. I can work on another project, or I'mma play a little bit of Red Dead Redemption. All right, so we got a three out of five. It says you're "safe to merge for Markdown and code artifacts. HTML/SVG artifact preview will silently show non-interactive content due to sandbox configuration, and the version history query could become a bandwidth concern for active chats." Right? And we get some feedback here. Now, I could just fire off the Grep Loop here, right? And to fire off the Grep Loop, all I would do is slash Grep Loop.

But in my humble opinion, this PR is a little too big, right? It's over 2,000 lines. So, what I'm going to do is as follows. I'm going to go to Cursor and say: "The PR has been made. We got a three out of five on Greptile, but the PR feels a little too big for the Greptile agent to be able to capture everything. What do you think about splitting the PR into smaller chunks that make sense, so we can get Greptile to review the code and we can merge it safely?" And we're going to hit enter.

The goal is to at least get this down to a couple hundred lines each. Even if it's a thousand, that's fair. But I feel like 2,000 lines is just a pretty big PR. And I don't want to get into this cycle where Greptile keeps catching issues because, again, the PR is just large, right? So let's try to make it smaller. And if you're an engineer and you've worked in an engineering org, you know: the smaller the PR, the more focused the PR, the better your life is. And I think the same applies to the agent as well.

All right, so we got a response. It says, "Yes, I think splitting it up is the right move. This PR mixes the parser contract, Convex schema and persistence, secure rendering, and a large UI integration. Greptile will do better if each PR has one review surface." And the suggestion is four PRs. Add the chat artifact fence contract. Okay. And then the artifact persistence. Okay. Preview. Okay. This all sounds good to me. "I'd like to keep this as stacked PRs rather than independent branches because later pieces genuinely depend on each other." And I'll be like, do it. Looks good. This looks like a genuinely good plan.

GPT-5.5 extra high fast for the win. It sounds like a Starbucks order. And let's see what it generates. So the PRs have been split. I have four PRs here. And if I open them up in my browser, I have PRs 1, 2, 3, and 4, all under 1,000 lines of code. It's going to be much easier for the Greptile agent to review and for us to deploy a fix using Grep Loop.

All right, so the reviews came in and every single one got a three out of five. Every single one. So this is great. So let's see the issues here. It says, "Safe for basic single-block artifacts, but Markdown artifacts containing nested code fences will silently be truncated at the first inner closing fence. The closing fence regex matches any bare triple-backtick line, so a Markdown artifact embedding a fenced code snippet will have its content cut off at the inner fence with no error." Okay, so this is a pretty good catch. It gives us some feedback. Let's use the Grep Loop. Now we're on PR 87.
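The nested-fence truncation in that review comes from treating any bare triple-backtick line as the closing fence. One common way around it, borrowed from the CommonMark rule rather than from the actual Pluto fix, is to require the closing fence to be at least as long as the opening one, so an artifact can wrap inner fences in a longer outer fence. A minimal Python sketch, where `extract_artifact` and the `artifact` info string are hypothetical names:

```python
import re

# Opening fence: 3+ backticks, optionally followed by an info string,
# for example "````artifact".
OPEN_FENCE = re.compile(r"^(`{3,})\s*(\S.*)?$")


def extract_artifact(text, info="artifact"):
    """Return the body of the first fenced block whose info string is `info`.

    The closing fence must be a bare run of backticks at least as long as
    the opening run, so shorter inner fences stay part of the body.
    Returns None if no complete block is found.
    """
    lines = text.splitlines()
    for start, line in enumerate(lines):
        match = OPEN_FENCE.match(line)
        if not match or (match.group(2) or "").strip() != info:
            continue
        fence_len = len(match.group(1))
        body = []
        for inner in lines[start + 1:]:
            stripped = inner.strip()
            is_bare_fence = stripped != "" and set(stripped) == {"`"}
            if is_bare_fence and len(stripped) >= fence_len:
                return "\n".join(body)
            body.append(inner)
        return None  # the opening fence was never closed
    return None
```

The trade-off is that the model has to be prompted to open artifacts with four or more backticks whenever the content itself contains a code fence. With a three-backtick outer fence the body still ends at the first inner closing fence.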

Remember, we have stacked PRs here, so we have a number of them. We want to fix 87 first. So, let's go back here and let's say, please review. Actually, no, I don't need to do that. I'm going to do slash Grep Loop, and I'm just going to say PR 87. There you go. Little golden text right there. PR 87. So, now what's going to happen is the Cursor agent knows to run the Grep Loop on PR 87. It's going to read the contents. Again, I'll show you this diagram I drew earlier. It's going to read the review, read the feedback, fix it, and push a change. That push is going to call the Greptile review again, and it won't stop until it gets a five out of five or it's taken five turns, whichever comes first. So, in this case, we see it says here, "Greptile left three actionable comments on the parser contract." It points that out. It's thinking right now. It's going to push the fixes, get them re-reviewed, and again it won't stop until it either gets a five out of five or it has taken five turns.

So, you probably noticed a shirt change. I had day job work I had to do. I had to take care of my lady. I got a little busy. But guess what? We got our Grep Loops done. So, instead of boring you with the details, I'm just going to show you what I did. We Grep Looped each PR. We Grep Looped 87; once I got a five out of five, merged it. Then we went to 88: Grep Loop, once I got a five out of five, merged it. 89: Grep Loop, five out of five, merged it. 90: merged it.
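The loop described here (read the review, fix, push, re-review, stop at five out of five or after five turns) and the one-PR-at-a-time merge order can be sketched in a few lines of Python. This is only an illustration of the control flow, not how the real skill is implemented: `request_review`, `apply_fixes`, `run_loop`, and `merge` are hypothetical callables, and counting a "turn" as one fix-and-re-review cycle is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Review:
    score: int  # confidence score, 1 to 5
    comments: list = field(default_factory=list)


def review_loop(request_review, apply_fixes, target=5, max_turns=5):
    """Review, fix, push, and re-review until `target` or `max_turns`.

    request_review() returns a Review for the current head of the PR.
    apply_fixes(comments) patches the code and pushes a new commit.
    Returns the list of scores seen, oldest first.
    """
    review = request_review()
    scores = [review.score]
    turns = 0
    while review.score < target and turns < max_turns:
        apply_fixes(review.comments)
        review = request_review()
        scores.append(review.score)
        turns += 1
    return scores


def merge_stack(prs, run_loop, merge, target=5):
    """Work through stacked PRs in order, merging each one only at `target`."""
    merged = []
    for pr in prs:
        scores = run_loop(pr)
        if scores[-1] < target:
            break  # stop here so a human can review before going further
        merge(pr)
        merged.append(pr)
    return merged
```

The turn cap is what keeps a PR that is stuck at four out of five from being edited forever. At that point the decision goes back to the human.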

We did that, and you can see here: five out of five here, five out of five here, and all of it has been merged to staging. Now what we have left is to actually test this thing. So let's go and open chat. Let me just refresh real quick. Open chat. Let's say, "Can you create an artifact sharing the best restaurants in Toronto?"

So hopefully this works, and if it doesn't, we're going to debug together, but if it does, we cooked. All right. So what happens is it says here, "I'll delegate this research task to a sub-agent that can gather the current information about Toronto." So it spawned a sub-agent, and I can click on open inspector here and see what's going on. Basically, it saw that, okay, I'm going to need to do some research for this, and instead of blocking the main thread, I'm going to deploy a sub-agent. The reason why this is cool is I can say, "What's up with you today?" Just being a weirdo talking to AI like a friend. And the main thread is not blocked because this has been handed off to a sub-agent.
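The non-blocking hand-off can be illustrated with a small asyncio sketch. It is a toy model under stated assumptions, not the Pluto architecture: `research_subagent` and `main_agent` are hypothetical names, the `research:` prefix is an invented routing convention, and the sleep stands in for real web fetches.

```python
import asyncio


async def research_subagent(topic):
    """Stand-in for a slow research task (the sleep replaces real fetches)."""
    await asyncio.sleep(0.05)
    return f"artifact draft for {topic}"


async def main_agent(messages):
    """Handle each message in turn, delegating research to background tasks."""
    log = []
    background = []
    for message in messages:
        if message.startswith("research:"):
            topic = message.split(":", 1)[1].strip()
            # create_task schedules the sub-agent without awaiting it,
            # so the loop moves straight on to the next message.
            background.append(asyncio.create_task(research_subagent(topic)))
            log.append(f"delegated: {topic}")
        else:
            log.append(f"reply: {message}")
    # Collect sub-agent results once the conversation turns are handled.
    for result in await asyncio.gather(*background):
        log.append(f"done: {result}")
    return log
```

Running it with a research request followed by a casual message shows the reply landing in the log before the sub-agent's result, which is the behaviour seen in the chat.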

And I can chat with the main agent. And you can see here it responded to me saying, "Hey, I'm doing well." And it told me it has a sub-agent running. And I can get it to do other things, which is pretty cool. I do talk about it in this video, "My Agent Is Better Than Claude Cowork," on how I architected the agent and how the sub-agent stuff works. So check it out. It's literally like the fourth video on my channel. And if we go back here to the sub-agent, we see, "I got good data from both Condé Nast Traveler and Time Out Toronto. Let me fetch a few more sources to get a comprehensive list." So it's doing its research, finding the best restaurants in Toronto. And we can see the artifact here. Now mind you, not the prettiest one, pretty ugly. And that's probably because I'm using GLM5.

But it got it done. It worked. We finished building the feature. And it was all because of this simple workflow where I use GPT-5.5 extra high fast. We have Greptile's Grep Loop skill, right? Minimal PRs, right? We don't want the PRs to be too big. We want them to be minimal. And just a little back and forth and a little structure gets you a long way. Now, something I'm going to do, and I talked about this earlier, is I won't show it in this video, but I'll probably run this skill right after, just so it can clean up the code and we have these nicely tidied, documented functions where I know exactly where artifacts are and the agents know where artifacts are.

And that's pretty much it. This is how you do agentic engineering. At least this is how I do it. Ladies and gents, I hope you found value in this. I know this was a rather long video. Let me know if you like stuff like this. Every time, you know, I hop on podcasts or other people's channels and I share this stuff, people seem to really like it. And I've never really done it on my channel. So, let me know your thoughts down below. Would really appreciate a like, a comment, a subscribe. Thank you so much for watching this video. I'll see you in the next one. Peace.
