How To Use Codex CLI: Complete Beginner Guide & Tips
By pookie
Summary
Topics Covered
- The Terminal is the New IDE
- AI Agents: Air Traffic Control for Code
- AI Coding Agents Require Architect Mindset
- Iterative Refinement: Plan, Build, Debug, Refactor
- Beyond Code: AI as a Web Browser and Assistant
Full Transcript
Almost exactly one year ago, I was so impressed by Cursor that I even started making YouTube videos just to spread the gospel. The autocomplete functionality was like nothing I had ever seen before, and the chat functionality, where you could simply ask questions about your codebase, was so useful that I thought I needed to share it with other people. But I think no one, except for some Vim fanatics, would have thought that in 2025 your terminal would actually become the most powerful coding tool, and not some kind of AI IDE. 2025 really has become the year of the CLI, with Claude Code and GPT Codex almost completely taking over as the most powerful AI coding tools.
So it's kind of weird: in the old days we were coding in a terminal, then Steve Jobs brought us the GUI, and now in 2025 we are back to using the terminal. So my outrageous prediction for generative AI in 2026 is that we will go back to using punch cards. So yeah, you'd better learn how this works or you won't be a 100x engineer in 2026.
But anyway, I was initially also very skeptical about Codex, because I thought: why would I want to code in my terminal? It doesn't even have Markdown, it doesn't look very good, and I'd much rather code in Cursor. But after using it for a while you quickly get used to it, and you also notice that this is a completely different way of coding. It's much more like having five junior coding employees at your service than something like Cursor. It's really an entirely different way of coding, and that's also why it does have some kind of learning curve. That's what I want to show you in this video: I want to teach you everything I know about GPT Codex and how you can best use it, so you can go from Luffy in episode four of One Piece to Luffy Gear 5 very quickly by watching just this one video.
So let's first think a bit about the Codex CLI mindset: what you need to have in your head when you're using this tool, because it's quite different if you're used to Cursor. I would say Cursor is more akin to being put in a very fast car. It speeds you up a lot, but you're still the one driving, working very closely with the AI to get it to do what you want. With Codex CLI, instead of the AI living in your code editor, you let it run loose on your PC so it can really do things completely independently for you.
And if you want to make it as useful as possible, you should have many of these agents running in parallel. Your job is then to check on them when they are done working and to review their code, which is also very important. But in general it's more like you're monitoring all of these agents working independently. I think it's much closer to being some kind of air traffic controller, or like playing chess against multiple opponents simultaneously: you constantly need to switch context between what each agent is doing, and very quickly think about what needs to happen next, or whether what this agent did was all right. That is quite different from just using Cursor, because with Cursor everything is directly instructed by you. You're only working on one thing at a time. So yeah, it's a completely different way of coding.
What I often do with Codex CLI is this. At the moment I'm building an investment app, and I have two terminals open that work on my front end. I have two open that work on the coordination between front end and back end: for example, if the front end needs some kind of API method that the back end doesn't offer, the coordination agents can take care of that. And I also have two agents working on my back end.
I will show you later exactly how I set this up so that they don't interfere with each other and you can still stay up to date with what every agent is doing. But that's for later. Having these autonomous agents running doesn't mean that you can just sit back, put a cigar in your mouth, and do nothing anymore. No, it's not like that at all. It's more like being a traffic controller, in that you really need to check on these agents all the time, because they finish their tasks in 3 to 5 minutes and you're running multiple in parallel. So every time, you're reviewing the code of one agent, giving it another task, and then waiting, and by the time you're done with that, another agent has completed its task.
So you need to check on that. So it's not less intense than just manually coding; it's just a different way of coding. Also, these agents often make mistakes. They are not perfect, and in that case it's also much harder to debug, because of course you haven't written the code yourself. I think especially if you're newer to coding, you will notice that it's much harder to look at someone else's code and understand what it's doing than when you have written the code yourself. A few days ago I spent about 2 hours debugging something an agent had done. But even with that time spent debugging, the project will still be done much faster, because I can work on things in parallel, and actually implementing the feature would have taken me, for example, 5 hours. So if it is done in 10 minutes with Codex CLI and I spend 2 hours debugging, I have still won almost 3 hours of time. That's the perspective you have to think about it from. You're constantly going really fast, and then you often get stuck. You need to debug for a while, and then you go really fast again, and you're doing that in parallel, like five features at a time. That's the way you should think about coding agents.
And I also have to say: don't throw Cursor away. Like I said, debugging gets much harder, and having some kind of IDE with AI-powered functionality can make debugging much easier, as I often just ask my favorite LLM how things work. Because you're only working on one feature when you're debugging, these AI IDEs are still very useful, and they are a really good tool for debugging because, as I said, the debugging gets way more complex. So any AI assistance with that is often very helpful.
Another thing to think about is: how do I get speedups from this? It's not like you send off an agent to do a task and then do nothing. No, you should immediately send off another agent, or do something else in parallel yourself. You should try to keep juggling as many agents as possible, because the more parallelization you can do, the more gains you get, and it is quite a challenge to manage all of this.
Of course, another thing to keep in mind is that agents like Codex and Claude Code are not AGI yet. They can't solve every coding task. Karpathy even said in an interview that when he was working on his library, I think it was a GPT mini or nano library, the AI would keep suggesting wrong code, because for cutting-edge things there of course isn't enough training data that the AI has seen yet. So if you're working on really new things, then AI is probably not going to help you. But for the majority of things that we want to build, there is of course enough training data, because the majority of things that we do are often just using some kind of Supabase or Postgres database, creating a Next.js front end, or making a Stripe integration. On most of these things the AIs have seen enough training data, but of course you can't expect them to be AGI and implement the next PyTorch for you. So always keep that in mind. You should really ask yourself: is this the kind of task for which the agent could have seen training data? If not, you still have to code it manually yourself.
It's also important to know that if you want to use these coding agents to their maximum, you really need to be able to plan ahead very well. You need to know what to build. You can't just tell Codex CLI, go build a game, or go build an app. No, you need to plan exactly what needs to be built. You need to chunk it into features small enough that each can be finished in the 5 to 10 minutes that Codex CLI runs. And afterwards you of course also need to debug it. This is also why I think using these tools to their maximum potential is very hard for people who are new to coding, because planning, knowing the entire architecture of your project, and debugging and reading other people's code are skills that are not very developed if you're new to coding. So you should be in the mindset of an architect when using Codex.
I hope you haven't fallen asleep yet during my mindset explanation. We will now start with the actual practical part. Okay, so I would say the first thing you really need to install and start using is Git.
And I think if you're a software engineer, you're already using this. But you're really letting an AI agent run loose on your PC, or at least restricted to certain folders, so you also want to be able to revert your folder to its previous state. If Codex CLI changes a file and the result is not very good, you of course want to go back to the previous version, and it can often take a few commits or changes until you notice something. So version control is essential; just install it if you don't have it. You should really be used to the common commands like git add, git commit, git push, git stash, and git branch. I'm not going to give a Git tutorial and I assume you already know this, but if you don't, just search for some YouTube videos, ask ChatGPT to explain the basics, or ask Codex CLI itself how it works.
The next thing you need is Codex CLI, of course. Just go to the OpenAI website and install it with your favorite installer.
I like to install all of my packages with Homebrew. To install Homebrew, of course, go to the Homebrew website; it's a package manager for macOS. So, just copy the install command, and let's go to my terminal.
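A minimal sketch of the install check shown at this point. The package names `codex` (Homebrew) and `@openai/codex` (npm) are assumptions about how the tool is distributed, so check the official install instructions before running them. The snippet only reports whether `codex` is on the PATH and prints an install hint; it does not install anything itself.

```shell
# Report whether the Codex CLI is already on the PATH (the `which codex` step).
check_codex() {
  if command -v codex >/dev/null 2>&1; then
    echo "codex found at: $(command -v codex)"
  else
    # Assumed package names; verify against the official instructions.
    echo "codex not found; try: brew install codex  (or: npm install -g @openai/codex)"
  fi
}

check_codex
```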
So if I run which codex, you can see I've already installed it with Homebrew. To be able to work with this you need a ChatGPT subscription, either ChatGPT Plus or, I think it's called, ChatGPT Pro. You can also link your own API key if you want, but I really wouldn't do this, because in one session you often go through hundreds of thousands of tokens per 30 minutes or so. So especially if you're going to use these agents in parallel, as I expect you would, you will pay much more by using your API key than you would by just buying a $20 ChatGPT subscription. But yeah, you do you.
Okay, so what we're now going to do is create a new folder, CLI test, and go into it. Of course you don't have to do all of this with the CLI. Let's now look a bit at how Codex works. To launch Codex once you've installed it, just go to the root folder in which you want to initialize Codex, which for now is going to be your project. Later I will show you a setup where you actually initialize it in the root of multiple projects, so it can work on multiple projects in parallel. But at the moment, just go to, in my case, my Codex CLI folder and initialize Codex. The first thing it is going to tell you is that this folder is not version controlled, since we haven't initialized Git in it. So it recommends that for every file Codex wants to modify or delete, and every command it wants to execute, like Git or a web request, we give it manual approval, because of course this is the safest way of operating, especially if you can't go back to the previous state. But you can also allow it to do all of that without asking in this folder.
This is maybe a good opportunity to go over the sandboxing and security modes of Codex. If you go to the Codex security guide, you will learn a bit more about the sandboxing and approval modes. As you can see, non-version-controlled folders are read-only, and version-controlled folders allow Codex to automatically read and write every file in that folder. But for some commands it wants to run, it will send an approval request. There are multiple things you can set up: you can set a certain value for the sandbox, and you can set a certain value for when Codex is going to ask you for approval.
Let's go over the sandbox options first, so sandbox mode. The values you can set are these. read-only is for when you don't ever want Codex to automatically adjust or edit any file. Then there's workspace-write, which is what I would recommend. In this case it can make edits in the folder and every subfolder. So if you initialize it at the root of a project, it will be able to work on your project, but it can't modify other files on your PC, which is a very good trade-off, because you probably don't want Codex to mess with your system files to fix some kind of test. And you can also give it danger-full-access. In this mode it can modify any file on your PC; Codex could even modify system files and cause your PC to crash. So with danger-full-access, I would say only do this if you are using dev containers. Dev containers is a bit of a more advanced concept, but this way you're basically developing in a kind of Docker image which is a replica of the setup in which you want to run your eventual code. And of course, if you're developing with these dev containers in some kind of Docker image which is not your actual system, then you can let Codex run loose. This is also something which eventually can become very useful.
But you should of course realize that a Docker image can take something like 4 GB of your PC's RAM, so you can't run 20 of these in parallel. But as computers become more and more powerful and you can run more of these in parallel, I definitely think that having each agent work in a separate image, in which it can fully run loose, is something that will be used more and more in the future. So if you already want to get started with it, definitely look into dev containers. But anyway, for now I think you should just set it to workspace-write, which is also the default value, I think, here with the Auto preset. So you can see workspace-write.
Then you also have this ask-for-approval option, and this is a bit more vague, I would say. Ask for approval, or approval policy as it's called: you will soon see that when I run Codex, sometimes it will ask, can I run this command? Then you have to manually answer yes, no, or always. I think the default is untrusted. This is often a bit weird to me; I don't really see the difference between untrusted and on-request. I think with untrusted, certain commands are treated as untrusted by default, for example things like network requests or some Git commands, and you always have to approve these. Then you also have on-failure: if Codex tries to execute a command but your OS security, so macOS, says, oh, you don't have permission for that, then it will ask for approval. I think on-request just means that if Codex CLI thinks it should ask for approval, then it asks for approval. And never means that if it encounters a command that your macOS, Linux, or Windows security says you can't execute with your sandbox setting, then it will just not ask for permission and not run it at all.
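The sandbox and approval values above can also be passed on the command line. The sketch below only prints the command for a chosen preset instead of launching Codex. The preset names (`safe`, `auto`, `yolo`) are made up for this example, and the flag spellings (`--sandbox`, `--ask-for-approval`, `--dangerously-bypass-approvals-and-sandbox`) are assumed from the CLI help at the time of writing, so confirm them with `codex --help`.

```shell
# Print the Codex invocation for a named preset (names are hypothetical).
codex_preset() {
  case "$1" in
    safe) echo "codex --sandbox read-only --ask-for-approval on-request" ;;
    auto) echo "codex --sandbox workspace-write --ask-for-approval on-request" ;;
    yolo) echo "codex --dangerously-bypass-approvals-and-sandbox" ;;
    *)    echo "unknown preset: $1" >&2; return 1 ;;
  esac
}

codex_preset auto
```

Printing the command first makes it easy to double-check what a preset would allow before actually running it, which matters most for the last one.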
So if you want to go fully autonomous, you just go here with the dangerous full-access option, and that will just bypass all approvals and work on any folder. But that's not something I would recommend. So we are actually going to go back and cancel all of this, and the first thing I'm going to do is git init this folder, so it's completely tracked, and then initialize Codex.
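As a rough sketch of why the folder is put under Git first: commit a checkpoint before the agent runs, then inspect and discard a change you don't like. The scratch directory, file name, and commit identity are placeholders invented for this example, and the "agent edit" is simulated with a plain echo.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir="$(mktemp -d)"
cd "$demo_dir" || exit 1

git init -q .
echo "version 1" > notes.txt
git add notes.txt
git -c user.name="Demo" -c user.email="demo@example.com" \
  commit -q -m "Checkpoint before letting the agent edit"

# Simulate an unwanted edit made by the agent.
echo "broken by agent" > notes.txt

git diff --stat            # see what changed since the checkpoint
git checkout -- notes.txt  # throw the change away
cat notes.txt
```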
Yeah, let's update. Okay, let's restart Codex.
And it says that since this folder is now version controlled, you can go into Auto mode, and this will just give Codex permission to edit any file in this folder and its subfolders. So let's just do that.
And now we are in the main Codex screen. From here on you can just ask Codex to do anything in this folder or its subfolders, like implement feature X, or what does the codebase do, or just anything you would ask ChatGPT, and it will go work in the directory that is set here. Codex also offers these slash commands, which start with a slash. With these you can, for example, change which model to use. I would just recommend using the default model, because it's often the best. Codex Mini can be good to save some tokens, because you will have some rate limits, but it will automatically switch to it once you hit 90% of your rate limit, so I wouldn't use it by default. Codex Max is at the moment the best model, so just use that. You can also modify the approval and sandbox mode. Like I said, I'm not going to give it full access.
I'm just going to go with the default option. You also have a review slash command. I will show this later, but it is very useful: after Codex has coded something, you can review the changes it made, and you can even compare against a Git branch. At the moment I don't have any branch, so I can't do that, but you can also review a specific commit and search for it. This is very, very nice. Then you also have new. You see here I have 100% context left. This is the context window of your LLM, which gets filled over time. Sometimes the context is just not good enough anymore, or you see Codex going into loops, and with the new command you can start a new conversation and start at 100% once again. That's very useful, because otherwise you would, like I did, have to quit Codex and restart it, but with the new command you can do that as well. Then there's init, and it will create an AGENTS.md file. The AGENTS.md file contains important instructions for the agent itself, and it is a very powerful tool to control how the agent works. It's like the README.md file, but for agents, and the agent will basically respect any rules you set in it. As you can see, the init command automatically gives Codex the task of creating an AGENTS.md file. But of course there is almost nothing in this repo, and I also didn't give it any instructions. So I'm wondering what it comes up with.
Created an AGENTS.md file. So you see an AGENTS.md file has been created.
It gave itself some instructions. But to be honest, I don't find this init command very useful, because I will show you later how you can better create this AGENTS.md file yourself. What else can we do? Compact will summarize the conversation so you can free up some context space. Undo will undo a turn of Codex, so it goes back to its previous state. I don't know why Codex also offers a diff command, because you can also just do that with the Git CLI. You can mention a file by typing mention; at the moment I only have this AGENTS.md, and you can then refer to it for Codex. But you can also just type @ and then AGENTS.md or any file you want, so I often just use the @. Then you also have status, which shows you your rate limits. It also shows your settings and which plan you have, and that you have a 5-hour token limit and a weekly token limit. This resets often. I think it's just the start of a new week, so I still have 100%. On the Plus plan you can often code for a few hours or more a day for a week, I would say. So that's good enough.
What else should I tell you? MCP. This can be important. I will go over it later, but here you can see which MCP tools you've added. And you can give feedback, exit, and quit. So, enough about the slash commands. However, I think it's also important to set up a good config. Instead of changing all your settings with the slash commands, there's also a config file that's even a bit more advanced. I'm not going to go through all of it. You can set up specific profiles, for example, or certain feature flags. But there are two important things that I would recommend you set up. So you see what my config looks like. You can set a default model, but I just use the recommended one. You can set a reasoning effort. I think the values are low, medium, or high, but I would just set it to medium, because you don't want to waste too many tokens.
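A sketch of what such a config could look like. The real file normally lives at `~/.codex/config.toml`; to avoid overwriting anything, this example writes to a scratch directory instead. The key names (`model_reasoning_effort`, `sandbox_mode`, `approval_policy`, `[profiles.<name>]`) are assumed from the Codex config documentation, and the profile name `experiment` is made up.

```shell
# Write an example config to a scratch directory, not to ~/.codex.
config_dir="$(mktemp -d)"

cat > "$config_dir/config.toml" <<'EOF'
# Medium reasoning effort, to avoid burning tokens.
model_reasoning_effort = "medium"

# Edits allowed inside the project folder only.
sandbox_mode = "workspace-write"
approval_policy = "on-request"

# Hypothetical profile for trying out other settings.
[profiles.experiment]
model_reasoning_effort = "high"
approval_policy = "untrusted"
EOF

cat "$config_dir/config.toml"
```

With contents like these in the real config file, the profile would presumably be selected at launch with `codex --profile experiment`.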
An interesting setting is, for example, this notify command. It allows you to invoke a script every time Codex CLI has completely finished its turn. What I did is make a kind of macOS notification tool, so that each time Codex finishes its turn I get a macOS notification. I will show you a video of this, because when I'm recording with OBS these macOS notifications don't appear.
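The tool from the video isn't shown here, so the following is only a stand-in sketch of a notify handler. It assumes, going by the Codex documentation, that the configured program receives a single JSON argument whose `type` is `agent-turn-complete`; the function name and message text are invented. On macOS it uses `osascript` to show a notification, and everywhere it prints a line.

```shell
# Hypothetical handler: react only to the "agent-turn-complete" event.
notify_on_turn_complete() {
  payload="${1:-}"
  case "$payload" in
    *agent-turn-complete*)
      echo "Codex finished its turn"
      # On macOS, also pop up a desktop notification.
      if command -v osascript >/dev/null 2>&1; then
        osascript -e 'display notification "Codex finished its turn" with title "Codex CLI"' || true
      fi
      ;;
    *)
      # Ignore other event types.
      ;;
  esac
}

notify_on_turn_complete '{"type":"agent-turn-complete"}'
```

Saved as an executable script that passes its first argument to this function, it would be registered in the config with something like `notify = ["/path/to/notify.sh"]`, where the path is a placeholder.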
If you want to install this yourself, go to GitHub, to my GitHub repos, and puff, and, yeah, I made it for bender, and you can just install it with brew once again and set it up like this in the notify command. This will give you a macOS notification when your agent has finished, so you can work on something else, and when you hear the notification you know the task is done. Okay, then you see the trust level of all your repos, and under features I would also turn web search request on, because this allows Codex to use a browser tool, so it can for example search for documentation online or do anything else online, and this is very useful.
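Sketched as config, assuming the key names match what is shown on screen: the `[features]` table with `web_search_request` follows the description above, while the `[projects]` entry with `trust_level` and the repo path are assumptions about how trusted folders are recorded. It is written to a scratch file rather than the real `~/.codex/config.toml`.

```shell
# Write the example fragment to a temporary file.
features_file="$(mktemp)"

cat > "$features_file" <<'EOF'
# Let Codex search the web, e.g. for documentation.
[features]
web_search_request = true

# Hypothetical project path; trusted folders are listed like this.
[projects."/Users/me/code/cli-test"]
trust_level = "trusted"
EOF

cat "$features_file"
```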
I also created a profile, just to test things out. You can then run Codex with the profile option and select this profile if for some reason you want to use something other than the default. And these notices are something set up by Codex CLI by default. Okay, so let's get started with building our own project in the Codex CLI test folder. What we are going to do is build a browser game inspired by the Google Chrome Dino game, the Dino Run game.
Let's see. Yeah, I basically want to build something like this. I've never actually tried this before this tutorial, but I think it should be doable with Codex CLI and will show you how it works. The first thing we are of course going to set up is this AGENTS.md, because it's a kind of rules and prompt for the agent which it needs to follow, and it is a very good way to control it. So let's look at the AGENTS.md file of another project, my investment app, and let me show you what's inside it. I basically describe the project a bit, and then I set up some rules for all the frameworks I'm using. For example, in Next.js you can use client and server components, but I say don't ever use server components. I say always write tests that can be executed with the CI tool; for styling, use Tailwind CSS; avoid global installations; don't do this; don't write backend code; after creating a new feature, do this; and for the README, do this. Of course you can't just copy and paste this from some other project. It really depends on what you want to do with your project and how you want to build it. I often start with a very basic AGENTS.md file, and after a while you start noticing, oh no, Codex is doing this and I don't want it, and then you add that to the AGENTS.md file. So you build it iteratively. So let's start with what we want for our project. What am I going to call it? Chrome Dino Run game clone. I'm going to describe it a bit: this is a Next.js project that is a game like the Dino Run from the Chrome browser. I'm going to set up some rules.
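For orientation, here is roughly what the AGENTS.md dictated in this part of the video adds up to. The headings and exact wording are one possible arrangement of the rules as spoken, not the file from the screen, and the file is written into a scratch directory so nothing in a real project is touched.

```shell
# Write the example AGENTS.md into a throwaway directory.
agents_dir="$(mktemp -d)"

cat > "$agents_dir/AGENTS.md" <<'EOF'
# Chrome Dino Run game clone

This is a Next.js project: a game like the Dino Run from the Chrome browser.

## Rules

### Next.js
- Use only Next.js client components, to keep it simple.

### Tailwind CSS
- Use Tailwind CSS for styling.

### Tests
- Write tests for every feature you write.

### CI
- Use Husky for pre-commit hooks.
- After every feature, run `git add` and the Husky pre-commit hook.
EOF

cat "$agents_dir/AGENTS.md"
```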
So, let's type rules. Next.js: use only Next.js client components, to keep it simple. Tailwind CSS: use Tailwind CSS for styling. What I'm also going to tell it, and this is important to say explicitly, is: write tests for every feature you write. That's important for software engineering in general, but especially for agents it's important that they can automatically run a test and see if their code works, and afterwards they can take the output to fix the feature. That's also why you want some kind of pre-commit hooks. If you don't know pre-commit hooks: when you type git commit, it will first run these hooks, and these can be things like a formatter, a linter, or your test suite. This way you know that your code is actually in a good state before pushing it to your Git repo. So I'm also going to tell it: CI, use Husky for pre-commit hooks. And if you don't know Husky, no, not the dog, it's a pre-commit hook tool like I said, but for JavaScript. If you're working in another ecosystem like Python, I would just use pre-commit. I'm not going to explain all of this, but you should look into Husky or pre-commit depending on the project. Okay, and I'm just going to copy these commands from my other project: that after every feature it should run git add and the Husky pre-commit. Our AGENTS.md file now looks like this, which is enough to begin with.
You can of course use Codex right in the terminal of your editor, but this is not very scalable, I think, especially if you run multiple agents. I find it much more useful to just open a separate terminal window. That also puts you into the mindset of really having a separate colleague available that's not involved with the IDE at all. Okay, so let's get started. The first thing, of course, is that we need to install Next.js. So I'm just going to give it a task like: install Next.js. You don't want to give it too much.
I could also say, also install Husky, but the better you break up these features into small tasks, the better the results will be. You will see it start executing some terminal commands, and even in Auto mode, network requests aren't permitted by default. But the thing is, and this is also important, it wants to install next, react, and react-dom one by one, while you can just use create-next-app, and this is much easier. So I'm just going to say no, don't execute it, and tell it what to do differently: you can already execute it, because I've already installed that one, and there's a much better way of installing a Next.js project. It will ask for permission. I say okay, proceed.
And that's also why you really need to think often about what it's doing. It's important that it asks approval for certain things, because you don't want to let it run completely wild and start making a mess of your project and all your installs. I think it has created some files, like a default Next.js project. But I can see it initialized it not in the root folder, so not in the Codex CLI folder; it created a separate folder for my app. So I'm going to say that it needs to be moved to the root folder.
root folder should be and let's see what it's going to do now.
It's going to remove everything.
Okay, you can do that.
Um, you don't need What is it trying to do?
Um temp app. Yes. No.
temp app. Yes. No.
I don't know why it was going to create a stamp app because an agent of markdown is present.
Um okay, that could be. I don't know if that's really the case. Um,
just uh just um delete the agents markdown and put it back later. Remember the contents
though. Let's see if this works.
though. Let's see if this works.
So now let's install this.
Okay, Next.js with Tailwind is scaffolded. It also gives some things to set up next. And because I didn't set up Husky yet and it sees this in the AGENTS.md, it recommends doing that. Okay, just run the npm install.
You can also do these things yourself, like npm install, but just to show you how Codex CLI works, I'm going to let Codex CLI do everything. With npm install, instead of just clicking yes, you can also say "yes, and don't ask again for this command", which can be useful. For example, it can always run npm install for me. So I'm just going to do that. Dependencies are set up. Okay, sounds good. Let's see if it indeed worked: npm run dev, with Turbopack.
Let's see. And this is indeed the Next.js starter template. Commit. And I'm just going to give it this pre-commit file from my other project, so it knows what kind of things I expect. I expect lint-staged, lint, and test.
Okay. Now set up Husky. I want the following hooks in the Husky file. It asks to install Husky and lint-staged as dependencies. Okay, you can install them as dev dependencies. You can see it often tries to do things like adding things to my Git config, which is necessary for Husky, but on macOS my sandbox environment doesn't allow this, so I need to allow it manually.
Okay, so it came back to me. It says it set it up. So let's see if it indeed works. Let's do a git commit.
Okay. You can see it has run all the tests. I see the test files, tests passed. What did it set up for the Husky pre-commit? lint-staged, npm run lint, and npm run test. Okay.
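For reference, a minimal sketch of what such a pre-commit hook might look like. It assumes Husky's current layout, where hooks are plain shell scripts under .husky/, and a package.json that defines lint and test scripts. The file is written into a scratch directory so the sketch can run without touching a real repository. In a real project, Husky itself (for example through npx husky init) creates the .husky folder.

```shell
#!/usr/bin/env sh
# Sketch: write a Husky-style pre-commit hook into a scratch project.
set -eu

project="$(mktemp -d)"          # throwaway directory, not your real repo
mkdir -p "$project/.husky"

# The hook is a plain shell script. Git runs it before each commit and
# aborts the commit if any of these commands exits non-zero.
cat > "$project/.husky/pre-commit" <<'EOF'
npx lint-staged
npm run lint
npm run test
EOF
chmod +x "$project/.husky/pre-commit"

cat "$project/.husky/pre-commit"
```

Because the agent's own git commit goes through the same hook, every change it commits has to pass linting and the test suite first.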
So for the next part we are going to start building. Let's just clear the context first, because it shouldn't have all this setup in the context when doing the next feature. First I'm going to describe my project to Codex CLI and just ask his opinion about how we would set this up, because it's very important that you plan a bit with Codex before you start implementing. So let's describe it: I want to build a Chrome Dino Run clone, like the one in Chrome when you have no internet. It should be very simple, for a demo. How would you approach this?
So don't ask it to implement something. Just ask him for his general opinion.
And it says we can keep this lean. I'm just going to ask: what do you think would be a good first feature to implement?
It says something like a client page showing the dino and the ground line.
Okay. And I'm just going to say: okay, get started.
Okay, so I see that it tries to do git add a lot, but it hasn't got permission for that. Sometimes it also fails to ask you for permission. So you can just interrupt it by pressing Escape and say: I see that git add is failing, just ask me for permission. And it will ask for permission. I will say go ahead.
Okay. So it added some files, and here is where VS Code also comes in: you can easily review what has been added. You should really get used to this: review all the components that have been added or what has been changed. If you don't understand something, let's see, for example this one, you can ask "is this necessary?" I often like to use the Ask mode of Cursor for this. If you don't understand something, it will tell you why it is or why it isn't. I'm not going to review all of this code for this simple tutorial. So what I'm going to do is just run npm run dev.
Okay, apparently it already implemented a lot. So, start run. Okay, start run isn't working yet.
I think it just set up a scene.
Plan for the next step to make it moving.
Okay, implementing the game loop seems very good.
Okay, let's go ahead and implement a basic game loop. You can see it always makes a plan of what it needs to do to execute your feature. It will implement it, then write tests like I instructed it, and run git add and then the Husky pre-commit, so all the tests also run.
Okay, it was stuck for a while, but eventually it succeeded in running the tests. It's now going to run git add and the Husky pre-commit, and update the plan. It has run the tests and implemented the first moving version of the dino game.
Okay, let's see for myself. Once again, always check what is added. Inspect why a certain thing was necessary or not. Let's run npm run dev.
Okay, the jumping doesn't work very well.
Like I can't... Oh, should I hold it?
Okay, I'm just going to tell it: the jumping doesn't really work correctly. It keeps getting stuck and I don't jump high enough to get over the first obstacle.
Okay, it says it adjusted the jump physics. So let's see. But still, the jumping only kind of works, like if I jump at the correct... Okay, no, it's just very buggy.
I can kind of go through it, but it doesn't clearly indicate that. I think there is something going wrong with either the rendering or the jumping high over the ground.
Okay, it is refined. Let's see.
Okay, now he doesn't jump very high anymore.
Okay, normally it should work now, and indeed this is very close to the dino game. There are these obstacles, there is some kind of high score, and you can see how you could improve this by giving Codex CLI screenshots of the dino you want, to make it more visually beautiful.
After implementing a lot of features, a lot of code has been written. We see here that there are also some errors. We haven't really looked at it, but it has written every component in one very big game file, for example. And after a while, if you're used to software engineering, you want to refactor things.
So what I then often like to do is, first off, ask: is the game getting a bit too complex? Can we refactor game.tsx into multiple components? I can explicitly mention the file with the @ sign: game.tsx.
Let's see. Let's first ask for his opinion. Maybe he has a good reason for why this game.tsx file needs to be that big.
Okay, this time he doesn't seem inclined to give me any reason. He will just go ahead with the refactor. And actually, let's say I didn't want that. I first wanted him to give me a plan, but he has already made some changes. What I can then easily do, of course, because I committed everything, is git restore, and instantly get back to the previous state. Then I say: first give me a refactor plan before you start implementing. Come back to me with it, or discuss if a refactor isn't necessary.
Sorry, because I don't want to make him go ahead.
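The rollback step can be sketched as follows. This is a self-contained demo in a scratch repository. The file contents and the placeholder Git identity are made up for illustration, and an echo stands in for the agent's unwanted edit.

```shell
#!/usr/bin/env sh
# Sketch: undo an agent's uncommitted edits by restoring the last commit.
set -eu

repo="$(mktemp -d)"                       # scratch repository
git init -q "$repo"
cd "$repo"
git config user.email "demo@example.com"  # placeholder identity for the demo
git config user.name "Demo"

# Committed state: the version of the file you trust.
echo "original game code" > game.tsx
git add game.tsx
git commit -q -m "working game"

# Simulate the agent refactoring before you asked for a plan.
echo "half-finished refactor" > game.tsx
git status --short                        # shows game.tsx as modified

# Throw away all uncommitted changes to tracked files.
git restore .
cat game.tsx
```

Note that git restore only resets tracked files. New files the agent created stay on disk until you delete them or run git clean, which is one more reason to commit before every feature.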
Keep game as the orchestrator, but extract all UI helpers into these components. Okay, a scoreboard component seems like a good component. Game viewport seems like a good component. Control button. Okay, sounds good. Go ahead with this refactor.
So you see, you can't just set them off autonomously and make them do things. You often have to really think: what is my agent doing? Is this all right? Is he really building a good project? Should I refactor it a bit? Ask him why, and what his opinion is on how to refactor it. Then make him implement. So there is still a lot of interaction between you and the agent itself. It's not like some kind of autopilot. So it decomposed it into multiple separate components. Let's see.
Okay, this already seems much better. The game file is much smaller.
Let's see if it also still works, which is of course important, because all the tests succeeding maybe isn't a good enough test.
So let's see. And it's exactly the same game.
So that's excellent. What can also be useful if you've implemented a lot of changes is to do a review with the /review command. I don't have a base branch, so I will just say review a commit, and then the last commit. Let's see if it's really good.
Okay, he is done with this review. He says everything is all right. So this is basically it for this simple test.
But of course, like I said, this workflow of working on one feature at a time is not the way a really advanced Codex user would use it. For example, not that I consider myself very advanced, but at the moment I'm working on this, let's see, where is it, this investment app. I have both a Supabase back end and a Next.js front end, and I want to work on all of this in parallel and see all the things that are happening. So what I do is create a root folder that contains both my back end and front end, and maybe some other services that are included with your application, and just open Cursor at the root folder. If one of the files changes, that easily shows up in my source control at the root. So I can clearly see which files the agent is actually modifying. Okay.
And then the really obvious parallelization is that you put one agent in the front end and one in the back end. You initialize Codex in the front end and you initialize Codex in the back end, and you ask one Codex to implement this front-end feature and the other to implement this back-end feature. Let me ask something: what is the next feature to implement?
Okay.
In this way you can basically work on two different projects at the same time. When it's done, I get a notification and I come back to it.
And I can also easily monitor the changes that are made with this source control panel. But just working on two different projects in parallel is still basic parallelization, because what you can also use is git worktrees. If you're used to some kind of Git workflow, you work on a feature in a separate branch, and after you're done you merge this branch into the main branch. But of course, if you just do something like git branch and then git switch to that branch, all the files in the folder change to that specific branch. What you can also do, and this is very useful for parallelization, is create a worktree. What this does is check out your branch in a separate folder while keeping the original folder on the original branch. So let's say for my back end I also want to have a good pricing system. I'm just going to add a new folder called backend pricing system and create a new branch for it called feature pricing system.
And what you can see is that I created a new folder right here. My investment app back end is still on the main branch, while the other folder is on that new feature branch. And I create many worktrees this way. For example, I often also have one for my front end, let's call it test branch, I don't know, feature test. What I then do is go with my terminal into each of these folders, initialize Codex in each of them, and work on as many branches as I can. Afterwards, when you're done working on your feature branch, you just do git merge. I don't have any changes yet, though. Then you do git worktree remove, let's see, backend pricing system. So the worktree is removed from here, and then you just do a git branch delete if you merged it and you don't need the branch anymore. And this is basically how I work. I try to create as many features as possible that are independent, work on them on their separate branches, merge them into main, delete them, and then I have five agents, or as many as I can juggle, working in parallel. That's also where you can see some really insane speedups if you can manage this well. And this is something I can't teach you. You just need to work a lot with it and you will get better at it.
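The worktree lifecycle described above (add, work, merge, remove, delete the branch) can be sketched end to end in a scratch repository. The spellings backend-pricing-system and feature/pricing-system are assumptions about how the spoken names were typed, and a plain echo stands in for the agent doing the work.

```shell
#!/usr/bin/env sh
# Sketch: one feature branch in its own folder via git worktree, then clean up.
set -eu

base="$(mktemp -d)"                        # scratch area for the demo
git init -q "$base/backend"
cd "$base/backend"
git symbolic-ref HEAD refs/heads/main      # make sure the first branch is "main"
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"
echo "api" > server.txt
git add server.txt
git commit -q -m "initial backend"

# New folder next to the repo, checked out on a new branch.
git worktree add -q ../backend-pricing-system -b feature/pricing-system

# An agent (here: a plain echo) works inside the worktree folder.
(
  cd ../backend-pricing-system
  echo "pricing" > pricing.txt
  git add pricing.txt
  git commit -q -m "add pricing system"
)

git branch --show-current                  # original folder is still on main

# Merge the finished feature, then remove the worktree and the branch.
git merge -q feature/pricing-system
git worktree remove ../backend-pricing-system
git branch -q -d feature/pricing-system
git worktree list
```

In practice each worktree folder gets its own terminal and its own Codex session, so the agents never overwrite each other's checked-out files.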
Another cool thing I want to show you is, of course, MCP. I do think this is a bit overrated. Yeah, I think it's very overrated actually, at least at the moment, because first off, many of the tools that MCP offers are also offered just in your CLI. I don't think there are a lot of things that you can't access over the CLI, so it doesn't really make sense to add another tool on top, because the CLI is so powerful. But there is something that I did find very useful, and it's also very easy to add MCP tools with Codex CLI. You can add, for example, Figma. I don't use Figma, but it can be useful. What I want to show you is Playwright. Playwright is an MCP server that lets Codex control your browser.
It also has some install instructions right here. So the only thing we need to do is copy this, go to my terminal, and paste the Codex command that adds it.
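The exact command pasted in the video is not legible in the transcript. As a hedged sketch, Codex CLI registers a stdio MCP server with a codex mcp add subcommand, and the Playwright MCP server is usually launched through npx. The server label and package spec below are assumptions, so check the Playwright MCP install instructions for the current form. The script only calls Codex if it is installed.

```shell
#!/usr/bin/env sh
# Sketch: register the Playwright MCP server with Codex CLI, if Codex is installed.
set -eu

server_name="playwright"                   # assumed label for the server
server_cmd="npx @playwright/mcp@latest"    # assumed command that launches it

if command -v codex >/dev/null 2>&1; then
  # Everything after "--" is the command Codex runs to start the server.
  # $server_cmd is left unquoted on purpose so it splits into separate words.
  codex mcp add "$server_name" -- $server_cmd
else
  echo "codex not found; would run: codex mcp add $server_name -- $server_cmd"
fi
```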
So now we have an MCP server. Okay, it says booting MCP server. And you can see it got some additional tools, like browser click, browser close, navigate, et cetera. I don't know if this is going to work, but I actually want to try whether Codex can play, in my browser, the game it coded itself: the Dino runner. So I'm just going to open Codex and say: I want you to try to get a high score in the game. First start the Next.js server, then open the browser, go to the correct page, and start playing. I don't know if it has a space bar tool or something like that, so let's see what it does.
Okay, it says Playwright navigate. So it opened the browser completely by itself.
Let's see what it wants to do.
It sure has to analyze a lot before it tries to play. I'm going to say: don't analyze too much, just play.
Okay, it started the game, but it failed. Yeah, of course it isn't fast enough to react, but you can see it tries to play. And I think this will mainly be useful if you are developing a real website, not a game.
So what I'm also going to show you: I'm just going to tell it to use Playwright to use my browser and search for a nice Apple Watch on Amazon.com.
Okay, you can see it opens Amazon and sees my page. It's going to search.
And I think this is a bit underexplored at the moment. Codex isn't just a coding agent. It's trained for coding, but it can basically do everything on your PC, even browsing the web. You can really make it shop for you, use your browser, order food, basically everything you want. I've also not tried it out myself that much, but it can be nice to know that it exists.
And I think there are some other advanced uses. There is even an entire Codex SDK library in TypeScript, so you can really just write scripts to invoke Codex. You can also run Codex in headless mode. In this case it doesn't wait for you. It isn't constantly reading your standard input. If you do codex exec "fix the CI failure", it will just do that. This way you can also integrate it into your CI/CD setup if you want to do that.
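A minimal sketch of the headless call as it might appear in a CI step. The prompt is the one from the video. The guard around the call is an addition so the script degrades to a message on machines where Codex is not installed, and authentication setup is left out.

```shell
#!/usr/bin/env sh
# Sketch: run Codex non-interactively, for example as a CI step.
set -eu

prompt="fix the CI failure"    # the task mentioned in the video

if command -v codex >/dev/null 2>&1; then
  # exec runs a single task to completion without an interactive session.
  codex exec "$prompt"
else
  echo "codex not found; would run: codex exec \"$prompt\""
fi
```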
So, I think I've shown you almost everything you need to know to get started with Codex. At the moment I really believe it's going to be very important in the future that you are able to use these tools. It's also not like you are instantly productive with it. The first times I was using it, I was constantly thinking: I can do this much faster if I just use Cursor or code it myself. But you should really explore running these agents in parallel. I also think it's kind of hard if you're a new developer, because you have to review a lot of code. You have to instinctively know when it's time to refactor, or what it's not doing very well, and ask it to fix that, because it's easy to build a whole lot of unnecessary things and bloat, and your codebase can easily explode, which causes Codex CLI to not work very well. Anyways, I hope you enjoyed this video. If you liked it, make sure to subscribe. I'm probably going to upload a video pretty soon in which I use Codex to build a full-blown investment tracking app. So if you're interested in that, make sure to subscribe, and I'll see you in the next one.
Thanks.