
Everything PMs Need to Know about ChatGPT’s New Codex (Masterclass)

By Aakash Gupta

Summary

Topics Covered

  • CLI Unlocks Agentic ChatGPT
  • Context Folder Powers PM Tasks
  • Templates Standardize Outputs
  • Socratic AI Builds Better PRDs
  • Test-Driven AI Prototypes

Full Transcript

If you're using ChatGPT in the browser, you're using it.

>> Carl Vellotti is the full-stack PM.

>> He knows more about coding CLIs than most PMs do about PRDs.

>> And in today's episode, he's giving you a complete guide to how PMs should be using OpenAI's new Codex.

>> Codex is the most powerful way for product managers to build prototypes, but only if you know how to set it up correctly.

>> People are saying Codex alone justifies OpenAI's $500 billion valuation. Should you try Sam Altman's new Claude Code competitor? That's what we unpack today.

We do a live demo of Codex for non-technical people so you can get the most out of it. How can a non-technical person ramp up on Codex?

>> As long as you follow these steps and you're really willing to put in that work up front, it can be one of the best ways to prototype that exists right now in 2025. And normally, if you asked GPT in the browser, "Hey, here's this video, give me the transcript," it wouldn't be able to do it. It doesn't have the tools.

>> How should you be using Codex? How does Codex compare to Claude Code? What are the power-user functionalities to get the most out of its prototyping and coding capabilities? That's today's episode.

Really quickly, I think a crazy stat is that more than 50% of you listening are not subscribed. If you can subscribe on YouTube, follow on Apple or Spotify podcasts, my commitment to you is that we'll continue to make this content better and better. And now on to today's episode. Carl, welcome back to the podcast.

>> I am so excited to be back. Our last episode on Claude Code was so fun, and I'm really excited to cover Codex.

>> Yes. Passed Marty Cagan in less than a week. Amazing episode. How can a non-technical person ramp up on Codex?

>> Yeah. So, let's just go ahead and get started here. Okay. To get started with downloading Codex, you just go to developers.openai.com/codex.

So, today we're going to be using Codex from the terminal. If you've never done anything in the terminal, I promise it's going to be a lot easier than it sounds. Just the very first things you do have to be set up directly from the command line. Okay, so this website gives you the exact commands that you need. You do need something called npm, but you just type npm install and then use this exact command, and when we put that into the terminal, it'll go ahead and install Codex. So, I'm going to copy this and then...

>> And for people who don't know, that's a package manager for Node.js JavaScript, just because we want you to also understand what you're doing here.

>> Yeah, exactly. You'll use npm for many different things that you're doing from the web, and this is just another one. So, we're going to go ahead and copy this into our terminal. If you're on a Mac or Windows, it'll be the same. What this will do is go in and install Codex for us. It's very quick. Okay, for me, I had it already installed, so it just added one thing. But if you don't have it installed, you just run this one command and you'll be good to go. And then once you have it installed, all you do is, we're still in the terminal here, you just type codex.
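For reference, the install-and-launch flow described here boils down to two commands (the exact package name is the one shown on the Codex install page, so copy it from there if this differs):

    # Install the Codex CLI globally with npm (requires Node.js)
    npm install -g @openai/codex

    # Launch it from any terminal; from then on this is all you type
    codex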

And now we see this interface, and all of a sudden we are in Codex. We've now done the only commands in the terminal you're ever going to have to do. In the future, all you'll ever have to do is open up the terminal and type codex, because now we are officially in ChatGPT Codex. So just type hi, and here we are. If you've used ChatGPT in the browser, that's all that's happening here: we're just talking to ChatGPT.

Let's get acquainted with a couple of the commands. The first one to know is /model, so we can see what model of ChatGPT we are using. There are two main options: we can either just use GPT-5, which is what you're probably most used to, or we can turn on GPT-5-Codex.

This is an important first thing to know. Even though the tool itself is called Codex, there's also a model called GPT-5-Codex. That one is really specifically built for coding, and we are going to cover it later, but it's not great at other tasks, where regular GPT-5 works best. So, we'll come back to GPT-5-Codex, but for most of this walkthrough we're just going to use regular GPT-5.

And then once you select it, there are a couple of options: basically, what thinking level do you want GPT-5 to have? This is something a lot of people give OpenAI flack for, just because there are so many options. But for this tutorial, we're just going to use medium. That's what I've used to do all the work we're going to demo here. It's a good setting because it will intelligently use more thinking power if it needs it, or if it's a quick task, it'll just do it quickly. If you really feel like you have a task that it's not spending enough time on, you can switch to high. But medium works pretty well.

And then the other commands that are good to know: if you ever want to start a completely new chat, you just do /new, and that'll start a new chat. If you're having a long conversation with GPT-5 and you're getting low on your context window, which you can see here (it tells you how much context you have left), then instead of /new you can use a command called /compact, which will basically have GPT-5 summarize where you're at in the conversation in a much shorter way, and then you can continue from there.

>> Very cool.

>> Cool. Okay. So, for this demo, I have a bunch of files prepared, and I'll make these available. We have a bunch of files, kind of like what you'd have if you were working at a real company. The demo company here is called Task Flow. It's a sort of Jira or Asana competitor, so pretty generic, but it'll give you the idea. And then what we have in the files is some data, like user interviews, and some docs, like PRDs. We'll be referencing these throughout the entire podcast.

We have this folder, which is where everything is. So, we're going to go ahead and open a new terminal at that folder. Okay, now we have the folder open here and we can get started.

So, I'm using a tool called Wispr Flow, which I highly recommend. It lets you just talk to your computer and it will transcribe, so if you see text just appear, it's probably because of that. Let's try it: "Please tell me what user interviews we have completed." Now ChatGPT is going to start searching our codebase, or really our whole folder, which isn't code right now, it's just text documents, and we can see it exploring and listing out the files.

>> This is really the power of a CLI way to access ChatGPT, or Claude for that matter, with Codex or Claude Code: you're giving it all the context of this folder. It's a lot easier than setting it up on the web.

>> Yeah, exactly. You don't have to provide these files. We're already in the place where those files are, and we can just have it look for them. So it did it. It was able to list, search, and then read, and it's telling us that we have these three interviews that were completed. Okay. And now it's actually already kind of predicting what we would want: "Do you want a quick synthesis of the common pain points?" So let's say, "Yes, please put together the top three pain points across all three interviews."

Okay, so it really tells us exactly what it's doing. It's summarizing, and here we go: the top pain points were voice input reliability, integrations, and template workflow. So now we can already start to see how it is able to use this information. So far we've seen it search for files and then read and summarize them. You can also create files: "Please go ahead and create a single document that outlines all of these pain points along with direct customer quotes." It's already done some of this analysis, so I'm guessing this will be quicker than the first time it had to read everything, but it will actually put together a new document for us. And then we can do interesting things like have it pull those actual customer quotes into that document.

>> Very interesting.

>> Planning, document creation, simplifying. We can also just look at the UI here. It tells us how much context we have left. We haven't really done much, so it still shows 100%. GPT-5 has a pretty large context window. There are a lot of jokes around what a context window really means; one of the biggest is that a 200,000-token context window and a million-token context window are the same, because of the way the AI actually uses it. But I will say that GPT-5 feels like it uses its context window pretty well. It fills up very slowly, so you can really go pretty far before it starts to forget everything.

>> Awesome. So this is the right place to handle a whole folder full of all your meeting notes, all of your performance feedback, all of your goals, strategy docs, PRDs, and results, set up so that you have a really nice context window.

>> Yeah, 100%. And for the tasks we're looking at here, classic PM tasks that are more document-based and not really about a codebase, it can really easily handle that stuff. It's when you start doing stuff with lots of code and logs that the context will fill up. But we'll be able to go very far before we have to clear it. So this is basically it showing the diff for the file it just created in this folder. Okay. Now it's telling us that it created this in data, under user interviews, as a pain point summary, and it's telling us the contents. But the question you probably have right now is: where can we actually see this? If we go into that folder we had open before, then into data, and then into... where did it put it? It said it put it in user interviews. And now we see this new file, but we can't really easily open it from here.

So, now is where we're going to switch. We're going to leave the terminal and go into an IDE, an integrated development environment. My favorite to use is Cursor, because I also use it for other types of prototyping. And even if you don't pay for Cursor, you can still use the IDE, and it's pretty nice. So, we're going to go ahead and close this terminal, but we will resume it soon. I'm going to open up Cursor here,

>> which is itself a fork of VS Code. So, it inherited all the goodness that VS Code had built up over the years.

>> Yep. And if you want to use VS Code, you can also use that one, of course. Okay. So, we're going to open a new window here, and then we will go ahead and open... This folder is the Codex PM demo, so we're going to open that folder. And what we can see right away, and I'll just go full screen, is that if we go into data and then into user interviews, we now have that pain point summary, and we can view it directly right here. So this is pretty nice. And within this, of course, this is just raw markdown, but you can always right-click and choose Open Preview, and then you can see it very nicely formatted. So here's the output.

>> Yeah. Yeah. It's very nice. And you can double-click on any of these and it will take you to that area so you can edit. But let's just quickly look at what GPT-5 produced here. We had it look at those interviews, and then open this, and it's pretty good. It's telling us the main pain points it found from before, with a summary, and then it's also giving us direct quotes from those user interviews.

>> Today's episode is brought to you by Jira Product Discovery. If you're like most product managers, you're probably in Jira tracking tickets and managing the backlog. But what about everything that happens before delivery? Jira Product Discovery helps you move your discovery, prioritization, and even roadmapping work out of spreadsheets and into a purpose-built tool designed for product teams. Capture insights, prioritize what matters, and create roadmaps you can easily tailor for any audience. And because it's built to work with Jira, everything stays connected from idea to delivery. Used by product teams at Canva, Deliveroo, and even The Economist. Check out why and try it for free today at atlassian.com/product-discovery. That's atlassian.com/product-discovery. Jira Product Discovery: build the right thing.

Trust isn't just earned, it's demanded. Whether you're a startup founder navigating your first audit or a seasoned professional scaling your GRC program, proving your commitment to security has never been more critical or more complex. That's where Vanta comes in. Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001. Centralize security workflows, complete questionnaires up to five times faster, and proactively manage vendor risk. Vanta can help you start or scale your security program by connecting you with auditors and experts to conduct your audit and set up your security program quickly. Plus, with automation and AI throughout the platform, Vanta gives you time back so you can focus on building your company. Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and prove security in real time. For a limited time, my listeners get $1,000 off Vanta at vanta.com/aakash. That's vanta.com/aakash for $1,000 off.

>> It did a pretty good job given that we had a pretty simple prompt.

>> Yeah, that is a good point, and that's something we'll look at in a second. So now that we're in Cursor, we can open up a terminal in Cursor. The command is Ctrl+backtick, and then we're here. And remember, to start Codex, we just type codex. Now we have the exact same thing we had open before, but this time we're in it right alongside the files themselves. So if you're going to be working with one of these CLI tools, just open it up in an IDE, because that's the easiest way for you to both do the work and also see what it's actually doing as it works.
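In Cursor or VS Code, the flow he describes is simply:

    Ctrl+`   # open the integrated terminal inside the IDE
    codex    # start Codex right next to the files it will be working on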

>> Okay, so we've seen it search for files, read files, and create files. Another thing you can do with Codex is have it search the web. Let's say we want to search the difference between Claude Code and Codex and have it do a summary for us. So we'll just put that in: "Please go ahead and summarize the differences between features, pricing, and ease of use." It can search the web, and it's pretty good at it. But we're going to see one of the first limitations of Codex as it starts to do this work. One difference from Claude Code that's kind of interesting: it tells you that it's working, and as it does stuff it'll tell you what it's doing, but right away it'll start to ask for permissions. So it's like, okay, can I do this search? And we'll say yes.

>> And I think Claude Code was showing you how much you were spending, right? It doesn't do that here.

>> Yeah, exactly. It's not actually showing us the number of tokens it's using. All it tells us is the context. So, one thing about Codex is that it will just keep asking for every new website that it finds. It wants to run a command so it can view what's on that website, but it will ask for every single one, even though I'm telling it you don't need to ask again. Because it's running new commands, it keeps asking me, versus on Claude Code when you do it. So, one thing that might be interesting while it's working here is that we can also run this in Claude Code.

>> I love this. Let's put them head to head.

>> Okay. So, we have Codex running here, and then let's get Claude Code running here. I have Claude on my computer as well, so I'm just typing claude, and now we have it open. Let's give it the exact same prompt: search the web to summarize the differences between the two.

>> Claude Code is "Smooshing."

>> Yeah, so ChatGPT is "working" and this one is "Smooshing." And one thing we're immediately seeing is that Claude was able to run three separate web searches at the same time. This is not exactly its sub-agents feature, but it is kind of working in parallel, whereas Codex is doing one at a time and asking for permissions for every single one.

>> Yeah. So, okay, it's going to keep asking, so this is going to take a really long time. So what I'll show is my first pro tip for Codex: we're going to exit and run Codex again, but this time there are two options. One command we didn't look at before is called /approvals. Right now we're in auto, where it asks whenever it wants to run a command or make edits. We can put it on full access mode. The first time you turn this on, it'll give you a warning like, "Hey, are you sure you want to do this?", because it gives Codex the power to just keep running commands. My advice, and I know someone's going to do this and it might mess up someone's computer, is that as long as you're doing normal stuff that's not interacting with some base level of your computer, it's pretty unlikely you're going to have any problems. And this really helps you avoid the frustration. There are lots of memes about this, where you run Codex, give it a long task, it gets started, you go do something else, and you come back and it just needs permission for some small command. If you run it with this full access mode, then it will just keep going, and it's much faster.

>> Yep.

>> Have you broken your computer so far?

>> I haven't broken my computer. And we're in this directory, so it won't leave this directory; we're not really doing anything that's too risky. So, okay, that is the disclaimer: if that happens to you, I'm really sorry, but that's how I use it. And then one thing that's funny is you can do codex --yolo, and that will automatically launch it into this mode where it won't ask for any permissions.

>> Best hack ever. codex --yolo, guys.

>> Yeah, we're in YOLO mode. The version of that that exists in Claude Code is --dangerously-skip-permissions, which I also use in Claude Code, because it really feels like magic when it can just keep doing things and doesn't have to keep asking you for permission. So, okay, Claude Code finished its answer over here pretty quickly, and even though we weren't in that mode, it didn't ask for any permissions because it's just doing web searches. Codex is still working over here. And looking at this UI, it actually tells us things like "exploring GitHub API access," so it tells you exactly what it's doing and exposes it, whereas if you saw what Claude Code did, it kind of just ran. We'll even run it again just to show it while Codex is going. Let's do the same one, and we're going to use --dangerously-skip-permissions.

So that puts it in this mode of...

>> Yeah, basically YOLO mode. So let's just run the same command while Codex is running so we can see the differences.
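To recap the permission-skipping invocations mentioned in this comparison (use with care; both let the agent run commands without asking):

    # Codex CLI: launch without permission prompts ("YOLO" mode)
    codex --yolo

    # Claude Code: the equivalent flag
    claude --dangerously-skip-permissions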

>> Is Claude Code generally faster, or is it just on this query?

>> It's generally a bit faster. That's one thing with Codex: it's just a bit slower for stuff, partly because it has a different approach where it's searching one website at a time, whereas Claude Code does three at once and exposes a lot less information. In Codex it's showing us the exact command and what it's getting back, whereas Claude Code hides more of that. And I think we're already starting to see some of the differences in design philosophy between these two products, where this one is "Mustering" and this one tells you it's looking for pricing info. Claude Code tries to just do the right things in a nicer, simpler interface, and Codex really wants to show you every single thing it's doing. The engineer-first approach, I think, is more on the Codex side, whereas maybe for product management or document-first work, Claude Code hides a bit more of it but still does it well.

>> It's almost like Apple versus Microsoft in that way.

>> Yeah. I think that's a good way to think about it. Okay. So Claude Code finished, and we had actually started it after Codex, which is still working here, but we'll let it come up with its answer. One thing you can always do, and as you start to use these tools it becomes a bit exhausting but very fun, is that while one is working on a task, you can always just run another one. So we're going to open another Codex in YOLO mode, and while the first Codex is figuring that out, we can keep going in a completely new instance of Codex over here.

>> Yep. And you did some tab management there where you linked those two terminal windows. How did you do that?

>> Yeah, great question. All I did is right-click and choose Split Terminal, and that lets you have two terminals next to each other. Or, even easier, you can just drag terminals to wherever you want them to go.

>> Nice.

>> Okay. So, let's move on to the next thing; we're kind of just doing a general overview here. You can also have GPT view images. I have this image I created for LinkedIn about how technical you need to be to use these AI tools. We'll just screenshot that and paste it in. So, as you would expect...

>> Infographic.

>> Oh yeah, this is a good one. And one thing that's good to know: because I think all CLI tools have to have consistency across Mac and Windows in terms of their commands, Command+V, which normally pastes on a Mac, doesn't work. You have to do Ctrl+V, like you would on a Windows computer. So it pasted it here. It just shows us that it came from the clipboard, along with the dimensions. "Please give me feedback on this image. How could it be more clear?" And you can paste any kind of images in here, multiple images. It's really helpful for things like error messages, or if you're trying to prototype something and the UI isn't right, or you just want to improve something. You can give it images, and it won't display those images here, but you can trust that it has them. Claude Code's implementation is slightly different: when you paste an image, it just says "image." So again we're seeing it in another way, where Claude just sort of says, "trust me, I got it," whereas Codex will give you more information, like where it came from and the dimensions. So, very different design philosophies for sure.

Mhm.

>> How would you contrast and compare the models themselves at document writing and coding? Like, what would be the scores for each?

>> Yeah. This is actually a good example from what we just did. Let's look at our question: we asked each of them to compare Codex and Claude Code, so we can look at their actual answers here. Claude Code's is a bit more formatted; we get these tags at the beginning of the different sections. And it really got that we asked for this to be for product managers, so it's giving us the difference for PMs, the pricing, ease of use, and best use cases for PMs. Then let's look at Codex: it gives a high-level overview of features, and it also has pricing, ease of use, best use cases for PMs, and quick guidance. So in terms of how they actually write documents, I like the way Claude writes in general a bit more, but we didn't really give a lot of guidance and they had pretty similar answers. You can see that Codex is just a bit more straightforward: if you ask for something, it will give you exactly that, and it doesn't even add the pre-tags to the different sections; it just bullet-points out the answers. Whereas Claude will try to format it a bit more nicely based on what you might want. So that's really the difference. It depends what you want. If you know exactly what you want and you want it to follow a specific format, Codex is probably a bit better. But if you want it to make some decisions because you're not exactly sure how you want something done, that's where Claude will try to guess, and it's pretty good at guessing.

>> I like that differentiation.

>> And that's also very interesting for coding, right? Because for coding, if you are an engineer and you know exactly how you want something to be built, that's why a lot of engineers really love Codex: it just follows the rules exactly. Whereas Claude will do what you say, but it will also usually do some extra stuff, like creating extra documentation without you asking, or adding more comments without you asking. So if you know exactly what you want, Codex is great. But if you want it to do a little bit of the thinking for you, that's where Claude Code shines.

>> Makes sense. And when Sahil Lavingia demoed Codex on the podcast, we found that it's very industrious, which I think we're seeing here as well. It works for like two minutes thirty seconds.

It'll go off and do stuff, versus the ChatGPT in the browser most people are familiar with. Unless you have that thinking time, you're not going to get that extended reasoning, and even then it's usually not going for quite as long as the CLI version will go.

>> Oh yeah, 100%. One thing with the CLI is you can give it very long-running tasks and it will just keep going. Whereas if you're using it in the browser, if you do deep research it'll do something more, but mostly it'll just try to answer your question in under a minute without really doing a lot of stuff. Whereas, because we're in the folder structure, it can go into the files, read them, write them, use the one it just wrote, run some code, and just keep going.

>> Yeah. So, if you really want that industrious version of ChatGPT, you have to use Codex.

>> Yeah, that's definitely true. And that's worth noting: we're doing a lot of comparison between Claude Code and Codex, but a lot of people don't necessarily want multiple subscriptions, or they work at a company that might only pay for ChatGPT access. So there is the comparison, but it's also just good to know what you can do with Codex if that's your only option.

>> Yeah.

>> Okay. So we're going to go ahead. That was just having it review that image, and it did a pretty good job. Next we'll move on; we're still just looking at the overall capabilities in general. We've seen reading and writing files, viewing images, and web searches. Another really awesome thing, and where we continue to differentiate from the browser, is that you can run code. Of course you can write code, but it's also helpful to just be able to run code. For this example, we have an API already installed, and I had ChatGPT install it. So you can basically imagine this is how you can have your LLM actually interface with the world and do anything you can do with code, but in the CLI. Here I found an API that can take a YouTube video and get the transcript. Normally, if you asked GPT in the browser, "Hey, here's this video, give me the transcript," it just wouldn't be able to do it. It doesn't have the tools. But here we've already given it the tool to get those transcripts using an API, so it can pull that in for us. So, we're going to say: go ahead and use this API to get the transcript for this video. This video, of course, is our Claude Code demo from last time.

And then it will basically put the transcript into a new file for us. What we see is it's going to search our codebase, find that we have that API, and then run it for us. This is where you can really, over time, continue to add tools to ChatGPT so it can use them for all different kinds of things you might want. Another example: you could give it an API to connect to your database in some way so that it could pull in data for you. Really, for any information that's outside of it, there's probably an API that would let it pull that information in and then do things with it.

>> This is a really important point. I feel like I'm going to be changing a lot of my prescriptions about when people should be using ChatGPT, to say no, you should actually be using ChatGPT Codex. Because if you want access to tools, if you want to make your LLM actually agentic, Codex is the way to do it. If you want to connect to your database, Codex is the way to do it. That's a really important point.

>> Yeah, exactly. So, we see it running here. It searched, it found the files, and now it's basically running the script. Okay. As we did before, we'll let this run, and while we do that, we can start another instance of Codex here.

Okay. So, for this one: earlier we saw that we could give it interview transcripts and have it summarize them. Let's say we had meeting notes and we want them summarized. If we just say, "Hey, summarize this meeting note with action items and decisions," it will do that, but it will always do it a little bit differently every time.

every time. And so >> what you really want, you know, if you're working somewhere, there's probably some sort of format that you want that's like consistent. So, you

know, when you you send it out to people every time, it's not different. So, in

uh this, and this is where we kind of start to see a little bit of divergence from from cloud code is there are no commands. So, you cannot add com custom

commands. So, you cannot add com custom commands to codeex. So you can't say you know in in cloud code you can say you can create a command that's like summarize meeting and then you can just

point it to those files. All that the commands in cloud code really are though is they're just other files that have instructions. And so here we have

instructions. And so here we have already created a template for meeting note summary. And so of course this

note summary. And so of course this could be anything that you wanted but here I'm saying that the the template that I want is discussion points action items risk blockers and next steps for

next item. uh and then some instructions

next item. uh and then some instructions for the summarizer. So even though we can't uh actually like have a new command, what we can do is we can say um

summarize the meeting notes from and then um we haven't actually covered this yet, but if you want to sort of target a specific folder or uh file, you just do the at sign. So

meeting notes.

Okay. So, we're we're adding the whole folder and then we're saying using and then this what we have is this template is called meeting notes summary template.

>> This is how you reference folders and files. So, if you had a PRD template, you could apply this to a PRD template. Basically anything.

>> Yeah, exactly. And we'll do a nice little deep dive into how to use this for PRDs as well. Okay. So, we said here's where those meeting notes are, and then use this template, and what it will do is follow those instructions exactly. This is how you can really start to define that behavior for the LLM over time. And it's nice because, let's say you run this and then get feedback from your manager that your meeting notes are too long, or the key decisions weren't clear enough, or you want to make sure the date is very specifically called out. You can just edit this template and either run it again for those same meeting notes or have that going forward. And again, this is another thing that's pretty hard to do in the browser. This is also a nice answer for when you're on Twitter or X and you see a really cool prompt that would help you think through something, or one of those mega prompts people post that you never know what to do with. You can store them in here as part of your templates or prompts and then trigger them super easily, so you actually have access.

>> Ah, this could be the best place to build your prompt library, too.

>> Yeah. You know, I don't know if anyone has really built a good open-source product management prompt library. That might be a good project you could pull into these and then use whenever you wanted.

>> Yeah, it'd be huge.
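As a sketch, a prompt-and-template library inside a project folder like this one could be organized along these lines (the layout is illustrative; the demo's actual folder names may differ):

    codex-pm-demo/
        data/
            user-interviews/
            meeting-notes/
        docs/
            business-info.md
            example-prds/
        templates/
            meeting-notes-summary-template.md
            prd-template.md
            socratic-questioning.md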

>> Today's episode is brought to you by the experimentation platform Chameleon. Nine out of ten companies that see themselves as industry leaders and expect to grow this year say experimentation is critical to their business. But most companies still fail at it. Why? Because most experiments require too much developer involvement. Chameleon handles experimentation differently. It enables product and growth teams to create and test prototypes in minutes with prompt-based experimentation. You describe what you want; Chameleon builds a variation of your web page, lets you target a cohort of users, choose KPIs, and runs the experiment for you. Prompt-based experimentation turns what used to take days of developer time into minutes. Try prompt-based experimentation on your own web apps. Visit chameleon.com/prompt to join the wait list.

Today's episode is brought to you by Naya One. In tech buying, speed is survival. How fast you can get a product in front of customers decides if you will win. If it takes you nine months to buy one piece of tech, you're dead in the water. Right now, financial services are under pressure to get AI live, but in a regulated industry, the roadblocks are real. Naya One changes that. Their air-gapped, cloud-agnostic sandbox lets you find, test, and validate new AI tools much faster: from months to weeks, from stuck to shipped. If you're ready to accelerate AI adoption, check out Naya One at nayaone.com/aakash.

Today's episode is brought to you by the AI PM Certification on Maven, run by Miqdad Jaffer, who is a product leader at OpenAI. This is not your typical course. It's eight weeks of live cohort-based learning with a leader at one of the top companies in tech. OpenAI just doesn't stop shipping, and this is your chance to learn how, run along with Product Faculty and Moe Ali. The course has a 4.9 rating with 133 reviews. Former students come from companies like OpenAI, Shopify, Stripe, Google, and Meta. The best part: your company can probably cover the cost. So, if you want to get $500 off, use my code AA25 and head to maven.com/product-faculty. That's maven.com/product-faculty.

>> Okay, so we see that it's working. Let's go back to that first one with the YouTube transcripts.

>> So, to review here, Codex can connect to APIs. Have we connected to the YouTube API here?

>> It's not the official YouTube API. It's one that someone else built, called the YouTube Transcript API, and it's using that.

>> Cool.

>> Okay. So, it saved this transcript to data/YouTube. Let's go into that data/YouTube folder. "I should be charging $999 for this Claude Code tutorial." And then, where is this? Okay, and we have that full, entire transcript here. So it basically worked, which is cool, because this would be very hard information to get into a file you could then use. Now, of course, we could have it summarize this or do interesting things with it, without having to copy and paste it in or something.

>> Yeah, you literally can't get ChatGPT to go pull a YouTube transcript like this otherwise. It struggles with browser use and everything like that. So, this is a way to connect it to things you can find via APIs, which is very powerful.

>> Exactly. Cool. Okay. And then it created the summaries, putting them into meeting notes summaries. And we see that we have all those meeting notes following that exact template we gave it before.

>> Very cool. So it connected to APIs here. It used our template to create meeting notes. These are some crazy use cases compared to regular ChatGPT.

>> Yeah. And now what we'll do is take the things we've used and put them together in a pretty powerful way. We'll launch Codex again and go through a quick tutorial on how you can really use it. So far, we've just been giving ChatGPT basic things we know for sure it can do: summarize this meeting transcript, use this code. But one thing that's really useful about having it in this interface is that we can actually have a conversation with ChatGPT before we create the document. If you need to do some thinking, GPT can be a really good thinking partner, where it asks you questions about the PRD you're trying to build before actually creating it for you. Because if you just ask an AI to build a PRD, it will do it, but it won't be very good. Most of the way you write a good PRD is with good upfront thinking as a product manager, so that all of that thinking is embedded in the document.

>> Yes. Exactly. When your company gives you that template, they're actually asking you to think about it, not just prompt AI for it. AI will just make some weird assumptions.

>> Exactly. And here's where you can do some cool stuff: you can have company context in your folder. So here, in our docs, we have business info. This tells the AI what our company actually does. Especially if you work at a startup, or even a specific part of the company you work for, you can spell out what you're actually trying to accomplish, so that when it's writing that PRD it has much more context than if you just say "write this PRD" without it understanding all the other stuff. And this really helps: it gives it the context for questions to ask you about how you can improve your PRD. Mhm.

>> So, first, just to make sure it understands its context, we will say, "Okay, read this doc for business info and give me a summary." And then next, and this is completely different from anything we covered in the Claude Code tutorial, and something I've been playing around with that has been pretty effective: in our templates we have something called Socratic questioning. This is just one example of how, when you find one of those really interesting prompts online, or something you think would be helpful, you can actually use it. In this case, I saw one about Socratic questioning, which is basically where the AI asks you really open-ended questions that don't have a right or wrong answer, but are the types of questions that come up when you're in a meeting with your manager or some execs and they ask, "How do you know that this is really the right problem to solve?" or "Why do you think we should do this when our competitor is doing that?" It will guide you through those types of questions, both to help you think about them, which is the most important thing, and to think through those problems, and then ultimately embed those answers in the PRD when it goes to write it.

>> Love it.

>> So what we're going to do is say, "Please read the Socratic questioning prompt and then give me some questions about..." and in this case we're going to go through an example where we're creating a feature for a voice chat with your to-do list. That might be useful for someone who's out in the field with a bunch of stuff to mark off as they complete it, but it's hard for them to get to a computer interface. So right now we see it opening up that questioning template, reading it, and then it understands the context from our company from what we gave it before.

And then it will ask us some questions.
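For a sense of what that stored prompt could contain, a Socratic questioning template along the lines described here might look something like this (illustrative wording, not the exact file from the demo):

    templates/socratic-questioning.md
        You are a Socratic thought partner for a product manager.
        Read the business context first, then ask open-ended questions, one at a time,
        about the feature idea I describe: the problem and the evidence for it, the target user,
        why now, risks and trade-offs, and what V1 must get right.
        Do not write the PRD yet; wait until I say we're done with questions.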

And of course, this is where, when you start, you'd probably have a lot more ideas for it. One tip I definitely have, and this is something people talk about commonly, is that with tools like Wispr Flow you can just talk and explain your idea without having to worry about formatting at all, because it will all get accurately transcribed. LLMs are really good at taking brain dumps, or any amount of context, and understanding what you're actually trying to communicate so they can use it. So a really good way to start your PRD is to just explain the whole feature and what you're trying to accomplish, and then it will use that to ask you good questions.

>> And you can even put together a list of things to remind yourself of as you're speaking into Wispr Flow: the problem, the solution, discovery you've done so far, other features you've launched. That can really help you create the right context.

>> Yep. Just kind of stream of consciousness is sometimes a good way to go. Okay. So, the mode I have it in right now is asking multiple-choice questions. And one thing I'm going to do, just to make this quicker because the exact output here isn't too important, is switch to a lower model to keep it quick. But basically it'll say, okay, here's a question. Its first question is asking us why we think this voice to-do-list feature would be helpful. Is it quantitative, meaning do we have data for this? Is it qualitative, something we've heard from user interviews? Or is it strategic, meaning we just feel like we need to have this feature? So, we looked at our data from before, and we saw that our user interviewees were pointing to this being a good feature, so we'll go with B, and it will just ask us a few more questions. And again, the exact answers here aren't that important, so I'm just going to put them in, but this shows the idea of how it can ask you pretty in-depth questions to really challenge your thinking.

>> Okay.

>> People underrate the LLM-as-thought-partner use case, but here, especially because we gave it the context, it's pretty powerful.

>> Yeah, exactly. And it's actually asking us a lot of questions here, following up on the ones we just answered. It's asking: when is the best time to launch it? What has to work reliably for V1? How should we handle the MVP scope? What should we do when we have really low-confidence transcripts? What fields do we need to start with for V1? Can it just be title and due date, or does it also need an assignee? These are the types of things where, hopefully, you're already thinking about them as a PM, but there's a lot you have to manage, and every product can be much more complex than you expect. Having this LLM really challenge you on all these different aspects, both to remind you and to give you options to think through, makes it so that once you present your work to your engineers or your designer or your manager, it's much more thought through.

>> Yep.

>> Okay. So I don't really want to answer all these questions for this fictional product, so we'll move on to the last step. Imagine we're in a world where it understands our business context, we've been talking to it for an hour answering all these questions, and now we're ready to have it actually write the PRD. We can do a similar thing to what we did before, where we say, "Hey, I have this PRD template I like." And another cool thing you can do: let's say you're working at a company where there are some really killer PMs who write amazing PRDs. Maybe they're also using ChatGPT, maybe they're also using Codex, but you've just seen their PRDs and they're great, and you want to model yours after theirs. You can also point to those existing versions and have it create another version for you. Of course it's not copying any of the language, just the style, and that's something you can pretty easily do with this.

So, pulling it all together, this last step is where we say, "Please review the three PRDs from this folder and then create the final PRD." So: "Please use..." and then we're going to go ahead and add the PRD template.

>> So that's the ideal PRD structure: first inform it of the context in the folder, have it review the template, have it do the Socratic method with you, then have it review the example PRDs, and finally create it.

>> Yes. Exactly. Perfect. And then let's put in this last thing from docs: the example PRD. Let's say your manager, before they got promoted, wrote a really killer PRD that also makes a lot of sense for the feature you're building. We'll say that's this mobile workflow editor. So: "Please use this template and then this doc as inspiration to write our PRD based on everything we discussed above, and put this into a new MD file."

>> And it's crazy how much faster it is dictating.

>> Yeah. Every time I start typing, I'm like, "Oh." Also, I have a weird weakness: I'm not a super slow typist, but whenever I'm screen sharing, I am just the absolute worst typist anybody has ever seen. That's something people have actually commented on at work, like, "Carl, do you know how to use a keyboard?" So that's why I'm extra grateful for voice dictation in these podcasts, for sure.

[laughter]

>> I think the 16K views in seven days begs to differ, but a lot of dictation did help there.

>> Yeah, for sure. Okay, great. It's pretty fast at this point, and it's putting it into our whole format here. So this is how the output you can get for a PRD, just by going through this process versus anything you can do in the browser, goes from a sort-of-there PRD you're going to have to edit a lot, to a really good PRD that, almost right out of the box, is going to be almost everything you want. Because you've done all that actual thinking with it and it's just embedding that, versus it producing something and then you having to think and respond. And I think that's the classic question: does AI really speed you up if it creates something but then you have to spend the same amount of time fixing it?

>> Cool.

>> Yeah. No, definitely not.

>> That's the PRD workflow.

>> Yes. Okay. So, sometimes you just have to ask your AI, "Where did you put it? What did you do?" Oh, okay, it's open. "Please move it there." Also, I didn't really proofread what I dictated before: I said to call it "new MD," so it did exactly that instead of making it a new .md file. Oh, but here it is. Okay, so let's just look at this. Now we have this very nice PRD: customer problem, business impact, goals with our metrics, and these are all tied to things the company actually cares about from that folder we had before. Target users. This is a really, really nice PRD.

>> A lot better than what you're going to get if you just say "write me a PRD" and give it some context.

>> Yep, exactly. Okay. So we're getting pretty close to the end of what you can do with Codex in the CLI. We've seen most of the stuff: we've seen it search, read, and write files, create new documents, be a thought partner, and we've seen how you can store prompts and then use them. In some ways it's nice because there are no super complicated workflows, and we'll talk about Claude Code and how that compares in a second, but really it's just files on your computer. You can save prompts in those files, you can save templates in those files, and you just reference those files when you need them. That's why, like we talked about already, there are no custom commands in Codex, but all a command really is, even in Claude Code, is basically pointing to a markdown file. So it can feel a bit less fully featured, and there are some areas where it is less fully featured, but I actually think there's some niceness to how they haven't overbuilt a bunch of features that are basically just pointing it to files.

>> Okay. So we already covered that there are no slash commands. Another thing Claude Code has that is pretty nice and does not exist in Codex is planning mode. Really quick, just as a reminder: in Claude Code, planning mode is sort of the opposite of YOLO mode. It's "hey, do not do anything": it literally can't make any changes to any files when it's in plan mode. So what Claude Code will do is you'll have a conversation with it, it will tell you what the plan is, and you can modify the plan before it starts. Then it will actually show you the checklist. And they're starting to do some very cool things with that feature, where if there are gaps in the plan or decisions that have to be made, it will give you multiple-choice options, like "hey, how do you think we should handle this edge case?", and then it says you could do this, or this, or you can type your own thing. So that's pretty nice. Otherwise, this type of feature doesn't really exist in Codex, although it sort of does when we go to those approvals. With the slash approvals command: we started with auto, which is where we started this podcast, then we moved into YOLO mode, which is full access, never ask for permissions. You can also put it in read-only mode, and that's exactly what we were talking about, where it can't change files or start doing anything, because LLMs are built to serve; they just want to start doing stuff. So this stops them from doing that. But I would say it's not as fully featured as Claude Code's. If you need to make a plan and you don't want it to start changing things, you just tell it what you're trying to do and it can't start coding, which is helpful. And so this is the closest you can get with it right now.

>> Got it.

Okay, so the big thing, the most important thing Codex is missing that Claude Code has, is agents. When you do something in Claude Code, you can ask it to spin up sub-agents, and what it does is basically create clones of itself with different tasks, and then it can parallelize work very powerfully. Codex doesn't have anything like that out of the box. So what we're going to do is use an example where we say: hey, use that YouTube transcript API that we talked about before. I want you to search for the best ways to use the Codex CLI, I want you to get three videos, I want you to run them all in parallel, and then create a summary document. If we told that to Claude Code, then out of the box it knows how to do it. There is a kind of hacky workaround that we can use to do that in Codex.

How do we do that in Codex?

>> So, as we've seen, you can have Codex running multiple times. We can run it here and give it a task, and then we can go to another terminal, run it there, and give it a task. What you can do is have Codex do that on its own. It will basically make terminals you can't see in the background, give them those tasks, they can run in parallel, and then it can collect the outputs from all of them and put them back together. Functionally that actually works pretty well, and it's similar to Claude Code's agents, but it is definitely a workaround. It's not a first-class feature like it is in Claude Code.

>> So how do we do that?

>> Yeah. So, like with most stuff, you can basically just tell it to do it. We'll start a new chat here and give it that exact command I just said. So we're going to say: I want you to find three recent videos about the Codex CLI on YouTube and then create summaries in parallel. Here's the approach. The first time you do this kind of approach it will have to write the script for it, but going forward it can reuse it more easily. So: search YouTube and give me three video URLs, then write a script that runs them in the background, uses YOLO mode so you don't have to worry about permissions, gets a transcript, creates a summary, and then puts them all back together. And actually, this is a good one. Let's show the difference between Claude Code and Codex for this.
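For a rough sense of the kind of fan-out script Codex tends to write for this workaround, here is a minimal sketch in Node/TypeScript. It assumes the CLI's non-interactive `codex exec` mode and a `--full-auto` style flag; the prompts, URLs, and file names are placeholders, not what Codex generated in the demo.

```typescript
// Hypothetical sketch: fan three video URLs out to background `codex exec`
// runs and collect their summaries. Flag name and prompts are illustrative.
import { execFile } from "node:child_process";
import { promisify } from "node:util";
import { writeFile } from "node:fs/promises";

const run = promisify(execFile);

const videoUrls = [
  "https://www.youtube.com/watch?v=VIDEO_ID_1", // placeholders, not real IDs
  "https://www.youtube.com/watch?v=VIDEO_ID_2",
  "https://www.youtube.com/watch?v=VIDEO_ID_3",
];

async function summarize(url: string, index: number): Promise<string> {
  // Each call is its own Codex session running without approval prompts,
  // which is what the "terminals in the background" workaround amounts to.
  const prompt =
    `Get the transcript for ${url} and write a concise summary ` +
    `of the best Codex CLI tips it covers.`;
  const { stdout } = await run("codex", ["exec", "--full-auto", prompt]);
  await writeFile(`summary-${index + 1}.md`, stdout);
  return stdout;
}

async function main() {
  // Promise.all is the "run them in parallel" part.
  const summaries = await Promise.all(videoUrls.map(summarize));
  await writeFile("combined-summary.md", summaries.join("\n\n---\n\n"));
  console.log("Wrote combined-summary.md");
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```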

>> So, okay. And then let me make sure we're in the right mode. Okay, good, they're both in YOLO mode. I'm going to change the prompt a little bit for Claude Code because it doesn't need the script. Instead of writing the script, we'll just have it spin up three agents in parallel to create summaries of each video.

Okay. So now we'll really see the difference in how these two approach this. For Codex, there are some things it has to figure out, like exactly how to do this, whereas we see Claude Code just starts working. First it's searching on YouTube to try to find those videos we can use for the API. It did a search, didn't find anything, but it'll keep thinking. And yeah, it's interesting: Claude Code can be a bit inconsistent here. Okay, so Claude Code got confused; it couldn't actually find those videos, whereas it looks like Codex already found them. I'm not sure if it's because Claude Code has some sort of limitation for searching YouTube. I think that's definitely possible; a lot of websites these days are trying to block agents from getting in because of all the data extraction. But let's see if it finds them when we run this again. Okay, so with Codex we saw that it did successfully find three videos, and now it's creating that script to run itself in parallel, so it's planning that. And okay, great, this time Claude Code did find them.

>> So yeah, that's one thing about working with LLMs in general: they're never deterministic, which is the beauty of them and kind of the pain of them as well.

>> Yep. And the fun of live demos.

>> Yeah, and the fun of live demos. Some of this stuff I have kind of on the rails, and some stuff is like, okay, I'll have tested it on my own before, but you never know exactly what will happen when you actually run it. Okay, so I actually would say that so far I think Codex is ultimately going to be slower, but it's doing a better job. It found more relevant videos, whereas we see Claude Code is starting to do some crazy stuff. Our original prompt was to look at the Codex CLI, and what we see here is it's going way off the rails, pulling in Medium articles. It's no longer on YouTube. It found a Claude Code tutorial for building a YouTube research agent, which is obviously way off what we want. I think what it's doing is looking at these websites and trying to find the YouTube videos, trying to find three. So, it did okay. It did eventually get there. It didn't go completely off the rails, but it definitely took a lot longer than I would have expected to find them.

>> Yeah. But it's also cool that it went through the process of, "oh, this isn't going to work for me, let me go find something else."

>> Yeah, definitely. There's that saying that you can feel the AGI. I think it's in the moments where it navigates its own problems that you start to feel it. A big one for me was working with Claude Code the other day: it committed its work before it started, then it tried to do something, realized what it did wrong, went back to that commit, erased the bad work, and said, okay, let me go back and do it the right way, and the second time through it got it right. So these tools are getting pretty smart. They're doing some surprisingly human behaviors, because that's exactly how you'd handle it yourself: "Oh, okay, we were doing it wrong, but we figured it out. Let's go back and try it again the right way." They're starting to do that on their own.

>> Okay. So, yeah, I think Claude Code had a slow start, but it has now started the tasks to run the YouTube transcripts, and all three agents have compiled their summaries in parallel. Now it's combining them into one doc. So I think Claude Code, at the end of the day, is going to win this agent race, which makes sense because agents are so much more baked in and first class there. Mhm.

>> And while this is happening, the last good thing to talk about in comparing these: what we just did was generic agents, sub-instances of the main thing that it can give its own rules to. But there are also dedicated sub-agents, which we saw in Claude Code, where you can define how each one works and what tools it has, like an executive reviewer, a design reviewer, or an engineer. That exists and is pretty first class in Claude Code, but there's really no version of it in Codex. The closest you could get is to make a file, like everything else, that defines how it works, but in Claude Code you can give them specific tools and just much more configuration. So I wouldn't be very surprised if we saw something like that coming out of OpenAI for Codex soon, but as of right now there's just no parallel.

>> So in general, Claude Code is more agentic.

>> Yes, for sure. Okay, so Claude Code is just about done here; actually, it looks like it did finish. So Claude Code won, but Codex is not too far behind. It's now at the step where it has all the transcripts and the summaries and it's putting them all back together. So that is basically everything to cover in terms of the main PM tasks for these two tools. Honestly, it was a pretty close race overall, despite the completely different approaches. In general, Claude Code is kind of leading the whole CLI interface; I think they were the first ones to really come up with the idea of trying it, so at this point they're pretty far ahead and other teams are playing catch-up. But I wouldn't be that surprised if, once OpenAI gets to these sub-agents, Codex could do this task faster than Claude Code.

>> And which one was higher quality?

>> Yeah, let's look. Okay, so this one. The Claude Code one is in the YouTube code data folder. Okay, so this is Claude Code. It's showing each video step by step: highlights, key takeaways, video two, video three.

>> Pretty good. Kind of verbose.

>> Yeah, they're pretty verbose. Let's see how this compares to what Codex did. So yeah, pretty verbose. I'd imagine these are probably pretty good summaries; it's calling out some very specific stuff, like the "reverse DeepSeek moment." So it looks quality. I don't think we want to read through all of them right now, but let's just look at a high level at how they look, starting with the document location.

Okay. Final report output. Okay, here we go. And then here is the Codex one; open the preview here. Okay, so we definitely see a totally different style, right? The Claude Code one was broken up into sections, gave bullet points for each one, and then gave a summary of everything at the end.

Whereas this Codex one, it looks like it has

>> a formatting error.

>> Yeah, some weird formatting.

>> We could probably ask...

>> Not very good. Yeah, the Codex one, of course, also chose different videos.

It's got a lot of stuff like "get instant access to all premium courses" in it. It doesn't seem like it did a good job of actually summarizing the content.

>> Yeah, this doesn't look good at all. I think something may have gotten messed up in the execution of the sub-agents. So for right now, parallelizing tasks the way you can in Claude Code is just really not that great in Codex.

>> Mhm.

>> Okay. One thing though, and this is where we'll transition to the last part: we've been a bit unfair to Codex so far, because we're comparing on classic PM document-based tasks, and overall I would say Claude Code is probably a bit faster and has a few more features that are really first class. What we haven't talked about, and what Codex is actually built for, is coding. So for this last part, let's cover some of the main things you can do with Codex. The really great thing about Codex, especially if we switch over to the GPT-5-Codex model, is that it's really good at doing much more complicated features in ways that Claude Code, or really any other coding agent, hasn't achieved yet. We'll talk about how you can take advantage of this super big brain that will do an incredible amount of work, and make sure that work is actually quality.

>> So we're transitioning from classic PM tasks into prototyping, and I'll give an example of something I have built. We're going to move out of this repo, since that covers everything we had for it, and open a new project here.

Okay. So, this is something I've been working on. It is not related to the task flow app at all; this is something different. It's called the TikTok recipe bot, and I'll show how it works. Here's the prototype I've built, and we'll use it to give some examples. Basically, let's say you're on TikTok and you find one of these really good-looking videos of something you want to make. These recipes can be really frustrating to actually make, because you look at them

>> and they sort of show the steps, but they never really give you the actual recipe, or you get the classic engagement tactic, right? "Comment 'recipe' and I'll DM it to you." I think we're all familiar with that tactic, but how do you actually make the recipe? So this is something I built for my girlfriend, because she goes to actually make them and they're very frustrating to follow. What this does is you give it a TikTok video and it extracts the recipe. What's happening in the back end is it downloads the video, using APIs that already exist for that, and then gives it to Gemini, which I think right now is the only model that can consume video, which is actually pretty awesome. You can give it a video and it will actually watch it and understand the sound and everything that's happening, and it also grabs the caption in case there are instructions there, and then it formats it into this really nice recipe.

>> Wow.
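Under the hood, that pipeline is roughly "download the video, hand it to Gemini, format the result." Here's a minimal sketch of the middle step, assuming the @google/generative-ai SDK's inline video input; the model name, prompt, response handling, and the already-downloaded video file are assumptions for illustration, not Carl's actual code.

```typescript
// Hypothetical sketch of the extractor's back end: read a downloaded TikTok
// video, pass it to Gemini as inline video data, and ask for a structured
// recipe. The downloader and the JSON contract are assumed.
import { readFile } from "node:fs/promises";
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-pro" });

export async function extractRecipe(videoPath: string, caption: string) {
  // Assumes the video was already downloaded (e.g. via a third-party TikTok
  // download API) and is small enough for inline data.
  const video = await readFile(videoPath);

  const result = await model.generateContent([
    { inlineData: { mimeType: "video/mp4", data: video.toString("base64") } },
    `Watch this cooking video. Using the footage, the audio, and this caption:
     "${caption}", return the recipe as JSON with "title", "ingredients"
     (name, quantity, unit), and numbered "steps".`,
  ]);

  // Assumes the model returns bare JSON; real code would need to strip any
  // markdown fences before parsing.
  return JSON.parse(result.response.text());
}
```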

>> And so this was something I did build with Codex, and I learned a couple of very important things about how you can use these coding agents to do much bigger amounts of work at once. Oh, and you can also download the PDF. I'm pretty happy with this feature; I feel like it turned out really well and I learned a lot about coding with AI from it. The most frustrating thing about using these agents to code, especially if you try to have them one-shot something, where you give them a task and want the output the way you actually want it, is that you tell it to do something, it gets completely messed up somewhere in there, and then pretty much no matter what, it will say that it did it right. It will tell you, "I completed the work, the feature works perfectly," and then you go to test it and it looks bad, it doesn't perform well, or it might not even work at all. So what we'll cover here is how you can create really good specs up front so that when the AI does go do the work, it's much higher quality.

>> So there are two aspects to it. One is the design. One thing that comes up a lot is that AI can kind of do UI, but it will never be that good or that consistent. We'll cover how you can make the designs of what the AI actually builds look good, and then how you can have it test itself. That's what I think really starts turning a vibe coder, a product manager builder, a full stack PM, from someone who's building pretty hacky prototypes into someone who's doing some real vibe engineering, and that's what we'll cover here. So let's start with the design stuff. I have a couple of slides here. What you really need, in order for AI to design stuff well, is a design system. I feel like most product managers are very familiar with the concept of a design system.

You're at your company and you want to build a new feature. When your designer goes into Figma to create the new designs for that thing, they're probably not starting completely from scratch, right? They're not going into Figma, drawing a new rectangle, rounding the corners, and figuring out what color to use with the eyedropper tool. All of that is already defined; they have some sort of component library. So what you want when you're designing with AI is to give it the exact same things a designer at your company has. You have variables it can use for those design decisions, so it already knows what the colors are and how the spacing and things like that work. You want a component library. And then what we're going to look at here, and I'll demo, is the best way to store these things; a lot of PMs' engineers are probably already using it at their company. It's called Storybook. You can combine all of that into a component registry, so when the AI goes to build something, it already knows all the variables it can play with and the components it has access to. You can see those in Storybook, so you can actually see them before the AI goes to build anything, which is exactly how things are built regularly, right? Your designer creates the Figmas, you review them in your team meetings, and you agree on what you're going to build and how it's going to look before the engineer ever starts building it. And that's the mindset shift if you really want to move from just being a vibe coder to doing some vibe engineering: you're probably already familiar with the process because of how you do it at your own company.

>> Yeah, totally.
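To picture the "variables for design decisions" part, a token file can be as small as the sketch below; the names and values are hypothetical, not from Carl's project, but they show the single source of truth the AI reads instead of inventing colors and spacing per feature.

```typescript
// Hypothetical design tokens: one place the AI (and Storybook) can read
// colors, spacing, and radii from, rather than picking values ad hoc.
export const tokens = {
  colors: {
    primary: "#ec4899",   // example brand pink
    surface: "#ffffff",
    textMuted: "#6b7280",
  },
  spacing: { sm: "0.5rem", md: "1rem", lg: "1.5rem" },
  radius: { card: "0.75rem", badge: "9999px" },
} as const;

export type Tokens = typeof tokens;
```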

>> Okay. So first let me just show Storybook. If you have an existing thing you've already built, you can just ask your AI to add Storybook. What we're going to do here is run npm run storybook. I already have it set up, but when I set it up with Codex, with GPT-5-Codex, it literally one-shotted it; this is a very standard thing in the web development world, so AI is quite good at it. What it's doing here is running almost a secondary website just for showing these components. So if we look again at what this extractor looked like, we have the section where you put the URL, the section that actually shows the recipe, and then the section for the different creators that are already good. Each of these is a component. And if we go into our component library, we see we already have them created here, so that when the AI goes to build something, it already knows what the pieces are. It even knows the different variations of those components.
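For anyone who hasn't seen Storybook, a story file for a component like this might look roughly like the sketch below. The RecipeDisplay component, its props, and the two variations are hypothetical stand-ins for whatever is actually in the project.

```tsx
// Hypothetical Storybook stories for a RecipeDisplay component (CSF 3 format).
// Each named export is one variation that shows up in the Storybook sidebar.
import type { Meta, StoryObj } from "@storybook/react";
import { RecipeDisplay } from "./RecipeDisplay";

const meta: Meta<typeof RecipeDisplay> = {
  title: "Recipe/RecipeDisplay",
  component: RecipeDisplay,
};
export default meta;

type Story = StoryObj<typeof RecipeDisplay>;

// The "normal" state the AI should match when it builds new screens.
export const Default: Story = {
  args: {
    title: "Chicken Alfredo",
    ingredients: ["chicken breast", "fettuccine", "parmesan", "cream"],
    steps: ["Sear the chicken.", "Cook the pasta.", "Combine with sauce."],
  },
};

// An edge-case variation: what the component renders when no recipe is found.
export const EmptyRecipe: Story = {
  args: { title: "", ingredients: [], steps: [] },
};
```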

And similarly for the recipe display, we already have it built here. This just gets rid of so many frustrations for vibe coders, because normally you say, "Hey, I want it to look like this," it builds it, and then I'm like, "Okay, now can you make the title pink so it goes better with the rest of the format?" If I were iterating on the live site, it would make the change, I'd have to rerun it, let the LLM extract a recipe again, and then I'd see it's the wrong shade of pink or something. What's nice is that when you have this component library, you can make edits to how things are going to look without having to worry about any of the actual backend code, and you can see it live right here. So as an example, this component is called recipe display, and what we can do is tell the AI. We'll use Codex here.

Okay. So: please update the title. This is actually a good one where, before we run it, I'll point out that when you're doing something completely new, you'd want to use that planning approach we talked about, just to make sure it knows what it's supposed to do and where things are in the codebase before you let it try. I know because I ran this practice part of the demo earlier and it did some wrong stuff, but it works better when you ask it to find the code first. So: please update the color of the title in the recipe display component to be pink. Please locate the code first and share your plan. This should be reflected in Storybook when I see it. Okay. So now Codex is going to go in here and... oh, actually, one last thing: we still have it on the weaker model from earlier.

So let's go ahead and turn on the Codex model now. We're going to put it on GPT-5-Codex at medium and then run that same command. So now is the first time in this demo that we're having it use the version of itself that is built for code.

>> And the main difference here is that the post-training is really optimized for code and large tasks.

>> Yeah, exactly. A lot of people really like it because it is so optimized for code. Oh, here we go. This is the first time we've actually seen it display its checklist. I wonder if it's because we're on the Codex model; I'm actually not sure, because we didn't see it in the rest of this demo, but this is where it actually shows its plan, which is a very Claude Code-like feature. Okay, so it said it found the recipe display component. Great, this is exactly what we wanted: it found where that color is defined as gray, and we're going to turn it pink. Okay, so now we trust it; we believe it's doing the right thing, so we'll go back to YOLO mode. And then: okay, please make only that change. Now we'll go ahead and do it. A natural question you might ask is how you get components into your library in the first place. But first, oh great, okay, we're back here and we can see the design was changed to this pink color

>> without us having to run it in the actual code. So next time we run this, it will be pink; we can trust that's happening. That's just a really good way to iterate on designs quickly without having to worry about generating them each time.

>> Okay. So the question might be: how did I get this stuff in here? There are a couple of different approaches. One is you can just tell the AI, "hey, I need a component for this, put it in Storybook," and then you'll see it here and can iterate on it. But most of the time, for most things people are building, you're not really reinventing the wheel. If you're building something more complicated, say you need a calendar UI in your product, that can actually be a very complicated feature: if you want to select a range of dates, have the keyboard work, and have it be accessible, all of that can be unnecessarily difficult to build, because really nice component libraries already exist. The most common one these days is shadcn. That guy is also really funny on X. I don't know if he has a team or exactly what, but he has this amazing website with all of these different components pre-built for all kinds of things. So for the calendar I just mentioned, instead of building one on your own with AI, you can use these pre-built components, which are built in exactly the way AI wants, what's called being headless. They give you the structure, but you can grab all the actual design elements and change them to whatever works for your product, or whatever colors you use rather than these. All of the logic, all of the difficult things, you get for free if you add these to your component library.

>> Very cool. So you're basically defining these components using a system built on top of React that lets you interface with AI better and have it adopt your design system.

>> Yep, exactly. So when it goes to build something and you say, "Hey, we want to add a calendar, use this component," that gives it enough direction on how to use an existing thing and enough flexibility so it works for your specific code. Here's an example where there are all these different variants of it: a range picker, a month-and-year selector if you want it like that, all these different versions that might make sense for your product. You don't really need to reinvent them; there are already really nice-looking places you can get them. Once you find one you like, you can either copy and paste the URL right into the LLM, and it knows how to add it, or you can run a command that adds it, and it instantly goes into your project. For the feature we're going to demo in a second, I pulled in this badge one just to get a bunch of variations of the badge. And what we see here is a bunch of badge variations for these tags we're going to add to recipes. That was literally like one minute of work.
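In code, reusing a shadcn-style badge for those recipe tags might look something like the sketch below. The "@/components/ui/badge" import path follows shadcn's usual convention; the MacroBadge wrapper, its props, and the variant choices are assumptions, not the demo project's code.

```tsx
// Hypothetical wrapper around a shadcn-style Badge, reused as a recipe tag.
import { Badge } from "@/components/ui/badge";

type MacroBadgeProps = {
  label: string;           // e.g. "Protein"
  value: number | null;    // per-serving amount, or null if missing
  unit: string;            // e.g. "g" or "kcal"
};

// Renders "Protein: 42g", or "Protein: N/A" when the data wasn't available,
// so a missing macro never breaks the layout.
export function MacroBadge({ label, value, unit }: MacroBadgeProps) {
  return (
    <Badge variant={value === null ? "outline" : "secondary"}>
      {label}: {value === null ? "N/A" : `${value}${unit}`}
    </Badge>
  );
}
```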

>> Wow.

>> Okay. So that's how you build out your design system, and only when you need something new do you build it. The philosophy is that you try to build things with the components you have and then use shadcn. I wonder if I'm saying it right; I bet there's another way to say it. Maybe people can let us know in the comments. Install components as you need them, add them to Storybook, and your system just grows over time. Then you can trust your AI to use those components as it's building things, and you can pretty much guarantee they'll actually look nice.

Okay, so that's how you get a nice design system. The last thing we're going to cover today is how you can get AI to test its own code, and it's interesting because it's similar to the design process: you just emulate the existing process that companies already use and that works really well, but for coding. This is called test-driven development, and it's really where you start moving from being a vibe coder who doesn't understand what's going on to being a vibe engineer who really understands what's happening and builds actually quality products. The main philosophy behind test-driven development is the red-green-refactor flow. The way this works is that red means before you build your feature at all, you write tests for that feature that you know are going to fail, because the feature doesn't exist yet. As an example, for this product we don't currently calculate macros, so we'll build something that adds protein, calories, carbs, and fat to the recipe so that when you see it, you get that information. An example of a test that would fail for that: you'd say, okay, when we call the API for a recipe, we should get calories back, and right now that fails.
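A "red" phase test for that macro feature could look something like the sketch below, assuming a Vitest-style runner. The calculateMacros helper and the expected shapes are hypothetical; the whole point is that it fails until the feature exists.

```typescript
// Hypothetical "red" tests: they describe the macro feature before it exists,
// so they fail until calculateMacros is actually implemented.
import { describe, expect, it } from "vitest";
import { calculateMacros } from "./macros"; // does not exist yet: red phase

describe("calculateMacros", () => {
  it("returns per-serving calories and protein for a normal recipe", async () => {
    const macros = await calculateMacros({
      servings: 4,
      ingredients: [{ name: "chicken breast", grams: 400 }],
    });
    expect(macros.calories).toBeGreaterThan(0);
    expect(macros.protein).toBeGreaterThan(0);
  });

  it("does not throw when an ingredient is missing nutrition data", async () => {
    const macros = await calculateMacros({
      servings: 2,
      ingredients: [{ name: "mystery sauce", grams: 50 }],
    });
    // Missing data should surface as null, not halt the whole flow.
    expect(macros.fiber).toBeNull();
  });
});
```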

What writing the failing tests first does is help the AI be honest about where it's starting, because we wrote these tests that are failing and it has to make those failing tests pass. That is really critical, because one thing AI will do, and loves to do, is just want to make progress. It wants to make you happy so badly that sometimes it will do crazy stuff when it's writing tests. There are a bunch of examples people have shown where it tries to make a test work, can't get it to pass, and just hard-codes the value to true so that it passes.

>> Oh my gosh. Devious.

>> Yeah, LLMs can be very devious. You have to keep them honest, because they're so competent in so many ways, but they really just want to make progress. Doing it this way, where you have a test that fails and then has to pass, means the option of just setting a variable to true to make it pass would never work, because we had good tests to start out with.

So that's the red-green element, and it's the most important part. But what Codex is really good at is the last step, which is called refactor. Basically, it has written code that failed and now actually works, and you can have the AI take one more pass at the code and say, "Okay, now that you've built it and it's working, is there a way you could build it better?" This is really nice because you already have the passing tests. It will try to change things that make the code simpler, easier to read, or more performant, and then it'll make sure the tests still pass. If they don't, it can always fall back to what it already had. Real engineers do this exact process of test-driven development, and if you can have AI do the same thing, then when it says a feature is done, you can actually believe that it made the feature work.

>> Amazing. So this is how you really use the strength of Codex, which is staying on the rails, to graduate into potentially shipping a medium-sized feature as a PM.

>> Yes. And that's also why this is really important with Codex specifically, because as awesome as Codex is as a model, it's really slow. We saw some of the speed difference between GPT-5 (not the Codex model) and Sonnet 4.5 when we used Codex and Claude Code right next to each other, and GPT-5-Codex is pretty slow. So the more you can set it up for success early on, the less you have the pain of "it made this small mistake and now I have to restart it or ask it something else," because it is pretty slow. This makes it so you can give it a really good spec, open another terminal and another Codex, or go have lunch, and when you come back and it says it's done, you can trust that it actually is done.

So, just the final thing. When you're figuring out how to write these tests, it's very helpful to have some intuition about how your product works, and this is where you really do have to understand your product. You can ask the AI to write tests, and AI is very good at writing tests, but it's still helpful to understand the concept. Basically, you're saying: okay, in the happy path, if we have chicken Alfredo, it ends up returning the calories and the protein. So in the happy path we give it the ingredients, it puts them all together, and we get them back correctly. But what are the things that could go wrong? One is that you get an empty recipe: we try to do the extraction and get no recipe back, so what is the system going to do in that case? Let's say we have all the data but we're missing one piece. Let's say we get invalid data. For example, what you'd probably want to happen if we ran this and it didn't get fiber is for it to either say N/A or just not display it, but you wouldn't want it to halt the whole flow. If you think about that stuff up front and imagine you're a QA engineer, you're actually defining all that behavior so it can write the tests to make sure that what ends up happening is exactly what you want.

Yeah. So basically it's just helping you think through all those edge cases, and there can be a lot of things that, if you're just vibe coding, especially if you're newer to development, you won't think about. For example, we use Gemini to grab these videos and extract them: what if that just doesn't work? What if it doesn't get anything back, or the API fails? Then you probably want some kind of retry logic. There are a lot of these things that real production-grade products have that, if you're just vibe coding, you're probably missing. You're probably missing retries. You're probably missing a bunch of edge cases. Even if you think it's a really simple feature, and every PM, I'm sure, knows this in their core, they're usually a lot more complicated than they seem. So going through this process both builds your engineering skills and helps you build much higher quality products.
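To make the retry idea concrete, here's a small sketch of the kind of wrapper you might put around a flaky extraction call. The backoff values and the extractRecipe call it wraps are illustrative, not the app's actual code.

```typescript
// Hypothetical retry wrapper: retries a flaky async call (like the Gemini
// extraction) a few times with exponential backoff before giving up.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait 500ms, 1s, 2s... before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
    }
  }
  throw lastError;
}

// Usage: the extraction either succeeds within three attempts or surfaces a
// real error the UI can show, instead of silently hanging the flow.
// const recipe = await withRetry(() => extractRecipe(videoPath, caption));
```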

>> And don't we always feel bad when we hand over the PRD but didn't think of all these edge cases? I think this is the benefit of actually building some of it yourself: it helps you see those edge cases and deliver. When you deliver something like this to the engineers, it's so much closer for them to take to production. Even if they do need to make significant changes to the code or the code quality, this is giving them a much more detailed spec.

>> Yes. I think you had a tweet about this once, or maybe you quote-tweeted Pat the PM or something, but going through this as a product manager gives you both more appreciation for what engineers do and an actual understanding. I've always felt like I was a pretty technical PM who could work with engineers pretty successfully, and I have an engineering background, just not in computer science, so I felt like I understood this stuff. But there's a lot more to it, and if you understand it, you have more appreciation for what engineers are doing, and it just hands down makes you a much better PM, because you're understanding these technical concepts and, ultimately, we are building software. It might not be the most important thing, but having some intuition around how this stuff works is really helpful as a product manager. So go beyond this podcast, pause for a second, and build one of these yourself so you really develop that intuition. If you do it once, you'll remember it forever, versus if you've just watched or listened to us, you'll forget after a few weeks.

>> Yeah, 100%. And you can always fork an existing product and have the AI explain it to you; Codex is really good at that as well. Then try to make some small changes. You'll make some mistakes, and I've talked a lot about the frustrations you can hit coding with AI, and you will hit up against those. But as you do it, you just get better and better, until you're really becoming a legitimately technical person. Okay. So this final thing here: for this example feature, what we're adding builds on what we already had. The way it worked before, it would already grab the ingredients; now it'll use the USDA API to get calories, divide by servings, and then display them in the preview.
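The per-serving math itself is simple; here's a sketch of that step. The USDA FoodData Central endpoint and its response shape, as well as the helper names, are assumptions for illustration rather than what the feature actually calls.

```typescript
// Hypothetical macro calculation: look each ingredient up, sum calories,
// then divide by servings for the per-serving number shown in the preview.
type Ingredient = { name: string; grams: number };

async function caloriesPer100g(name: string): Promise<number | null> {
  // Assumed FoodData Central search endpoint; the response shape below is
  // simplified and not guaranteed to match the real API exactly.
  const url =
    "https://api.nal.usda.gov/fdc/v1/foods/search" +
    `?api_key=${process.env.USDA_API_KEY}&query=${encodeURIComponent(name)}`;
  const res = await fetch(url);
  if (!res.ok) return null; // missing data becomes null, not a crash
  const data = await res.json();
  const energy = data.foods?.[0]?.foodNutrients?.find(
    (n: { nutrientName: string; value: number }) => n.nutrientName === "Energy",
  );
  return energy?.value ?? null;
}

export async function caloriesPerServing(
  ingredients: Ingredient[],
  servings: number,
): Promise<number> {
  let total = 0;
  for (const item of ingredients) {
    const per100g = await caloriesPer100g(item.name);
    if (per100g !== null) total += (per100g * item.grams) / 100;
  }
  return Math.round(total / servings);
}
```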

What you end up with is, okay, two things. One is these design rules from what we were talking about before: somewhere in your app, you're defining what the design system for this tool is, like "here's the component library" and "here's how components are installed." All of this was written by AI, but it's helpful to have so that when you want to add a new component, you can say, "use these design rules as you add it," and it knows all the variables.

The other thing, and what we're going to do here: I really wanted to demo this feature live, but when I had Codex run it, it took 30 minutes, and I don't think we want to sit here for 30 minutes [laughter] and watch it. So I already built this feature, and what I'll show you is the implementation spec I gave it before we started.

So let's go ahead and... okay, I'm going to do something bad here: I'm going to use Claude because it's a bit faster. [laughter] What are the branches? Okay, again, it's funny: I've given it this same command before and it knew the macro calculator branch was the one I was talking about, but sometimes not. Yes, switch over to macro calculator. Sometimes Wispr Flow is kind of cool; I'm not exactly sure how it works, but it has some understanding of where you are on your computer. So what happened there was I said "switch over to macro calculator," but it knew we were in this file, so it inserted a similar-sounding file name because what I said didn't make sense to it. That was a mistake, but it's kind of interesting to see how that worked. Okay, apparently... just go ahead and switch to that branch.

Okay. So now we're in the right folder. The implementation spec I gave it: at some point we committed an implementation doc, so, can you try to find the implementation doc we had in this project? And this will be the last thing we look at.

>> Cool.

>> So now, putting all of this together, we have our design system and we have this testing philosophy. When I actually had Codex build this feature, I used a super in-depth spec like this one. If we look at it here, it has everything: the architecture, the data flow, the API we're going to use, all the different responses we could get, implementation details, and it really lays out the testing and exactly how each piece is expected to work. When I gave this to Codex, it was pretty awesome, because it would implement it, run the tests, the tests would fail, and it would run again. I gave it a URL to test with, so it could see exactly what it was supposed to get, and when it didn't get that, it would run it again. It took about 35 minutes, but I didn't have to touch it at all once it started. This is the dream of agentic coding, that you can give it a medium-sized feature and it can just build it. That has not been the case before, but this was the first time, building this feature, that I actually saw it. So, to round us out, let's see what this looks like.

Okay. So we'll give it a video again; maybe we can use the same one we used before. And the other thing I'll call out is that we had our design system with those badges for the different pieces of information we can get back, so we'll see if those were implemented correctly as well.

Ta-da. That was actually a relatively complicated feature, because it had to get these ingredients, run each one through the USDA API, get the data back, calculate the totals, divide by the number of servings, and then display them with that new component from our design system. And we see it worked: it was literally able to get all those macros and put them in here. So this is, you know...

>> Yeah, it's awesome. And that's the type of feature where, normally, if I were just coding it in a normal way, without trying to have this really good doc to start out with, you test it piece by piece and you're really in the loop. Whereas this way, you're very in the loop at the beginning, making sure that spec is good. And as your engineering skills improve, and mine are still improving as well, you get to the point where you can have it do these things much more powerfully. That's why the nature of most engineering work is really starting to change for engineers who have picked this stuff up: you can now spend a lot of time upfront on the architecture, the implementation details, and the documentation, and then, imagine, we had this feature running, but what if there was another feature I wanted to be building, and another after that? It's this new way of working where you do a lot of upfront thinking and planning and then have all these agents working in parallel. I think we're only starting to see the beginning of what that's really going to look like, but with agents like Codex, that reality is coming pretty fast.

>> Amazing. If we had to summarize, what are the top use cases of Codex for PMs?

>> Yeah. So I would say the top use cases of Codex for PMs: one is, if you only have access to ChatGPT through your company or something, then it's a very good tool for all the things we covered today, which is basically anything you would normally think of for an LLM. You can use it to get summaries of any document. You can use it to do analysis across a bunch of different documents. You can use it as a thought partner for anything you want to work through, to think through a problem before you actually present it to anyone; LLMs can be really good at challenging your thinking. And for any kind of work where you need to pull in information from elsewhere, you can use Codex to do that for you and keep it all in one integrated development environment. So really, documentation, anything related to documentation, is a really good use case for product managers. And for product managers who want to do prototyping, as long as you follow these steps and you're really willing to put in that work upfront, it can be one of the best ways to prototype that exists right now in 2025.

>> Amazing. And then there are all sorts of other use cases where Codex is just valuable for you as a person, I think: building presentations, automatically creating changelogs, working with audio files, extracting image files, changing the format of a video file, cleaning up messy invoices, clearing space on your computer. Basically, anything you want an LLM to do on your computer when it comes to working with files, where before you'd have to search Google and install some app, now you can just have Codex do it.

>> Yeah. And I'm so surprised, because I only really started using Claude Code, or anything in the CLI, recently. I had used it before our podcast, but once I realized we were going to be doing this, for the month before I really started using it for more and more stuff. And now I almost just live in the terminal, which is crazy. I never would have thought it, because I'm not an engineer, but it's so useful that most of the work I do on my computer I actually start in the terminal.

>> That could be you, too, guys. If you want to learn more, check out Carl at Carl the PM on Instagram; I love his meme page, it's literally one of the funniest things I wake up to every day. Check out his newsletter, The Full Stack PM, if you want to become a full stack PM, and follow him on LinkedIn or X. He's everywhere, and we'll probably have to have him back for another podcast as well, so let us know what we should have him back for. Carl, thank you so much.

>> Yeah, thank you so much. This was a lot of fun. I learned a lot and I hope everyone watching learned a lot, too.

>> Bye, [snorts] guys. So, if you want to learn more about how to shift to this way of working, check out our full conversation on Apple or Spotify podcasts. And if you want the actual documents we showed, the tools, frameworks, and public links, be sure to check out my newsletter post with all of the details. Finally, thank you so much for watching. It would really mean a lot if you could make sure you're subscribed on YouTube, following on Apple or Spotify podcasts, and leave us a review on those platforms. That really helps grow the podcast and supports our work so we can do bigger and better productions. I'll see you in the next one.
