The ONLY guide you'll need for GitHub Spec Kit

By Den Delimarsky

Summary

## Key takeaways

- **SpecKit: GitHub's New Spec-Driven Development Toolkit**: GitHub's SpecKit is an experimental open-source toolkit designed to streamline Spec-Driven Development, helping users build new projects and evolve existing software by defining clear specifications. [00:05]
- **Combat Vague Coding with Spec-Driven Development**: Spec-driven development helps developers avoid rabbit holes caused by imprecise 'vibe coding,' leading to more scalable and well-defined software solutions. [02:02]
- **Bootstrap Projects with SpecKit CLI or Templates**: SpecKit offers a CLI for scaffolding new projects or provides downloadable templates from releases, supporting various agents like Copilot and Cursor without mandatory installation. [02:40]
- **Define Project Principles with a 'Constitution' File**: The 'constitution' file in SpecKit establishes non-negotiable principles for a project, such as always requiring tests or specific framework versions, which can be encoded and used by the LLM. [07:54]
- **Separate 'What' and 'Why' from 'How' with Specs**: The SpecKit 'specify' command focuses on defining the 'what' and 'why' of a product, detached from technical implementation details, allowing flexibility to switch technologies later. [15:34]
- **Iterative Development with AI-Generated Tasks and Plans**: SpecKit breaks down development into manageable tasks and plans, utilizing AI to generate these artifacts based on the established specification and constitution, allowing for iterative refinement and rebuilding. [24:29]

Topics Covered

Stop "Vibe Coding": AI Builds Scalable Software
Define Non-Negotiables: The Project "Constitution" File
Experiment to Find the Best AI Model for Your Task
Decouple "What" from "How" for Agile AI Development
The Human Developer Still Edits AI Code

Full Transcript

Hey friends, I am Den Delimarsky. I'm

one of the maintainers for Spec Kit, the

new project from GitHub. It's that

experiment that you've been hearing all

about on the YouTubes and the TikToks

and the Instagrams and wherever else.

But seriously, look at the numbers. Look

at this.

16,300

stars. I was just celebrating

yesterday that we hit 15,000 and it's

16,300 already, a week after release.

This is wild to me. Absolutely wild.

Thank you. Thank you to all of you who

are trying this out, who are

experimenting, who are filing issues,

opening pull requests. It helps

tremendously. And boy, do we have a lot

of stuff in store for you around

Spec Kit. But actually, that's the point

of the video. We're going to be talking

about Spec Kit today. We're going to be

talking and actually looking at how this

works. And by the way, like I'm also one

of those people that looks at these

issues fairly regularly. So if you have

any feedback, go and submit. Go and

submit your feedback right here in

GitHub. Open an issue. Uh do not open a

pull request that rewrites the entire

thing as an MCP server. I know I like

MCP. I'm going to reject that request

because you haven't talked to anybody

about this. So if you have any big

changes, make sure you talk about this

first in an issue. And what are

the issues that I see sometimes

being opened, like "TL;DR"? What is this? I

mean, seriously. Do not open

issues that just say "TL;DR". Give me

some context for how we can

improve this. But anyway, Spec Kit: what

it is is a toolkit that helps you get

started with spec-driven development. At

Microsoft, we are very big on spec-driven

development lately. And one of the

things that we decided to look at is how

can we actually simplify the process of

you using spec-driven development to build

real software. It's one of those

important things that you don't really

think about until you actually try it.

One of the conversations that I had

recently was around vibe coding, right?

Like you hear all these people that talk

about vibe coding some SaaS app and

there's a lot of this imprecision

happening. You end up in these random

rabbit holes where the code is not quite

what you wanted. The implementation is

not quite what you wanted. The design is

not quite what you wanted. So spec-driven

development is something that can

actually help you get out of that rabbit

hole into a little bit more of a

scalable solution for your software. So

uh let's take a look here. It's again

it's a GitHub repo. It's all open

source. It's free. You can take a look

at stuff that we have here. And uh, of

course, one of the things that I did

recently is push the Specify CLI

reference. Spec Kit includes this

wonderful thing which is called the Specify

CLI. It actually scaffolds things for

you. So if you're a developer and you're

thinking like, wow, what do I need to

get started with spec-driven development?

The CLI is right there. So there's a

reference for it now as well. And uh I

use uvx and you can install it directly

from the GitHub repo because I have not

yet published this to um the Python

package repository. It's coming. I'm

working on it. I saw that people already

filed issues on this which is great. But

um we're going to use it directly from

the repo as is. So I'm just going to

copy this command with uvx. Shout out to

the folks at Astral. This is

fantastic. I use uv and uvx

all the freaking time. Also, if you

do not want to use a CLI, that's totally

fine. I understand you don't want to

install things. You don't want to

install uvx.

Shout out to Astral folks. Your product

is great. I love uvx. This is not

sponsored by them. But if you do not

want to do that, you can go to the

releases and just download the templates

yourself. We support several agents. And

just today, actually, I launched support

for Cursor. So, if you're one of those

people that uses Cursor, you're in luck.

You can use one of the releases here. So

we have a bunch of them. For

example, for Copilot, we have them for

PowerShell and shell scripts. There's a

bunch of them that you will see shortly

in our demo. Uh depending on what

operating system you're running in. So

if you're running on Linux, you might

use bash scripts. If you're running on

Windows, you might use PowerShell. No

judgment here. Use whatever fits your

scenario. But you can download these.

Let's say we grab the zip file for

the PowerShell scripts for Copilot

and put it into Downloads. We're

going to save it here. And let's go and

take a look at it. Let's open it.

And you'll notice that I have the

.specify folder that has some metadata

for Specify: things like memory, scripts,

and templates. And scripts are, of

course, PowerShell because that's what

we downloaded. These are helper scripts.

We have some prompts in the .github

folder that we'll get to in a second.

So, you can just grab this, put it in

your project. You do not need to use a

CLI. I promise you, no installation

required. But installation actually

makes it much easier. So I do recommend

using it. I just copied the command and

I'm just going to go to the terminal,

zoom it in, and we're going to paste it

here. And I'm going to bootstrap a new

project. Let's see what is a thing that

we want to build today. And let's say I

want to build a podcast website. I am

big at podcasting. If you have not

listened to The Work Item, check it out.

But let's say I do not have a podcast

website and I want to build one for

myself. So naturally what I want to do

is just bootstrap the spec-driven process

for the website. I'm going to call it

podsite, and we're going to press Enter

and wait for specify to launch. Now here

is where things get interesting. So we

have different options. I can choose

different agents for which I am

bootstrapping these custom commands,

custom prompts. Sometimes I use Copilot,

sometimes I use Claude Code. I'm not that

big on Gemini CLI and Cursor, but I know

that other people use them. So, they're

here. You can use them through the

specify CLI. And because it runs in a

terminal, you can always just, you know,

navigate with your arrows, use your

keyboard, be comfortable with your

keyboard, and pick the agent that you

want. I'm going to pick Copilot. Now,

I'm also picking the type of scripts

that I'm going to be using for my helper

system, right? Because when we're

running a bunch of these prompts, what

they're also going to do is they're

going to run a bunch of scripts that are

going to bootstrap things like Git

branches and make sure that the JSON

content that we use to kind of link

metadata is the same. And for that, the

easiest way to do this is with scripts

because it's deterministic. You don't

actually have to rely on the LLM, the

large language model to go in and figure

out how to piece things together. Things

just work. So it defaults smartly to the

operating system that you're running in.
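To make the "defaults smartly" idea concrete, here is a minimal sketch of picking a script flavor from the operating system. This is not the Specify CLI's actual code; the function name and the `"ps"` / `"sh"` labels are hypothetical, chosen only for illustration.

```python
import platform
from typing import Optional


def default_script_type(system_name: Optional[str] = None) -> str:
    """Pick a default helper-script flavor from the OS name.

    Hypothetical labels: "ps" for PowerShell, "sh" for POSIX shell scripts.
    When no name is passed, the current platform is detected.
    """
    name = (system_name or platform.system()).lower()
    # platform.system() reports "Windows" on native Windows.
    return "ps" if name.startswith("win") else "sh"


# The default is only a starting point; the user can still override it.
print(default_script_type("Windows"))
print(default_script_type("Linux"))
```

A default like this only preselects an option. On WSL 2 or in a Linux VM the detected system is Linux, so shell scripts would be preselected there.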

In this case, I have PowerShell selected

because I'm running this on a Windows

terminal. It's Windows PowerShell,

right? But if I so desire, I can switch

to shell scripts. If you're running WSL

2 or maybe you have an Ubuntu VM or a

Fedora VM, you can use that there as

well. And by the way, this just launched

today. It's hot off the press. We

actually now support PowerShell. You

don't need WSL 2 on Windows to run this.

You can just run it natively on Windows. I'm

going to select PowerShell, right?

Because that's that's what we want. We

want to run this on Windows as is.

PowerShell it is. And we'll see some

status changes. This is basically the

Specify CLI going and downloading a

bunch of stuff and extracting the

templates locally. As I mentioned

before, like this is just a convenience

layer. Like this is not something that

you have to do. You can just download

the zip yourself from the release for

the agent and the shell script type that

you want and extract it and put in your

project. Just as easy.
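The manual route described above amounts to unpacking an archive into your project folder. A minimal sketch of that step, assuming a template zip has already been downloaded from the releases page; the archive file name in the usage comment is hypothetical, and the function is an illustration rather than anything Spec Kit ships.

```python
import zipfile
from pathlib import Path


def extract_template(zip_path: Path, project_dir: Path) -> list[str]:
    """Unpack a downloaded template archive into the project folder.

    Returns the sorted list of archive entries so the caller can see
    what landed in the project (for example the memory, scripts, and
    templates folders and the prompt files).
    """
    project_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(project_dir)
        return sorted(archive.namelist())


# Usage (hypothetical file name, adjust to the release asset you downloaded):
# extract_template(Path.home() / "Downloads" / "template-copilot-ps.zip", Path("podsite"))
```

Only extract archives you downloaded yourself from the official releases, since `extractall` writes whatever paths the archive contains.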

So now you'll notice that there are some

instructions here for me. I can navigate

to the folder. I can open in Visual

Studio Code and use some slash commands.

There are /specify, /plan, and /tasks. These

are going to help me to actually

bootstrap my project. And I can update

the constitution file. The constitution

file is actually something that you

probably have not heard about before

because it's relatively new. The

idea of the constitution is that it

establishes a set of non-negotiable

principles for your project. So if you

have things like "I always have to have

tests" or "I always have to make sure that

I'm running a specific version of

Next.js," you encode that in the

constitution. That's where the stuff

goes. So this is where the constitution

comes in and it becomes super super

helpful for any of your projects. Uh now

we're going to jump back here to our

terminal and I'm going to launch VS Code

because I'm going to be iterating on

this project inside VS Code of course

and we'll notice that podsite is a subfolder here,

and this is because I opened the wrong

folder. This is how I know: notice

that I have podsite here, and the

.github folder is inside it. If I go into

agent mode in VS Code here and

start typing /specify,

oh, I can't find /specify, right? If

you do not see the command there, that

means you're in the wrong folder. You are

one level too high up from where you

need to be. To fix this, we're going to

close VS Code, go back to the terminal, and

run cd podsite.

That's why I put this instruction here

in green that you will see here in the

terminal. It says cd podsite. Go to

the folder to be able to use these

commands. So, we're going to go there,

right? And now we're going to launch

code from here.

Fantastic. Now, we're in VS Code. Now,

we see the .github folder and podsite is

at the root, which is fantastic. That's

exactly what I wanted. And now I see

that there is the .github folder with a

bunch of prompts. These are going to be

our slash commands. And slash commands

in VS Code are nothing other than custom

prompts. That's all it is. If

you look at any of them, it basically

outlines a set of instructions for the

LLM to follow to establish specific

conventions around what we're building.
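Since the slash commands are just prompt files, a short sketch can show why they disappear when you open the wrong folder. It assumes VS Code's convention of prompt files named `*.prompt.md` under `.github/prompts` in the workspace root; the function name is hypothetical.

```python
from pathlib import Path

PROMPT_SUFFIX = ".prompt.md"


def discover_slash_commands(workspace_root: Path) -> list[str]:
    """List slash commands implied by prompt files in a workspace.

    Looks only at <workspace_root>/.github/prompts, so opening the
    parent folder of the project yields no commands.
    """
    prompts_dir = workspace_root / ".github" / "prompts"
    if not prompts_dir.is_dir():
        return []
    return sorted(
        "/" + entry.name[: -len(PROMPT_SUFFIX)]
        for entry in prompts_dir.iterdir()
        if entry.is_file() and entry.name.endswith(PROMPT_SUFFIX)
    )
```

Called on the folder one level above podsite, this returns an empty list, which matches the missing /specify command from earlier.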

So in our case, it requires running a

specific script. We see some

instructions for things to do. Use

absolute paths. It's nice. Same for

specify. Same for tasks. And again,

we'll see the purpose of these commands

in just a second here. We also

have the .specify folder. Again, shout

out to the community members who

suggested this because before if you

had memory and a bunch of these

helper scripts and the templates, they

would just land directly in the

repository root, which is not super

helpful if you already have a project.

So, these simplify things a little bit.

So, you can just keep them inside this

.specify folder. And if you so desire,

you can even gitignore it, which I

don't know why you would do that, but

you can. So

now that we have these baseline pieces

here, let's go ahead and use our slash

commands. I'm going to maximize the chat

box here. I'm going to use GPT-5, but you

can also use different models. You can

actually experiment with different

models that exist in Copilot or any of

the agents that you're using to see what

output is better for the things that

you're building. The output will vary.

So, for example, I like GPT-5 and Claude

Sonnet 4 personally, but depending on

what you're building and the scope of

things that you're building you might

want to customize that. As I

mentioned, there's this constitution

document. If you go to memory,

there's the constitution, which right now is just a

blank template. What we can do is

establish this constitution with the

help of the LLM. So I'm just going to do

that uh and we're going to ask the LLM

to go in and fill this out for us. So

fill the constitution

with the bare minimum

requirements

for a static

web app

based on the template, because we want to

follow the template. The constitution,

if you also look at this, these

principles and examples, it's all about

making it easy for the LLM to go and

fill this out for you. This can

bootstrap it. It doesn't mean that you

cannot do this manually. It's more that

this bootstraps it for you and saves you

some time. So, we're going to press

enter here. Bare minimum requirements.

And we'll see what it comes up with. We're

going to use GPT-5. GPT-5 is good

for these kinds of things. If you use

things like Sonnet, I see it go off the

rails sometimes and it does a bunch of

stuff and it starts editing the

constitution file and then it starts

creating more files it's going to refer

to. That's not what we want to do here.

So, I'm just going to rely on the

constitution. It's fine. And it's

going to look at the existing content

and existing template and I hope that

it's going to fill this out in the way

that I want it to, because if I'm going to

be writing this out manually here, it's

going to take us half an hour of this

video and nobody wants to watch me type.

So we're going to give it another second

here. I've also noticed that GPT-5

sometimes can take a little bit longer

than Sonnet. Sonnet, in the

context of Copilot, is just going to start

iterating on this, and you

will see the changes go gradually into

the files. GPT-5 is a little bit more,

I want to say thoughtful where it's

going to start thinking and thinking and

thinking and then just spit out the

whole thing for you which again can be

nice can be not so nice depending on the

scenario you're trying to tackle. So,

uh, pick your battles on the models you

want to use and try them out. The only

way to find out what produces the best

results here is through experimentation.

Just like Spec Kit. Spec Kit, the

spec-driven development stuff that we're

shipping here is an experiment. We are

here to learn. I just want to remind you

that this is not a production scenario.

Like, there's a lot for us to learn. We

want your feedback. We want your input.

So if something breaks, if something

doesn't work, if something that it

produces is garbage, let us know. I want

to see that garbage. I want to

understand what worked and what did not

work because that will help us make it

better and by proxy improve the product

for everyone else that might be running

in similar situations as you. So it

looks like GPT-5 is still thinking. It's

going to update the constitution file by

replacing the placeholders

with concrete static site requirements.

So, we're going to wait for it to finish

this process

a few moments later.

All right, looks like it finished. So,

I'm just going to keep the changes here

and let's take a look at what it

actually said. Static-first

delivery: no server-side execution. The

site ships HTML, CSS, JS, and static

assets via CDN. That makes sense.

Simplicity over tooling: prefer vanilla

HTML, CSS, JS. I like that. I think that

makes sense. Accessibility and SEO

baseline also makes sense. Performance

budget, you know, I don't

necessarily care for these things as we

are prototyping right now. For

production scenarios, you absolutely

should care about performance and security.

For now, I just do not want to deal with

that. Um, also things like requirements

here, development workflow, quality

gates. Um, yeah, this makes sense, but I

think, again, for what we're

trying to do right now, this is a little

bit too much. So, we're

just going to remove these pieces. And I

think the three principles,

the three articles of our constitution,

make sense: static-first delivery,

simplicity over tooling, and

an accessibility and SEO baseline, which

is good, right? So now we have a

constitution. Now I can use the

specify command that I talked about to

define the baseline specification. So

let's do that. Let's use

/specify. And again, because I'm in the

right folder now, I see

/specify. And this is where I define, like

a true product manager, the what and the

why. We're not focusing on technical

requirements. We're not focusing on

saying "use Next.js and this database" and

so on and so forth. We don't care

about that at this point. This is all

about making sure that we are outlining

the motivation for the product and what

actually needs to be built. This is

helpful because for somebody reading

this, it is going to be completely

detached from the implementation. This

is important. The benefit of the spec

here is that it is completely detached

from the implementation. So if at some

point, you know, we're going to be

building this with Next.js, but at some

point you switch to Hugo or any other

static site generator, you use the same

spec. The spec is written, the

requirements are there. You just toss it

into an LLM and ask it to write you

three, four, five, six variants based on

the same spec. So, kind of neat: a side

effect of what this is about. So,

we're going to go back here and use

specify and this is where I'm going to

say what my requirements are. So, I am

building a modern podcast website

to

look sleek. We want to use the term

sleek. I think the youths use this

term these days, sleek. I want it to be

sleek, something that would stand

out

should have a landing page

with one featured episode.

There should be an episodes page,

an about page,

and,

let's see, an FAQ page.

Should have 20 episodes

and the data is mocked. You do not need

to pull anything from any real feed,

right? So, we're we're just establishing

the requirements. We're detaching

ourselves from the technical details.

We're not thinking about any technical

details. We're just saying like this is

what you're building. So, I think this

is good enough for me to start. Again,

for a real production scenario, the more

detailed the prompt, the better. For us,

because we're looking at this very

baseline context, we're just going

to use this and I'm going to use

specify.

And what's going to happen here is it's

actually going to go ahead and use our

prompt file, right? Like this is going

to use the instructions in

specify.prompt.md because that's the prompt file

that drives the slash command, right?

It's going to read the script that is

referenced there. It's very important.

It's going to read the spec template

that is going to be used as a baseline

because look, there's a templates

folder. And it's going to ask us if we

want it to run this PowerShell script. And

yeah, let's actually enable auto-approve.

YOLO mode to the rescue. And

because I'm running this inside a VM,

I'm not really worried about this. I'll

just say allow run this command. It's

going to run a PowerShell script, a

helper script. Great. It switched to a

new branch because, as I mentioned, this

is Git-based. So as you're running this

in your project, it's going to be using

these custom branches to help you

organize your work. That way you're not

damaging anything in production and you

can always roll back changes that you do

not like. Again, a side effect of the

spec-driven process is the fact that it forces

you to think about a lot of these things

where, as you're experimenting and as

you're iterating, you do not want to

interfere with the existing

implementation. Right? So the spec

establishes the baseline. You work

through it, you experiment, and then you

merge it into your main branch on an as

needed basis. So, looks like it's

reading some of the information from the

spec file that it helpfully created in

the specs folder. Uh, the file right now

is empty because it has not yet inserted

the template, but we'll see GPT-5 soon

plug in all the required information,

which is going to be nice.
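The helper script's job here (a new branch plus a matching spec file, reported back as JSON) is deterministic, which is why it is a script rather than LLM work. A minimal sketch of that idea follows. The naming scheme (zero-padded number plus the first few words of the prompt), the JSON keys, and both function names are assumptions for illustration, not the actual Spec Kit scripts.

```python
import json
import re


def feature_slug(description: str, number: int, max_words: int = 3) -> str:
    """Build a feature name such as 001-i-am-building from a prompt."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    return f"{number:03d}-" + "-".join(words[:max_words])


def new_feature_metadata(description: str, number: int) -> str:
    """Return JSON an agent could parse after the helper script runs.

    The keys are hypothetical; the point is that branch name and spec
    path are derived by code, not guessed by the model.
    """
    name = feature_slug(description, number)
    return json.dumps({"branch": name, "spec_file": f"specs/{name}/spec.md"})


print(new_feature_metadata("I am building a modern podcast website", 1))
```

Because the same input always produces the same names, the branch, the folder under specs, and the metadata stay linked without the model having to piece them together.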

Wonderful. We now have a specification.

I'm going to keep it because I just

trust it that much. No, really,

do not trust the LLM to do

production software for you. Just check

it. You have to verify it. Check the

software statement that is being made in

the spec. So, uh we're going to look at

the spec here. It created things for a

modern podcast website. Great. That's

that's what I asked it to do. Um let's

take a look here at quick guidelines

which is you know again it is the

guideline for the spec. We have to

respect them. The LLM has to respect

them. Focus on what users need and why.

Great. Um for AI generation all the

stuff that's again helpful baselines

that we need to maintain and respect. So

user scenarios and testing. All right.

There's some user stories, and if

you're a PM, this will sound mighty

familiar. We have some acceptance

scenarios. We have some edge cases,

right? Like it accounted for things like

"episode ordering not specified" and so on

so forth. There's also a bunch of

functional requirements that are also if

you're a PM, you know, functional

requirements, they go into the spec.

This is, you know, it's done for you.

Um, there's a bunch of things that are

here, but also notice that there's a

review and acceptance checklist. This is

key here. I cannot emphasize this

enough. If you are having a

specification, if you're writing a spec,

you got to make sure that you have an

acceptance checklist and you got to make

sure that the acceptance checklist is

actually filled out. So things like no

implementation details, right? Like

languages, frameworks, APIs, because that's

not what we're focusing on. We are

focusing on the what and the why, not on

the how. Uh I also have details about,

you know, nontechnical stakeholders,

mandatory sections completed. One thing

that is not checked is "no needs

clarification." And that is actually

something that is missing here right now,

because if you look, there are items

that need clarification, like "episode

list ordering should be reverse

chronological, newest first: confirm order

requirement." Now, because we're

prototyping and because we're

experimenting with this I can ask the

LLM to take a best guess. So we're just

going to do that. For things that need

clarification,

use the best guess you think is

reasonable.

Update

acceptance checklist

after.

And we're going to have the LLM basically

fill this out for us because I do not

want to think about this for a prototype.

I just want to kind of vibe code, I

guess, but it's not really vibe coding

because I am structuring this a little

bit better than just vibe coding. So

once the spec is established, mind you

this is very easy to share with your

team. So if somebody comes in and says

how did you build this website, look at

the spec, look at the rationale. You as

a human in the loop can go in and edit

this. People make the mistake of

thinking, "Oh, the LLM produced this. I

can only manage this with an LLM." No, it's

a markdown file. Go in with your hands

and start typing and entering

requirements that you feel are, you

know, required for your product. So if

you feel strongly that the landing page

should have a logo centered at the very

very middle and some gradient that looks

like a rainbow, you absolutely can.

Just go and add another functional

requirement. You can do this

manually. You don't need to ask the LLM

to do this. Again, this is

especially important if you're working

in, you know, enterprise environments or

environments where it's more controlled

where you need to actually add specific

requirements. Sometimes the LLM cannot

guess for you, which is fine. You know,

manual work is still there and we as

developers need to go in and do that

from time to time. So, we're just going

to wait for GPT-5 here to think a little

bit and

fill out our clarification items.

We now have the updated checklist,

allegedly. So let's scroll back here. So

all right, looks like we're good. No

"needs clarification" anywhere in the

code, which is our spec. It's not

actually code. It's a markdown file.
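Because the spec is plain Markdown, checking for leftover clarification items does not need an LLM at all. Here is a minimal sketch that scans spec text for open markers. The exact marker syntax, `[NEEDS CLARIFICATION: ...]`, is an assumption about how the template writes them, and the function name is hypothetical.

```python
import re

# Assumed marker shape: [NEEDS CLARIFICATION: some question]
MARKER = re.compile(r"\[NEEDS CLARIFICATION:?\s*([^\]]*)\]")


def open_clarifications(spec_text: str) -> list[str]:
    """Return the text of every unresolved clarification marker."""
    return [match.group(1).strip() for match in MARKER.finditer(spec_text)]


spec = (
    "- FR-004: Episodes page lists all episodes "
    "[NEEDS CLARIFICATION: confirm order requirement]\n"
    "- FR-005: Landing page shows one featured episode\n"
)
for question in open_clarifications(spec):
    print("Unresolved:", question)
```

An empty result is what the "no needs clarification" checklist item is asking for. It is still worth reading the spec by hand, since a best guess by the model can remove a marker without making the right decision.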

We're going to look here. It looks

great. I think we're ready to go to the

next step. The next step is we're going

to be using the slash plan command. And

this is where we actually specify the

technical requirements. So I'm going to

use

Next.js

with static site

configuration.

No databases

and want to make sure that the site is

responsive

and ready for mobile, because it's 2025 and we

still have mobile phones. We need them.

So this is good. It gives us a baseline

for what to think about. So we're going

to use again the plan prompt. It's going

to run some helper scripts as well. So

we do need access to the terminal here

in VS Code or whatever agent you're

using. If you're using Claude Code, you can

just use Claude Code, because

Claude Code is very good about running

things directly in your terminal. So

it's also going to bootstrap some

additional content here that we'll see

shortly in the repository, things like

the plan and the contracts. And notice, most

importantly, that it does consult the

constitution. There is a "then read the

spec, constitution, and plan template" step. So the

constitution is in play. The set

of non-negotiable things that we talked

about earlier is still in play. Very

important. So, we're going to run the

script. We're going to allow it. YOLO.

Not quite yolo because I still had to

approve it for some reason. But, um, we

see the JSON here, which is the output

of the script because if it's JSON, it's

very easy for the LLM to parse and

understand it. Notice that it read the

constitution because these are

non-negotiable principles. We got to

respect them no matter what you're

doing. So you have this plan step that's

going to read it and then it's going to

go ahead and fill out the actual plan

and a bunch of additional metadata

around it that's going to help us

establish a good project baseline.
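One way to picture the constitution as a gate on the plan is a small checker that compares plan choices against the principles. In Spec Kit this check is done by the LLM reading Markdown, so the sketch below is only an analogy. The `Plan` fields, the dependency threshold, and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A few plan choices, reduced to fields for illustration."""
    server_side_execution: bool
    dependencies: tuple[str, ...]


def constitution_violations(plan: Plan, max_dependencies: int = 5) -> list[str]:
    """Return the principles a plan breaks; an empty list means it passes."""
    violations = []
    if plan.server_side_execution:
        violations.append("Static-first delivery: no server-side execution")
    if len(plan.dependencies) > max_dependencies:
        violations.append("Simplicity over tooling: too many dependencies")
    return violations


static_plan = Plan(server_side_execution=False, dependencies=("next",))
print(constitution_violations(static_plan))
```

The design point is that the principles are checked on every plan, no matter which feature or model produced it.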

Notice that all the stuff also

lives in a dedicated folder: 001-i-am-building,

because it just picks up from

your prompt and generates a name for

the feature. All the stuff is

grouped here. So later on if you decide

to rebuild the entire feature like you

see what it built and you're like, "You

know what, Sonnet 5, or sorry, Sonnet 4 did not

quite do the thing that I expected it to

do. I want to switch to GPT-5, or I want

to switch to maybe GPT-4.1,

or try out how this works with maybe

Gemini Flash." You essentially have the

spec. You have all your artifacts. You

can just delete the source. The spec is

still there and then use a different

model. So you use the model switcher

here in the chat and just reimplement

it. That's it. You can just

rebuild it. You can add additional

requirements. If you see that the logo

is generated the wrong way, or

the layout of the page is kind of funky.

There's no header and footer. Go and

edit the spec and just rebuild the

source. Once the source is created, of

course, you can iterate on it

differently. Like you can add another

spec for another feature, for another

component. The one spec that we

have here, in what we refer to as a

greenfield project because we're

creating a new thing,

is just the bootstrap of the project. You

can do this exact same thing for any

other features. If you want to add

support for a Spotify player for example

in your podcast page, just use /specify to

essentially create a new feature, right?

That's kind of it. You use /plan to

go and create another set of technical

details for that feature because that's

what it's going to do. It's going to

create new subfolders that you can then

iterate on. So here we're going to wait

a little bit for GPT-5 to go and think

through. But notice that it actually

started thinking about things like

contracts because it's going to look at

data contracts that exist within the

pages. But uh I'm not going to disturb

it. I'm going to let it think and do its

job. And we're going to get back here

once we see the actual outcome of the

process.

No. God,

no. God, please. No. No. No.

No.

All right. We now have the plan. The

plan is good. We can take a look here.

We see plan. I'm going to keep all of

the changes. It created a bunch of other

metadata here. So we can scroll through

the plan. We can minimize the terminal

just a little bit. We see that it has an

execution flow for the plan. Great. Uh

technical context, language version,

JavaScript, TypeScript, and Node.js.

Great. Primary dependencies: Next.js,

static export, SSG, right? Because we

asked for a static site. It's great. It

thinks about that. Testing: Lighthouse

and some other libraries. Target

platform: static hosting over CDN, Vercel,

GitHub Pages, Azure Static Web Apps.

That is right. It actually made

some good assumptions. I did not specify

this. Uh keep in mind that if I wanted

for example to host things on say

Cloudflare or Azure, I can specify that

in the constitution. I can say that

everything that you do has to be

oriented for Cloudflare or Azure or

any other provider, whatever

you want, or any other technical

requirements, by the way. So we have

the project type, some constitution

checks, and there's a gate: again, it

must pass the constitution check. It

must pass the requirements that we have

established for things like: use

framework, single data model, avoiding

patterns. Yep. Architecture: static first.

Dependencies: minimal. Yep, makes sense,

makes sense. There's some outline for

outline and research. All right, some

research artifacts created. And by the

way, Copilot in this case did not do

actual research. It used its training

data to come up with this research.md

file. Other agents, like Copilot, sorry,

like Claude Code, can actually go and do

research, so you'd get the

freshest information from the internet.

Um but it actually made some pretty good

assumptions here. So we have the

context, we have some of the outline on

the phases. Uh and now look, we're

missing the task generated. We're going

to run the task command now. Uh but

before we do that, let's take a look

here. So there is a research file like I

mentioned. So it looked through its

training data because all this is

generated by the LLM basically, right?

Like it's all whatever is embedded in

the training data. That's what it used

for its inspiration instead of actually

going out into the web. Uh we can use

something like the Beast Mode from our

good friend Burke Holland uh to force it

to do certain things, but I just use

standard agent mode here because I did

not use custom modes now, which is

fine for our prototype. It's okay. So we

have this. We have some data model

outline for you know fields for an

episode which is again great. It has the

context. We're building a podcast

website some details about the site some

validation rules derived views. Um we

still have our spec. We have our quick

start that gives some idea of what the

site is about and the prerequisites that

are required for the site to actually

run. Again super super nice to have this

in one place. the spec and all the

artifacts become that piece of

executable context that you can pass to

your team and have them work on it. But

uh, we have the plan. Now, as the plan

says, we got to jump to our tasks. So we

can use tasks and say break this down.

Right? So once again, it's going to run

some helper scripts that are going to

guide it through and it's going to break

the work down into manageable chunks

that the agent can tackle one by one

because that is super nice. That is kind

of the ability of seeing exactly what it

needs to do instead of it assuming that

it needs to partake in certain actions.

Right? So if I want to build the

test-driven-development-based approach

where I want to have tests first, all

that stuff, you know, can be broken down

into individual tasks. It's going to do

that first. It's going to make sure the

tests are passing and then jump to

implementation of the data model and so

on so forth. So uh we're going to uh

have it do some more work here.
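
The test-first ordering described here can be sketched in a few lines of Python. This is only an illustration under assumed names: the `Task` type, the `PHASE_ORDER` list, and the task IDs are hypothetical, not what Spec Kit generates.

```python
from dataclasses import dataclass

# Hypothetical phase order for a test-first workflow: tests come before implementation.
PHASE_ORDER = ["setup", "tests", "implementation", "polish"]


@dataclass(frozen=True)
class Task:
    task_id: str
    phase: str
    description: str


def order_tasks(tasks: list[Task]) -> list[Task]:
    """Sort tasks by phase; sorted() is stable, so order within a phase is kept."""
    return sorted(tasks, key=lambda task: PHASE_ORDER.index(task.phase))


tasks = [
    Task("T003", "implementation", "Build the episode data model"),
    Task("T001", "setup", "Initialize the app skeleton"),
    Task("T004", "polish", "Write documentation"),
    Task("T002", "tests", "Write failing tests for the episode list"),
]

for task in order_tasks(tasks):
    print(task.task_id, task.phase, task.description)
```

Because the tests phase sits ahead of implementation in the list, the agent always reaches the test tasks first, whatever order the tasks were written in.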

Going to check task prerequisites. All

right. So it's going to use a task

template.

Going to set things up for us. And we're

going to get another tasks.md

file, another markdown file in our

folder that we can use to actually

browse through the tasks, right? Now,

because it's all Markdown, it's all

available to you as a developer. You can

use the LLM here in this chat view and

just guide it through all the things

that you need. Or you can just go into

the markdown and start tinkering with it

because it's so easily editable. It

doesn't need a proprietary editor of any

kind. You can just open this, you

know, anywhere else where you've

integrated Copilot. You know, it doesn't

actually need to be VS Code. I just

happen to use VS Code, right? Um, and again,

Spec Kit and Specify are compatible with

many other agents, and more agents are

coming. Like, I'm working right now on

OpenAI Codex. Uh, Qwen, uh, we're looking at

adding that as well. Uh, opencode, Roo

Code, like, all those wonderful, wonderful

projects from the community are going to

be coming in. So, uh, again, I'm going to

let chat, uh, or GPT-5 in this case,

think a little bit and create our task

list.

Look at that. We have our task file. So,

if I go here again, we're going to keep

all the changes as is. And we see a

bunch of t- Let's actually reduce

the size here, right? So, you can

actually see a little bit. We're going

to reduce the size of this as well. Um,

and we see the sections here,

right? Like, it just chunked this into

individual phases. We have setup:

initialize Next.js app skeleton. Yep.

Yep. That sounds good. Tests first, must

fail before 3.3. All right. Yeah, because

we do want to use test-driven

development here. Uh, okay, that sounds

good. Core implementation only after

tests are failing, right? Because we're

setting up the test first. Makes sense.

About page, so on so forth. Integration

refinement, run Lighthouse. And this is

where I can, you know, you can you can

tweak, you can remove things you don't

like. You know, do you want to run

Lighthouse on a prototype? Maybe, maybe

not. Uh, and of course, polish,

responsive images, documentation, all

good. Accessibility polish. Sounds good.

I think we're ready. I think we're good

to implement this. So, I'm just going to

go ahead and switch my model to Claude

Sonnet 4 because I like this the most

for code. GPT-5 is good at setting up the

the kind of the spec scaffolding for us,

but for creative output, Sonnet 4 is

still unbeatable to me. So, I'm just

going to say this and say implement

the tasks for this project

and update the task list as you go.

And

now we're going to let the agent run

wild and go and implement our website.
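
To make "update the task list as you go" concrete, here is a minimal Python sketch of ticking off items in a Markdown task file. It assumes GitHub-style checkbox items such as `- [ ] T001 ...`; that line format, the sample tasks, and the `mark_done` and `pending` helpers are assumptions for illustration, not Spec Kit's own code.

```python
import re

# Assumed format: GitHub-style checkbox items such as "- [ ] T001 Do something".
TASK_LINE = re.compile(r"^- \[(?P<done>[ xX])\] (?P<task_id>T\d+)\b")


def mark_done(markdown: str, task_id: str) -> str:
    """Return the Markdown with the checkbox for task_id ticked."""
    lines = []
    for line in markdown.splitlines():
        match = TASK_LINE.match(line)
        if match and match.group("task_id") == task_id:
            # "- [ ]" is five characters; swap it for a ticked box.
            line = "- [x]" + line[5:]
        lines.append(line)
    return "\n".join(lines)


def pending(markdown: str) -> list[str]:
    """List the IDs of tasks whose checkbox is still empty."""
    ids = []
    for line in markdown.splitlines():
        match = TASK_LINE.match(line)
        if match and match.group("done") == " ":
            ids.append(match.group("task_id"))
    return ids


sample = "\n".join([
    "## Setup",
    "- [ ] T001 Initialize the app skeleton",
    "## Tests first",
    "- [ ] T002 Write failing tests for the episode list",
])

updated = mark_done(sample, "T001")
print(updated)
print("Still pending:", pending(updated))
```

Since the file is plain Markdown, the same edit can be made by the agent, by a script like this, or by hand in any editor.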

Tada. Looks like it did it. It It

finished the work. It did the things

that it was supposed to do. Now, I have

not seen the output. This is going to be

a surprise. So, let's keep all the

changes. Let's make sure that we keep

it. It updated all the tasks. It failed

on some Lighthouse test, but that is

because I don't have Chrome installed.

But we can actually run npm,

uh, I like to run npm run build.

Let's build our static site,

and then we're going to run npm

run dev to actually see it in action once

it actually builds. So we'll see we'll

see what it looks like.

All right.

Collecting build traces being the most

time-consuming thing, apparently.

All right. And then npm run

dev.

Okay. Localhost 3000.

See, let's see what our podcast site

looks like. It's loading.

This can be horribly bad. But actually,

like, look, it's not bad. It, you

know, "Master the art of podcasting."

Featured episode, looking good. Has all

the details. Pod Site. I have my about

page. If I go here, there's a nice

description. All right. FAQ.

Okay. Yeah, there's there's an FAQ.

All right. Great. Episodes. Let's take a

look here. We have individual podcast

episodes with links to the podcast

platform. And keep in mind that in this

context, what I can also do is if I plug

in MCP tools like Figma MCP, I can link

to actual design. So I can actually get

it to build things that fit the design

system of my organization. So I don't

have to randomly assume that it's going

to build the right thing. So this is

kind of nice. Like it did create all

this stuff. Now you might actually ask

yourselves like, well, is is this really

better than vibe coding? Like I could

have vibe coded this entire pod site,

right? But the thing is that now that

I have the spec, now that I have the

artifacts here in my spec

implementation, I can easily customize

it. I can now tweak it. I can add things

like I want to make sure that the color

is a certain way. I want to make sure that

a specific design decision is being

made. And that makes it easier for me to

then reimplement and rebuild things and

additively add features in a structured

way because that then can be used as

context. So something that my colleague

and friend John Lam is working on is

optimizing some of these context

acquisition strategies. So that's going

to change in the future. But you can

imagine that as I have more of this

context in the memory of the system that

is being built, it actually makes it

easier to build consistent software. So

uh hopefully this is a great intro for

you to see, like, how Spec Kit works, what

are the artifacts and how it produces

them and how you can then tweak them. I

really, really hope that you go to Spec

Kit. You try it out. You get it running

on your box. You see what works and what

doesn't, especially as you start

compounding many many different features

and bug fixes and all these things for

your projects. Let me know. Go to

github.com/github/spec-kit.

Download it. Use it. Tell me what's

wrong. And then I will see you in the

next video where we're going to talk

about more complex things that are also

being accomplished with the help of

spec-driven development. I hope you

enjoy this. I'll see you in the next
