The ONLY guide you'll need for GitHub Spec Kit
By Den Delimarsky
Summary
## Key takeaways
- **SpecKit: GitHub's New Spec-Driven Development Toolkit**: GitHub's SpecKit is an experimental open-source toolkit designed to streamline Spec-Driven Development, helping users build new projects and evolve existing software by defining clear specifications. [00:05]
- **Combat Vague Coding with Spec-Driven Development**: Spec-driven development helps developers avoid rabbit holes caused by imprecise 'vibe coding,' leading to more scalable and well-defined software solutions. [02:02]
- **Bootstrap Projects with SpecKit CLI or Templates**: SpecKit offers a CLI for scaffolding new projects or provides downloadable templates from releases, supporting various agents like Copilot and Cursor without mandatory installation. [02:40]
- **Define Project Principles with a 'Constitution' File**: The 'constitution' file in SpecKit establishes non-negotiable principles for a project, such as always requiring tests or specific framework versions, which can be encoded and used by the LLM. [07:54]
- **Separate 'What' and 'Why' from 'How' with Specs**: The SpecKit 'specify' command focuses on defining the 'what' and 'why' of a product, detached from technical implementation details, allowing flexibility to switch technologies later. [15:34]
- **Iterative Development with AI-Generated Tasks and Plans**: SpecKit breaks down development into manageable tasks and plans, utilizing AI to generate these artifacts based on the established specification and constitution, allowing for iterative refinement and rebuilding. [24:29]
Topics Covered
- Stop "Vibe Coding": AI Builds Scalable Software
- Define Non-Negotiables: The Project "Constitution" File
- Experiment to Find the Best AI Model for Your Task
- Decouple "What" from "How" for Agile AI Development
- The Human Developer Still Edits AI Code
Full Transcript
Hey friends, I am Den Delimarsky. I'm
one of the maintainers for Spec Kit, the
new project from GitHub. It's that
experiment that you've been hearing all
about on the YouTubes and the TikToks
and the Instagrams and wherever else.
But seriously, look at the numbers. Look
at this.
16.3 thousand
stars. I was just celebrating
yesterday that we hit 15,000 and it's
16.3 thousand already a week after release.
This is wild to me. Absolutely wild.
Thank you. Thank you to all of you who
are trying this out, who are
experimenting, who are filing issues,
opening pull requests. It helps
tremendously. And boy, do we have a lot
of stuff in store for you around
Spec Kit. But actually, that's the point
of the video. We're going to be talking
about Spec Kit today. We're going to be
talking about and actually looking at how this
works. And by the way, I'm also one
of those people that looks at these
issues fairly regularly. So if you have
any feedback, go and submit. Go and
submit your feedback right here in
GitHub. Open an issue. Uh do not open a
pull request that rewrites the entire
thing as an MCP server. I know I like
MCP. I'm going to reject that request
because you haven't talked to anybody
about this. So if you have any big
changes, make sure you talk about this
first in the issue. And then there are
the issues that I see sometimes
being opened that just say "TL;DR". What is this? I
mean, seriously. Do not open
issues that just say "TL;DR". Give
me some context for how we can
improve this. But anyway, Spec Kit
is a toolkit that helps you get
started with spec-driven development. At
Microsoft, we are very big on spec-driven
development lately. And one of the
things that we decided to look at is how
we can actually simplify the process of
using spec-driven development to build
real software. It's one of those
important things that you don't really
think about until you actually try it.
One of the conversations that I had
recently was around vibe coding, right?
Like you hear all these people that talk
about vibe coding some SaaS app and
there's a lot of this imprecision
happening. You end up in these random
rabbit holes where the code is not quite
what you wanted. The implementation is
not quite what you wanted. The design is
not quite what you wanted. So spec-driven
development is something that can
actually help you get out of that rabbit
hole into a little bit more of a
scalable solution for your software. So
uh let's take a look here. It's again
it's a GitHub repo. It's all open
source. It's free. You can take a look
at stuff that we have here. And uh, of
course, one of the things that I did
recently is push the Specify CLI
reference. Spec Kit includes this
wonderful thing which is called the Specify
CLI. It actually scaffolds things for
you. So if you're a developer and you're
thinking like, wow, what do I need to
get started with spec-driven development?
The CLI is right there. And there's a
reference for it now as well. And uh I
use uvx and you can install it directly
from the GitHub repo because I have not
yet published this to um the Python
package repository. It's coming. I'm
working on it. I saw that people already
filed issues on this which is great. But
um we're going to use it directly from
the repo as is. So I'm just going to
copy this command with uvx. Shout out to
the folks at Astral. Uh this is
fantastic. I use uv and uvx
all the freaking time. Um, also if you
do not want to use a CLI, that's totally
fine. I understand you don't want to
install things. You don't want to
install uvx.
Shout out to Astral folks. Your product
is great. I love uvx. Um, this is not
sponsored by them. But if you do not
want to do that, you can go to the
releases and just download the templates
yourself. We support several agents. And
just today, actually, I launched support
for Cursor. So, if you're one of those
people that uses Cursor, you're in luck.
You can use one of the releases here. So
we have a bunch of them. Uh, for
example, for Copilot, we have them for
PowerShell and shell scripts. There's a
bunch of them that you will see shortly
in our demo. Uh depending on what
operating system you're running in. So
if you're running on Linux, you might
use bash scripts. If you're running on
Windows, you might use PowerShell. No
judgment here. Use whatever fits your
scenario. But you can download these.
Let's say we grab the zip file for
the PowerShell scripts for Copilot
and put it into Downloads. And we're
going to save it here. And let's go and
take a look at it. Let's open it.
And you'll notice that I have the
.specify folder that has some metadata
for Specify: things like memory, scripts,
and templates. And scripts are, of
course, PowerShell because that's what
we downloaded. These are helper scripts.
We have some prompts in the .github
folder that we'll get to in a second.
So, you can just grab this, put it in
your project. You do not need to use a
CLI. I promise you, no installation
required. But installation actually
makes it much easier. So I do recommend
using it. I just copied the command and
I'm just going to go to the terminal,
zoom it in, and we're going to paste it
here. And I'm going to bootstrap a new
project. Let's see what is a thing that
we want to build today. And let's say I
want to build a podcast website. I am
big on podcasting. If you have not
listened to The Work Item, check it out.
But let's say I do not have a podcast
website and I want to build one for
myself. So naturally what I want to do
is just bootstrap the spec-driven process
for the website. I'm going to call it
podsite and we're going to press enter
and wait for Specify to launch. Now here
is where things get interesting. So we
have different options. I can choose
different agents for which I am
bootstrapping these custom commands,
custom prompts. Sometimes I use Copilot,
sometimes I use Claude Code. I'm not that
big on Gemini CLI and Cursor, but I know
that other people use them. So, they're
here. You can use them through the
specify CLI. And because it runs in a
terminal, you can always just, you know,
navigate with your arrows, use your
keyboard, be comfortable with your
keyboard, and pick the agent that you
want. I'm going to pick Copilot. Now,
I'm also picking the type of scripts
that I'm going to be using for my helper
system, right? Because when we're
running a bunch of these prompts, what
they're also going to do is they're
going to run a bunch of scripts that are
going to bootstrap things like Git
branches and make sure that the JSON
content that we use to kind of link
metadata is consistent. And for that, the
easiest way to do this is with scripts
because it's deterministic. You don't
actually have to rely on the LLM, the
large language model to go in and figure
out how to piece things together. Things
just work. So it defaults smartly to the
operating system that you're running in.
In this case, I have PowerShell selected
because I'm running this on a Windows
terminal. It's Windows PowerShell,
right? But if I so desire, I can switch
to shell scripts. If you're running WSL
2 or maybe you have an Ubuntu VM or a
Fedora VM, you can use that there as
well. And by the way, this just launched
today. It's hot off the press. We
actually now support PowerShell. You
don't need WSL 2 on Windows to run this.
You can just run it natively on Windows. I'm
going to select PowerShell, right?
Because that's that's what we want. We
want to run this on Windows as is.
PowerShell it is. And we'll see some
status changes. This is basically the
Specify CLI going and downloading a
bunch of stuff and extracting the
templates locally. As I mentioned
before, like this is just a convenience
layer. Like this is not something that
you have to do. You can just download
the zip yourself from the release for
the agent and the shell script type that
you want and extract it and put in your
project. Just as easy.
So now you'll notice that there are some
instructions here for me. I can navigate
to the folder. I can open it in Visual
Studio Code and use some slash commands.
There are /specify, /plan, and /tasks. These
are going to help me to actually
bootstrap my project. And I can update
the constitution file. The constitution
file is actually something that you
probably have not heard about before
because it's relatively new. The
idea of the constitution is that it
establishes a set of non-negotiable
principles for your project. So if you
have things like "I always have to have
tests" or "I always have to make sure that
I'm running a specific version of
Next.js," you encode that in the
constitution. That's where the stuff
goes. So this is where the constitution
comes in and it becomes super super
helpful for any of your projects. Uh now
we're going to jump back here to our
terminal and I'm going to launch VS Code
because I'm going to be iterating on
this project inside VS Code, of course.
And you'll notice podsite shows up here as a subfolder.
This is because I opened the wrong
folder, and this is how I know: notice
that I have podsite here, and the
.github folder is nested under it. If I go into
agent mode in VS Code here and
start typing /specify
(oh, I can't type /specify, right?),
I will not see the command there. That
means you're in the wrong folder. You are
one level too high up from where you
need to be to do this. We're going to
close VS Code, go back to the terminal, and
run cd podsite.
That's why I put this instruction here
in green that you will see here in the
terminal. It says cd podsite. Go to
the folder to be able to use these
commands. So, we're going to go there,
right? And now we're going to launch
code from here.
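
The "wrong folder" symptom described above can also be checked mechanically. The following is a minimal sketch, not part of Spec Kit: it assumes the prompts live in `.github/prompts` and the Spec Kit metadata lives in `.specify`, and the function names are hypothetical.

```python
from pathlib import Path
from typing import List, Optional

# Assumed layout (adjust to whatever your template actually contains):
# prompts in .github/prompts, Spec Kit metadata in .specify.
EXPECTED_DIRS = (".github/prompts", ".specify")


def missing_spec_kit_dirs(root: Path) -> List[str]:
    """Return the expected directories that are absent under root."""
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


def find_project_root(start: Path) -> Optional[Path]:
    """Check start itself, then its immediate subfolders, for the full layout."""
    candidates = [start] + sorted(p for p in start.iterdir() if p.is_dir())
    for candidate in candidates:
        if not missing_spec_kit_dirs(candidate):
            return candidate
    return None


if __name__ == "__main__":
    here = Path.cwd()
    root = find_project_root(here)
    if root is None:
        print("No Spec Kit layout found here or one level down.")
    elif root == here:
        print("This is the project root; the slash commands should show up.")
    else:
        print(f"You are one level too high. Open this folder instead: {root}")
```

Running it from the parent directory would point at the subfolder to open; running it from the project root would confirm you are in the right place.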
Fantastic. Now, we're in VS Code. Now,
we see the .github folder and podsite is
at the root, which is fantastic. That's
exactly what I wanted. And now I see
that there is the .github folder with a
bunch of prompts. These are going to be
our slash commands. And slash commands
in VS Code are nothing other than custom
prompts. That's all they are. If
you look at any of them, it basically
outlines a set of instructions for the
LLM to follow to establish specific
conventions around what we're building.
So in our case, it requires running a
specific script. We see some
instructions for things to do. Use
absolute paths. It's nice. Same for
specify. Same for tasks. And again,
we'll see the purpose of these commands
in just a second here. We also
have the .specify folder. Again, shout
out to the community members who
suggested this because before if you
have like memory and a bunch of these
helper scripts and the templates, they
would just land directly in the
repository root, which is not super
helpful if you already have a project.
So, these simplify things a little bit.
So, you can just keep them inside this
.specify folder. And if you so desire,
you can even gitignore it, which I
don't know why you would do that, but
you can. So
now that we have these baseline pieces
here, let's go ahead and use our slash
commands. I'm going to maximize the chat
box here. I'm going to use GPT-5, but you
can also use different models. You can
actually experiment with different
models that exist in Copilot or any of
the agents that you're using to see what
output is better for the things that
you're building. The output will vary.
So, for example, I like GPT-5 and Claude
Sonnet 4 personally, but depending on
what you're building and the scope of
things that you're building you might
want to customize that and as I
mentioned there's this constitution
document. So if you go to memory,
there's the constitution, which right now is just a
blank template. But what we can do is
establish this constitution with the
help of the LLM. So I'm just going to do
that uh and we're going to ask the LLM
to go in and fill this out for us. So
fill the constitution
with the bare minimum
requirements
for a static
web app
based on a template because we want to
follow the template. the Constitution.
If you also look at this, these
principles and examples, it's all about
making it easy for the LLM to go and
fill this out for you. This can
bootstrap it. It doesn't mean that you
cannot do this manually. It's more of
this bootstraps it for you and saves you
some time. So, we're going to press
enter here. Bare minimum requirements.
And we'll see what it comes up with. We're
going to use GPT-5. GPT-5 is good
for these kinds of things. If you use
things like Sonnet, I see it go off the
rails sometimes and it does a bunch of
stuff and it starts editing the
constitution file and then it starts
creating more files it's going to refer
to. That's not what we want to do here.
So, I'm just going to rely on the
constitution. It's it's fine. And it's
going to look at the existing content
and existing template and I hope that
it's going to fill this out in the way
that I want to because if I'm going to
be writing this out manually here, it's
going to take us half an hour of this
video and nobody wants to watch me type.
So we're going to give it another second
here. Um, I've also noticed that GPT-5
sometimes can take a little bit longer
than Sonnet. Sonnet, in the
context of Copilot, is just going to start
iterating on this, and you
will see the changes go gradually into
the files. GPT-5 is a little bit more, uh,
I want to say thoughtful where it's
going to start thinking and thinking and
thinking and then just spit out the
whole thing for you which again can be
nice can be not so nice depending on the
scenario you're trying to tackle. So,
uh, pick your battles on the models you
want to use and try them out. The only
way to find out what produces the best
results here is through experimentation.
Just like Spec Kit. Spec Kit, the
spec-driven development stuff that we're
shipping here is an experiment. We are
here to learn. I just want to remind you
that this is not a production scenario.
Like, there's a lot for us to learn. We
want your feedback. We want your input.
So if something breaks, if something
doesn't work, if something that it
produces is garbage, let us know. I want
to see that garbage. I want to
understand what worked and what did not
work because that will help us make it
better and by proxy improve the product
for everyone else that might be running
in similar situations as you. So it
looks like GPT-5 is still thinking. It's
going to update the constitution file by
replacing the placeholders
with concrete static site requirements.
So, we're going to wait for it to finish
this process
a few moments later.
All right, looks like it finished. So,
I'm just going to keep the changes here
and let's take a look at what it
actually said. Static-first
delivery, no server-side execution. The
site ships HTML, CSS, JS, and static
assets via CDN. That makes sense.
Simplicity over tooling. Prefer vanilla
HTML, CSS, JS. I like that. I think that
makes sense. Accessibility and SEO
baseline also makes sense. Performance
budget, you know, I don't
necessarily care for these things as we
are prototyping right now. For
production scenarios, you absolutely
should care about performance and security.
For now, I just do not want to deal with
that. Um, also things like requirements
here, development workflow, quality
gates. Um, yeah, this makes sense, but I
think, again, for what we're
trying to do right now, this is a little
bit too much. So, we're
just going to remove these pieces. And I
think the three principles,
the three articles of our constitution,
make sense. Static-first delivery,
simplicity over tooling, and
accessibility and SEO baseline, which
is good, right? So, now we have a
constitution. Um, now I can use the
specify command that I talked about to
define the baseline specification. So
let's do that. Let's use
/specify. And again, because I'm in the
right folder now, I see
/specify. And this is where I define, like
a true product manager, the what and the
why. We're not focusing on technical
requirements. We're not focusing on
saying use Next.js and this database and
so on so forth. Like we we don't care
about that at this point. This is all
about making sure that we are outlining
the motivation for the product and what
actually needs to be built. This is
helpful because for somebody reading
this, it is going to be completely
detached from the implementation. This
is important. The benefit of the spec
here is that it is completely detached
from the implementation. So if at some
point, you know, we're going to be
building this with Next.js, but at some
point you switch to Hugo or any other
static site generator. You use the same
spec. The spec is written, the
requirements are there. You just toss it
into an LLM and ask it to write you
three, four, five, six variants based on
the same spec. So kind of neat. a side
effect of what what this is about. So,
we're going to go back here and use
specify and this is where I'm going to
say what my requirements are. So, I am
building a modern podcast website
to
look sleek. We want to use the term
sleek. I think the youths use this
term these days, sleek. I want it to be
sleek. Um, something that would stand
out
should have a landing page
with one featured episode.
There should be an episodes page,
an about page,
and a
let's see an FAQ page.
uh should have 20 episodes
and the data is mocked. You do not need
to pull anything from any real feed,
right? So, we're we're just establishing
the requirements. We're detaching
ourselves from the technical details.
We're not thinking about any technical
details. We're just saying like this is
what you're building. So, I think this
is good enough for me to start. Again,
for a real production scenario, the more
detailed the prompt, the better for us
because we're looking at this very
baseline context. Uh, we're just going
to use this and I'm going to use
specify.
And what's going to happen here is it's
actually going to go ahead and use our
prompt file, right? Like this is going
to use the instructions in
specify.prompt.md because that's the prompt file
that drives the slash command, right?
It's going to read the script that is
referenced there. It's very important.
It's going to read the spec template
that is going to be used as a baseline
because look, there's a templates
folder. And it's going to ask us if we
want to let it run this PowerShell script. And
uh yeah, let's actually enable
auto-approve. YOLO mode to the rescue. And
because I'm running this inside a VM,
I'm not really worried about this. I'll
just say allow run this command. It's
going to run a PowerShell script, a
helper script. Great. It switched to a
new branch because, as I mentioned, this
is Git-based. So as you're running this
in your project, it is going to be using
these custom branches to help you
organize your work. That way you're not
damaging anything in production and you
can always roll back changes that you do
not like. Again, a side effect of the
spec-driven process is the fact that it forces
you to think about a lot of these things,
where as you're experimenting and as
you're iterating, you do not want to
interfere with the existing
implementation. Right? So the spec
establishes the baseline. You work
through it, you experiment, and then you
merge it into your main branch on an as
needed basis. So, looks like it's
reading some of the information from the
spec file that it helpfully created in
the specs folder. Uh, the file right now
is empty because it has not yet inserted
the template, but we'll see GPT-5 soon
plug in all the required information,
which is going to be nice.
Wonderful. We now have a specification.
I'm going to keep it because I just
trust it that much. Really,
don't trust the LLM to do
production software for you. Just check
it. You have to verify it. Check the
statements that are being made in
the spec. So, uh we're going to look at
the spec here. It created things for a
modern podcast website. Great. That's
that's what I asked it to do. Um let's
take a look here at quick guidelines
which is you know again it is the
guideline for the spec. We have to
respect them. The LLM has to respect
them. Focus on what users need and why.
Great. Um for AI generation all the
stuff that's again helpful baselines
that we need to maintain and respect. So
user scenarios and testing. All right.
There are some user stories, and if
you're a PM, this will sound mighty
familiar. We have some acceptance
scenarios. We have some edge cases,
right? Like it accounted for things like
episode ordering not specified and so on
so forth. There's also a bunch of
functional requirements. Again, if
you're a PM, you know, functional
requirements, they go into the spec.
This is, you know, it's done for you.
Um, there's a bunch of things that are
here, but also notice that there's a
review and acceptance checklist. This is
key here. I cannot emphasize this
enough. If you are having a
specification, if you're writing a spec,
you got to make sure that you have an
acceptance checklist and you got to make
sure that the acceptance checklist is
actually filled out. So things like no
implementation details, right? Like
languages, frameworks, APIs, because that's
not what we're focusing on. We are
focusing on the what and the why, not on
the how. Uh I also have details about,
you know, nontechnical stakeholders,
mandatory sections completed. One thing
that is not checked is "no needs
clarification." And that is actually
something that is missing here right now,
because if you look, there are items
that need clarification, like "episode
list ordering should be reverse
chronological, newest first; confirm order
requirement." Now because we're
prototyping and because we're
experimenting with this I can ask the
LLM to take a best guess. So we're just
going to do that for things that need
clarification.
Use the best guess you think is
reasonable.
Update
acceptance checklist
after.
And we're going to have the LLM basically
fill this out for us because I do not
want to think too hard about this prototype.
I just want to kind of vibe code, I
guess, but it's not really vibe coding
because I I am structuring this a little
bit better than just vibe coding. So
once the spec is established, mind you
this is very easy to share with your
team. So if somebody comes in and says
how did you build this website, look at
the spec, look at the rationale. You as
a human in the loop can go in and edit
this. People make the mistake of
thinking that, oh, the LLM produced this, so I
can only manage this with an LLM. No, it's
a Markdown file. Go in with your hands
and start typing and entering
requirements that you feel are, you
know, required for your product. So if
you feel strongly that the landing page
should have a logo centered at the very
middle and some gradient that looks
like a rainbow, you absolutely can do that.
Just go and add another functional
requirement. You can do this
manually. You don't need to ask the LLM
to do this. Um, again, this is
especially important if you're working
in, you know, enterprise environments or
environments where it's more controlled,
where you need to actually add specific
requirements. Sometimes the LLM cannot
guess for you, which is fine. You know,
manual work is still there, and we as
developers need to go in and do that
from time to time. So, we're just going
to wait for GPT-5 here to think a little
bit and
fill out our clarification items.
We now have the updated checklist. uh
allegedly. So let's scroll back here. So
all right, looks like we're good. No
needs clarification anywhere in the
code, which is our spec. It's not
actually code. It's a markdown file.
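
Because the spec is just a Markdown file, the "nothing left to clarify" check above can be done without the LLM. Here is a minimal sketch, assuming markers of the form `[NEEDS CLARIFICATION: ...]` and checklist items written as Markdown checkboxes. Both formats are assumptions for illustration, as is the sample text; match them to what your generated spec actually contains.

```python
import re

# Assumed formats (not confirmed by the video): a bracketed marker with an
# optional note, and "- [ ]" / "- [x]" checkboxes for the acceptance checklist.
CLARIFICATION = re.compile(r"\[NEEDS CLARIFICATION(?::\s*(?P<note>[^\]]*))?\]")
CHECKBOX = re.compile(r"^\s*[-*]\s+\[(?P<mark>[ xX])\]\s+(?P<text>.+)$")


def open_clarifications(spec_text):
    """Return (line_number, note) for every clarification marker in the spec."""
    found = []
    for number, line in enumerate(spec_text.splitlines(), start=1):
        for match in CLARIFICATION.finditer(line):
            found.append((number, (match.group("note") or "").strip()))
    return found


def unchecked_items(spec_text):
    """Return the text of every checklist item that is still unchecked."""
    found = []
    for line in spec_text.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group("mark") == " ":
            found.append(match.group("text").strip())
    return found


# Hypothetical excerpt from a generated spec.
SAMPLE = """\
- FR-004: Episode list ordering [NEEDS CLARIFICATION: confirm order requirement]
- [x] No implementation details
- [ ] No needs-clarification markers remain
"""

if __name__ == "__main__":
    for number, note in open_clarifications(SAMPLE):
        print(f"line {number}: {note or '(no note)'}")
    for item in unchecked_items(SAMPLE):
        print(f"unchecked: {item}")
```

A check like this could run before moving on to the plan step, so an unresolved item does not slip through just because the model reported that it was done.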
We're going to look here. It looks
great. I think we're ready to go to the
next step. The next step is we're going
to be using the slash plan command. And
this is where we actually specify the
technical requirements. So I'm going to
use
next.js JS
with static site
configuration.
No databases
and what to make sure that the site is
responsive
and ready for mobile because 2025 we
still have mobile phones. we need them.
So this is good. It gives us a baseline
for what to think about. So we're going
to use again the plan prompt. It's going
to run some helper scripts as well. So
we do need access to the terminal here
in VS Code or whatever agent you're
using. If you're using Claude Code, you're
just going to use Claude Code, because
Claude Code is very good about running
things directly in your terminal. So
it's also going to bootstrap some
additional content here that we'll see
shortly in the repository, things like
the plan and the contracts. And notice, most
importantly, that it does consult the
constitution. There is a step to read the
spec, constitution, and plan template. So the
constitution is in play. The set
of non-negotiable things that we talked
about earlier is still in play. Very
important. So, we're going to run the
script. We're going to allow it. YOLO.
Not quite yolo because I still had to
approve it for some reason. But, um, we
see the JSON here, which is the output
of the script because if it's JSON, it's
very easy for the LLM to parse and
understand it. Notice that it read the
constitution because these are
non-negotiable principles. We got to
respect them no matter what you're
doing. So you have this plan step that's
going to read it and then it's going to
go ahead and fill out the actual plan
and a bunch of additional metadata
around it that's going to help us
establish a good project baseline.
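
To illustrate why a deterministic helper that prints JSON is easy for the model to consume, here is a minimal sketch. It is not the actual Spec Kit script: the naming rule (a zero-padded number plus the first three words of the prompt), the `specs` directory, and the JSON keys are all assumptions made for the example.

```python
import json
import re


def feature_name(description, number, max_words=3):
    """Build a name such as 001-i-am-building from a feature description."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    return f"{number:03d}-" + "-".join(words[:max_words])


def describe_feature(description, number, specs_dir="specs"):
    """Return a JSON string with hypothetical keys for the agent to parse."""
    name = feature_name(description, number)
    return json.dumps(
        {
            "branch_name": name,
            "feature_dir": f"{specs_dir}/{name}",
            "spec_file": f"{specs_dir}/{name}/spec.md",
        }
    )


if __name__ == "__main__":
    print(describe_feature("I am building a modern podcast website", 1))
```

The point of the design is that the same input always yields the same names and paths, so the model only has to read the result instead of inventing it.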
Notice also that all the stuff
lives in a dedicated folder, 001-i-am-building,
because it just picks up from
your prompt and generates a name for
the feature. All the stuff is
grouped here. So later on, if you decide
to rebuild the entire feature, like you
see what it built and you're like, you
know what, Sonnet 5, or sorry, Sonnet 4, did not
quite do the thing that I expected it to
do. I want to switch to GPT-5, or I want
to switch to, uh, maybe GPT-4.1,
or try out how this works with maybe
Gemini Flash. You essentially have the
spec. You have all your artifacts. You
can just delete the source. The spec is
still there and then use a different
model. So you use the model switcher
here in the chat and just reimplement
it. That's that's it. You can just
rebuild it. You can add additional
requirements. If you see that the logo
is generated the wrong way or it's kind
of the the layout of the page is funky.
There's no header and footer. Go and
edit the spec and just rebuild the
source. Once the source is created, of
course, you can iterate on it
differently. Like you can add another
spec for another feature, for another
component. The one spec that we
have here, in what we refer to as a
greenfield project because we're
creating a new thing, is
just the bootstrap of the project. You
can do this exact same thing for any
other features. If you want to add
support for a Spotify player for example
in your podcast page just use specify to
essentially create a new feature right
like that's kind of it. you use plan to
go and create another set of technical
details for that feature because that's
what it's going to do. It's going to
create new subfolders that you can then
iterate on. So here we're going to wait
a little bit for GPT-5 to go and think
through. But notice that it actually
started thinking about things like
contracts because it's going to look at
data contracts that exist within the
pages. But uh I'm not going to disturb
it. I'm going to let it think and do its
job. And we're going to get back here
once we see the actual outcome of the
process.
No. God,
no. God, please. No. No. No.
No.
All right. We now have the plan. The
plan is good. We can take a look here.
We see plan. I'm going to keep all of
the changes. It created a bunch of other
metadata here. So we can scroll through
the plan. We can minimize the terminal
just a little bit. We see that it has an
execution flow for the plan. Great. Uh
technical context. Language/version:
JavaScript, TypeScript, and Node.js.
Great. Primary dependencies: Next.js,
static export, SSG, right? Because we
asked for a static site. It's great. It
thinks about that. Testing: Lighthouse
and some other libraries. Target
platform: static hosting over CDN: Vercel,
GitHub Pages, Azure Static Web Apps.
That is right. It actually made
some good assumptions. I did not specify
this. Uh keep in mind that if I wanted
for example to host things on say
Cloudflare or Azure, I can specify that
in the constitution. I can say that
everything that you do has to be
oriented for Cloudflare or Azure or
anything any any other provider whatever
you want uh or any other technical
requirements, by the way. So, um, we have
the project type, some constitution
checks, and there's a gate. Again, it
must pass the constitution check. It
must pass the requirements that we have
established for things like use
framework, single data model, avoiding
patterns. Yep. Architecture: static-first.
Dependencies: minimal. Yep, makes sense,
makes sense. There's some outline for
outline and research. All right, some
research artifacts were created. And by the
way, Copilot in this case did not do the
actual research. It used its training
data to come up with this research.md
file. Other agents, like
Claude Code, can actually go and do
research. So you'd get the
freshest information from the internet.
Um but it actually made some pretty good
assumptions here. So we have the
context, we have some of the outline on
the phases. Uh and now look, we're
missing the task generated. We're going
to run the task command now. Uh but
before we do that, let's take a look
here. So there is a research file like I
mentioned. So it looked through its
training data because all this is
generated by the LLM basically, right?
Like it's all whatever is embedded in
the training data. That's what it's used
for its inspiration instead of actually
going out into the web. Uh we can use
something like Beast Mode from our
good friend Burke Holland, uh, to force it
to do certain things, but I just used
standard agent mode here. I did
not use custom modes now, which is
fine for our prototype. It's okay. So we
have this. We have a data model
outline with, you know, fields for an
episode, which is again great. It has the
context: we're building a podcast
website. Some details about the site, some
validation rules, derived views. We
still have our spec. We have our quick
start that gives some idea of what the
site is about and the prerequisites that
are required for the site to actually
run. Again super super nice to have this
in one place. The spec and all the
artifacts become that piece of
executable context that you can pass to
your team and have them work on it. But
we have the plan. Now, as the plan
says, we've got to jump to our tasks. So we
can use the tasks command and say "break this down."
Right? So once again, it's going to run
some helper scripts that are going to
guide it through and it's going to break
the work down into manageable chunks
that the agent can tackle one by one
because that is super nice. It gives the
agent the ability to see exactly what it
needs to do instead of assuming that
it needs to take certain actions.
Right? So if I want a
test-driven development-based approach
where I want to have tests first, all
that stuff can be broken down
into individual tasks. It's going to do
that first. It's going to make sure the
tests are passing and then jump to
implementation of the data model and so
on and so forth. So we're going to
have it do some more work here.
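The tests-first ordering described here can be sketched as a small scheduler: the agent always picks the next unfinished task from the earliest unfinished phase, so implementation tasks are not reachable until the test tasks are done. The phase names, the `Task` class, and the task IDs below are assumptions for illustration, not the format Spec Kit generates.

```python
from dataclasses import dataclass

# Assumed phase names for illustration; the generated task list may use others.
PHASE_ORDER = ["setup", "tests", "implementation"]


@dataclass
class Task:
    task_id: str
    phase: str
    title: str
    done: bool = False


def next_task(tasks):
    """Return the first unfinished task of the earliest unfinished phase, or None."""
    for phase in PHASE_ORDER:
        pending = [t for t in tasks if t.phase == phase and not t.done]
        if pending:
            return pending[0]
    return None


# Listed out of order on purpose: the phase order, not the list order, decides.
tasks = [
    Task("T001", "setup", "Initialize project", done=True),
    Task("T003", "implementation", "Implement episode data model"),
    Task("T002", "tests", "Write failing test for episode data model"),
]

print(next_task(tasks).task_id)  # T002: the test comes before the implementation
```

Keeping the ordering in data like this is one way to make "tests first" something that can be checked, not only a convention the agent is asked to follow.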
Going to check task prerequisites. All
right. So it's going to use a tasks
template.
Going to set things up for us. And we're
going to get a tasks.md
file, another Markdown file in our
folder that we can use to actually
browse through the tasks. Right? Now,
because it's all Markdown, it's all
available to you as a developer. You can
use the LLM here in this chat view and
just guide it through all the things
that you need. Or you can just go into
the markdown and start tinkering with it
because it's so easily editable. It
doesn't need a proprietary editor of any
kind. You can just open this, you
know, anywhere else where you've
integrated Copilot. It doesn't
actually need to be VS Code; I just
happen to use VS Code. And again,
Spec Kit and Specify are compatible with
many other agents, and more agents are
coming. I'm working right now on
OpenAI Codex. Qwen, we're looking at
adding that as well. opencode, Roo
Code, all those wonderful
projects from the community are going to
be coming in. So again, I'm going to
let chat, or GPT-5 in this case,
think a little bit and create our task
list.
Look at that. We have our tasks file. So,
if I go here again, we're going to keep
all the changes as is. And we see a
bunch of... Let's actually reduce
the size here, right? So you can
actually see a little bit. We're going
to reduce the size of this as well.
And we see the sections here,
right? It just chunked this into
individual phases. We have setup:
initialize Next.js app skeleton. Yep.
Yep. That sounds good. Tests first: must
fail before 3.3. All right. Yeah, because
we do want to use test-driven
development here. Uh, okay, that sounds
good. Core implementation: only after
tests are failing, right? Because we're
setting up the tests first. Makes sense.
About page, and so on and so forth. Integration,
refinement, run Lighthouse. And this is
where you can
tweak, you can remove things you don't
like. You know, do you want to run
Lighthouse on a prototype? Maybe, maybe
not. And of course, polish:
responsive images, documentation, all
good. Accessibility polish. Sounds good.
I think we're ready. I think we're good
to implement this. So, I'm just going to
go ahead and switch my model to Claude
Sonnet 4 because I like this the most
for code. GPT-5 is good at setting up
the spec scaffolding for us,
but for creative output, Sonnet 4 is
still unbeatable to me. So I'm just
going to say: implement
the tasks for this project
and update the task list as you go.
And
now we're going to let the agent run
wild and go and implement our website.
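Because the task list is plain Markdown, "update the task list as you go" amounts to flipping checkboxes in a text file. The sketch below assumes tasks are written as Markdown checkboxes with IDs such as T001. That layout, the sample text, and the `mark_done` and `progress` helpers are assumptions for illustration, and the real tasks.md may be laid out differently.

```python
import re

# Assumed line shape: "- [ ] T001 Some task title" (checked: "- [x] ...").
TASK_LINE = re.compile(r"^(\s*- \[)([ xX])(\] )(T\d+)\b(.*)$")


def mark_done(markdown, task_id):
    """Return the Markdown text with the checkbox for task_id ticked."""
    out = []
    for line in markdown.splitlines():
        m = TASK_LINE.match(line)
        if m and m.group(4) == task_id:
            line = f"{m.group(1)}x{m.group(3)}{m.group(4)}{m.group(5)}"
        out.append(line)
    return "\n".join(out)


def progress(markdown):
    """Return (done, total) counted over all recognized task lines."""
    states = [m.group(2) for m in map(TASK_LINE.match, markdown.splitlines()) if m]
    done = sum(1 for s in states if s in "xX")
    return done, len(states)


# Hypothetical sample, loosely modeled on the phases shown in the video.
SAMPLE = """## Setup
- [x] T001 Initialize Next.js app skeleton
## Tests first
- [ ] T002 Write failing test for episode list
- [ ] T003 Write failing test for about page
"""

updated = mark_done(SAMPLE, "T002")
print(progress(SAMPLE), progress(updated))
```

The same file stays readable and editable by hand, which is the point made earlier about not needing a proprietary editor.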
Tada. Looks like it did it. It
finished the work. It did the things
that it was supposed to do. Now, I have
not seen the output. This is going to be
a surprise. So, let's keep all the
changes. Let's make sure that we keep
it. It updated all the tasks. It failed
on some Lighthouse tests, but that is
because I don't have Chrome installed.
But we can actually run npm.
I like to run npm run build.
Let's build our static site,
and then we're going to run npm
run dev to actually see it in action once
it builds. So we'll
see what it looks like.
All right.
Collecting build traces being the most
time-consuming thing, apparently.
All right. And then npm run
dev.
Okay. Localhost 3000.
See, let's see what our podcast site
looks like. It's loading.
This could be horribly bad. But actually,
look, it's not bad. You
know, "Master the art of podcasting."
Featured episode looking good. Has all
the details. Pod site. I have my About
page. If I go here, there's a nice
description. All right. FAQ.
Okay. Yeah, there's there's an FAQ.
All right. Great. Episodes. Let's take a
look here. We have individual podcast
episodes with links to the podcast
platform. And keep in mind that in this
context, what I can also do, if I plug
in MCP tools like Figma MCP, is link
to actual designs. So I can actually get
it to build things that fit the design
system of my organization. So I don't
have to randomly assume that it's going
to build the right thing. So this is
kind of nice. It did create all
this stuff. Now you might actually ask
yourselves, well, is this really
better than vibe coding? I could
have vibe coded this entire pod site,
right? But the thing is that now that
I have the spec, now that I have the
artifacts here in my spec
implementation, I can easily customize
it. I can now tweak it. I can add things
like: I want to make sure that the color
is a certain way. I want to make sure that
a specific design decision is being
made. And that makes it easier for me to
then reimplement and rebuild things and
add features in a structured
way, because that then can be used as
context. So something that my colleague
and friend John Lam is working on is
optimizing some of these context
acquisition strategies. So that's going
to change in the future. But you can
imagine that as I have more of this
context in the memory of the system that
is being built, it actually makes it
easier to build consistent software. So
hopefully this is a great intro for
you to see how Spec Kit works, what
the artifacts are, how it produces
them, and how you can then tweak them. I
really, really hope that you go to Spec
Kit. You try it out. You get it running
on your box. You see what works and what
doesn't, especially as you start
compounding many many different features
and bug fixes and all these things for
your projects. Let me know. Go to
github.com/github/spec-kit.
Download it. Use it. Tell me what's
wrong. And then I will see you in the
next video where we're going to talk
about more complex things that are also
being accomplished with the help of
spec-driven development. I hope you
enjoy this. I'll see you in the next