
Spec-Driven Development: The Future of AI Coding | Guy Podjarny (Tessl)

By API Excellence

Summary

## Key takeaways

- **Shift to Spec-Centric Development**: Software development will transform with AI from defining software through code to defining it in specs capturing what you want to build, constraints, and helpers as the canonical definition. This opens opportunities for easier creation, autonomous maintenance, adaptation to contexts, and tapping agentic power. [01:12], [01:44]
- **Three Stages: Assisted to Centric**: Move from spec-assisted (docs and guidance for agents), to spec-driven (modify the spec first, before code changes), to spec-centric (comprehensive specs where code is disposable and regeneratable). Agents handle the burdensome spec-first updates humans avoid. [03:42], [04:58]
- **Vibe Speccing for Prototyping**: For quick prototyping, let agents create "vibe specs" first, then code, separating intent from implementation. This captures decisions as long-term memory against agent amnesia, allowing agents to diagnose a spec misunderstanding vs. a coding error. [10:10], [10:50]
- **Specs Incomplete for Adaptability**: Unlike deterministic source code, specs are intentionally incomplete, letting LLMs fill gaps for usability and adaptation across loops, stacks, or optimizations. Define what matters, leaving implementation details like for vs. while loops to agents. [12:46], [14:05]
- **Tessl Spec Registry Revolution**: Tessl's spec registry is a dependency system for knowledge, like npm, with over 10,000 versioned usage specs for open-source libraries to make agents reliable on niche, old, or new versions. Install via the tessl.json manifest to avoid hallucinations and inefficiency. [29:23], [30:50]
- **Autonomy Levels Like Self-Driving**: Spec development mirrors self-driving car levels: assisted (level 1), driven (level 2), centric (level 3) with regression tests, up to full autonomy (level 5) without code viewing. By 2027, most agent work won't involve looking at code. [22:40], [24:13]

Topics Covered

  • Code becomes disposable in spec-centric development
  • Vibe spec for agent prototyping
  • Specs embrace incomplete adaptability
  • Spec development mirrors self-driving levels
  • Spec registry dependencies for knowledge

Full Transcript

Welcome back to round two with Guy Podjarny, famous founder of the hugely successful Snyk and now of a new company called Tessl, and host of the AI Native podcast. He's here today to join us talking about a number of things: his startup and AI development. He's joining us for our AI week to celebrate the launch of Zuplo's AI Gateway. Hello again, Guy. How are you doing?

I'm doing well. Thanks for having me on the show.

Yeah, no, it's great to see you again. Last time we were in person in London. This time we're doing it remote. Both in London, sadly. I think you're in London as well, right?

I am in London as well. Yeah.

Yeah, I am. But we couldn't make it together. We wanted to get this done, so here we are. We're going to have a great conversation about spec-driven development and so much more. So Guy, I think to set us up, tell us a little bit about your new company, Tessl. What's going on?

Sure. So Tessl is a platform built to pioneer spec-driven development, or really, we oftentimes think about it as spec-centric development. We were founded on the belief that software development will transform with AI. It's not just the evolution and the assistance; it is a real transformation. A core part of that is that we will move from defining our software through its code, through the implementation, to going up a level and defining it in specs: capturing what it is that you want to build, any constraints that you want to state about how you want to build it, any helpers and any definitions, and having that be the canonical definition of what your software is.

Once we achieve that, it opens up a treasure chest of opportunities. It makes software easier to create and more accessible, of course, but it also creates software that can be autonomously maintained, because you really have the bottom-line requirements, specifications, and tests. We can talk more about that. It can be adapted to different contexts: you can change the context, your business rules, your budgets, the programming language or stack that you want to run on, and adapt it. And in general, if you have the right definitions of what the software is and what it must do, and you have good tests, then just like any development setup where you can run faster if you have better tests and better definitions, you can tap into more agentic power to run autonomously and to evolve and build up your software. So you can make software faster, cheaper, more accessible, more funny if you want; you can adapt it as you wish.

So we think this is the future, and we're setting out to say: okay, we believe it'll go there, but what does it look like? How do you create software? How do you debug it when there's a problem? How do you observe a running system? What is the life cycle for software like that? And what we can also talk more about is where it starts, because the LLMs are still quite limited today, as powerful as they may be. So how do you get going in this transformation?

Interesting. So you mentioned, I think you called it spec-centered development or spec-centric, and then I said spec-driven, which I think I've seen on your site as well, or I've heard mentioned around Tessl. That reminds me of test-driven development and agile development. It sounds like a different way of thinking about how to build software, whereas everyone right now is excited about the latest version of Cursor and GPT-5 and these iterative advancements that are coming out. Tell us more about this. I know what test-driven development is. I know what agile development is. How would I think about spec-driven development? How is it going to change my world as a developer?

So maybe let's describe this a little bit as a journey. You can think of it as moving from spec-assisted to spec-driven to spec-centric development.

Spec-assisted development is really, to an extent, a form of docs and guidance, and we see this rapidly becoming the best practice in any agent usage. You basically have some information that helps you do your job. This might be your definitions of how you work over here: your casing preferences, or the stacks that you work with, or your policies, your guidelines. It might be information and documentation about what it is that you've built in your product. All of that is knowledge-assisted, and we call it spec-assisted because you want to capture this information in a way that is easily consumable, accessible, and available to the agent, with the right emphasis and the right definition. So it's not just random docs that you came across. It is intentional information that you want to provide to an agent or a developer; we're going to talk a lot about agents. So that's spec-assisted development. The truth is in the code, but all of this information helps the agent do its job correctly.

Spec-driven development is when you actually capture some subset of the definitions in a spec and say, okay, now I want to make sure that this is always the source of truth. Maybe you describe your checkout process, maybe it's your algorithms, maybe it's an aspect of your APIs or their data models. It could be the rules of a game, whatever it is that is the definition. To use spec-driven development, you need to first create a spec before you write the code. You could write the code and then create the spec from it; you can bootstrap that way. But once you've created it, you have the spec, and then every time before you make a change, you first modify the spec and then you apply the change. Now, that's very bothersome for humans to do, but agents can do that for you, so you can really stay as simple as you can. And once again, with all these things, we can talk about usability: you can go off the track a little bit as long as you don't veer too far and you are able to come back. So it's spec-driven because you first modify the definition and then you apply it. That's what a great developer does: when you come and say, hey, I want you to make this change to the code, they'll pause and ask, well, what's the impact on the product? What is the definition? Only once they've got that would they go off to write the code. So that's an example.

The last bit is spec-centric, and that's really about the comprehensiveness of what you have in the spec. You say, okay, I have so much information in the spec and tests (and again, we'll come back to soft guidelines and soft guardrails versus the hard ones), enough information that you can really throw away the code and regenerate it. The code becomes disposable. This is now spec-centric software, where the implementation can really be adapted to your specific context. It doesn't mean that the code that gets generated is the same every time, but it means that all the things that matter about it are. Spec-centric software is a bigger leap of faith, because it's kind of hard to delete the code. But the beauty of it is: okay, now you've defined it, you can indeed create it in Node or in Rust. Or take a big problem with agents, which is all the spaghetti code that happens because you kept appending this and appending that. If you have all the things that matter captured in the spec and you have the tests to guardrail it, then you can say, well, throw away the code. Now we know where we are. Create new code, good code, that is perfectly designed for the system. And heck, you can do that once a day, or every time you modify the application or there are new changes.

The reason for the continuum is that it's hard to get to spec-centric, but I think as we evolve, we go from spec-assisted to spec-driven to spec-centric development.
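The spec-driven rule from this answer (modify the spec first, then apply the change to the code) can be sketched as a small check over a change set. This is only an illustration under stated assumptions, not Tessl's tooling: the `src/` and `specs/` layout, the `.spec.md` suffix, and the function names are all made up for the example.

```python
from pathlib import PurePosixPath

# Hypothetical repository layout (an assumption for this sketch):
#   src/checkout.py         <- implementation
#   specs/checkout.spec.md  <- the spec assumed to define it


def spec_path_for(code_path: str) -> str:
    """Map an implementation file to the spec file assumed to define it."""
    stem = PurePosixPath(code_path).stem
    return f"specs/{stem}.spec.md"


def code_changes_missing_spec(changed_files: list[str]) -> list[str]:
    """Return code files that changed without their spec changing too."""
    changed = set(changed_files)
    return sorted(
        path
        for path in changed
        if path.startswith("src/") and spec_path_for(path) not in changed
    )


if __name__ == "__main__":
    change_set = ["src/checkout.py", "specs/checkout.spec.md", "src/cart.py"]
    for path in code_changes_missing_spec(change_set):
        print(f"{path}: update {spec_path_for(path)} before changing the code")
```

A check like this would only enforce the ordering; whether the spec edit actually captures the intent is still a matter for review, by a human or an agent.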

Interesting. Yeah. So there are a few actors there. One is the developer, and then you introduced an agent, and you made it sound like the agent was updating the spec, and that actually surprised me a little bit. I thought the developer, the human in the loop here, would be updating the spec, and then agents would be running off making sure it meets the tests and so on. So you actually imagine the agent is also part of the workflow that's updating the specification.

Yeah, absolutely. I think that's quite critical for usability. First of all, the agent is obviously the biggest consumer of the spec, and unlike humans, agents will read the docs that you give them. That is a bit of a new opportunity that we have. In theory you can say every developer on the team, before they write any code, should go seek out all the documents that they need to read. It's like, yeah, good luck, that's not going to happen. But with agents you can actually make that happen. You still want to make it easy for them, but you can make that happen. So it should be pretty clear that they're the biggest consumers.

In terms of creating, there are two very different paths that you can take, and both are valid. They depend on the situation and the individual. One is what you might have had in mind, and maybe it's more aligned with how you think about test-driven development: I want to create the spec and then scrutinize it. I want to make sure that my intent is well captured. Maybe I even want to add tests. Maybe I want to add them after, but I want the good definition, and then I want the agent to go off and build it. The agent then has more autonomy because you've defined the guardrails. And again, when you read about best practices, people improvise ways to do that today: let me just write that down; they use multiple LLMs to review one another; I have the spec, I'm happy with it, I'll go. I think that is a very healthy way to do it when you know what you want to build and when you feel inclined to invest this time.

But that's not always the case. Sometimes you're prototyping, so you just want to get to something working, see how it feels, and evolve from it. And sometimes it's not really an intentional choice. We've all grown quite ADHD and are keen to get instant gratification, so people just don't have the time. So you can do what we've started calling vibe speccing, which is you just go off and have the agent create the spec and then create the code. Really, you're just guiding the agent to separate its definition of how it understood you, of what it means to build, from the implementation. Then when something goes wrong, you can come back and say: did you misunderstand me, or did you just fail to implement? And not only can you do that, the agent is actually quite good at it. If you say, hey, this thing is broken, it would actually be pretty good at figuring out, oh, I got the intent wrong versus I got the coding wrong.

Furthermore, even if you vibe spec and you never even read the specs, the agent has made a bunch of decisions. Now you commit this code and come back the next day. Agents by default have amnesia; they don't know yesterday's decisions. They're like a new developer who just joined the team and is trying to write code without any further context about why past decisions were made beyond what is in the code. So even if you vibe spec, the previous intents are captured in the spec, because they're committed with your code and, again, sometimes even guarded with tests. So even if you don't invest more in the core specifications, you still get this long-term memory effect that allows you to scale with agents. I don't think either is right or wrong. It's just situational: when do you want to use one approach or the other?
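The "did you misunderstand me, or did you just fail to implement?" question becomes mechanical once intent and implementation are kept separate. The sketch below is one hedged way to picture it: for a single example case, compare what the user expected, what the spec recorded, and what the code does. The shipping rule, the function names, and the three-way `Diagnosis` are hypothetical and not taken from Tessl.

```python
from enum import Enum
from typing import Callable


class Diagnosis(Enum):
    OK = "behaves as the user expected"
    INTENT_MISUNDERSTOOD = "the spec itself records the wrong intent"
    IMPLEMENTATION_FAILED = "the spec is right but the code does not follow it"


def diagnose(user_expected: int, spec_says: int,
             implementation: Callable[[int], int], case: int) -> Diagnosis:
    """Compare the user's intent, the recorded spec, and the code for one case."""
    actual = implementation(case)
    if actual == user_expected:
        return Diagnosis.OK
    if spec_says != user_expected:
        # The agent wrote down the wrong thing before it wrote any code.
        return Diagnosis.INTENT_MISUNDERSTOOD
    return Diagnosis.IMPLEMENTATION_FAILED


def shipping_cost(order_total: int) -> int:
    """Hypothetical generated code: free shipping only above 100."""
    return 0 if order_total > 100 else 5


if __name__ == "__main__":
    # The user wanted free shipping above 50, so an order of 60 should cost 0.
    # Case A: the spec recorded "free above 100", so it says 5 for this order.
    print(diagnose(user_expected=0, spec_says=5,
                   implementation=shipping_cost, case=60).name)
    # Case B: the spec recorded the right rule, but the code ignored it.
    print(diagnose(user_expected=0, spec_says=0,
                   implementation=shipping_cost, case=60).name)
```

Without a committed spec there is no `spec_says` value to compare against, which is the amnesia problem described above: only the code's behavior is left to inspect.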

That does make sense as well, actually, now you mention it. I think about a spec: whilst it's human-written, it can still be flawed and have bugs in it, right? I can write parts of the spec that actually compete with each other and don't make sense logically, and you could have agents identify that. Maybe my spec says I shall follow the banking regulations of the UK, and then I write something in the spec that doesn't allow for that, and the agent could potentially do that work for me.

But if George asked for a transfer, don't check anything, you know, with the...

Exactly. Exactly. Yeah. And then, as I think about it now, it's kind of interesting, actually. When you said step up a level, your description of this really made this clear to me. The spec is the source code now. Just like when I take source code and rebuild it through compilation every time I do a build, that's almost how I might think about spec-driven development. The spec is the source code and I generate the code every time. Every time I do this, that's like a compilation step in modern development.

So I think that's an apt analogy, but I think it really touches on the primary distinction, which is that specs are incomplete. With source code, you should be able to deterministically come in and... well, that's actually not entirely true. There are some things that are incomplete: you don't choose how to optimize your code, and there are rare cases, originally less rare, where the optimization might break the application. Maybe memory management becomes something a little bit different. But generally it's a deterministic process to go from the code to the generation.

LLMs have a blessing and a curse, which is that they will fill in the gaps. So a legitimate spec is "an online store to sell, I don't know, bike miniatures." And that's legit: whatever it doesn't know, the LLM will just complete, and it will build it. Of course, we guide the agent so it writes more of these decisions down, at least in the document. But even then, the spec is not comprehensive. The advantage of that is, first, it's just more usable for a human. Some make the case, and it's not entirely wrong, that if the spec is so detailed that it is deterministic, then it might as well be code. The second aspect is that it is adaptable. What you do is define what matters. You don't actually care if it's a for loop or a while loop. You don't actually care if it stored the items in an array or a vector or a hashmap. If you leave those decisions to later on, to downstream, then you get more opportunities for optimization. You get more opportunities to say, well, in this specific surrounding, this is the right way to do it.

That part is actually oftentimes quite hard for developers. They look at the code and say, but you didn't write it like I want to write it. And the exercise here is: no, well, extract what it is that you want to achieve out of the code. I'm one of those annoying product managers who is a developer by trade. Now I'm an entrepreneur, but when I was a PM I would annoyingly go to the developers and say, well, build it like this, build it with this implementation. And they would say, get out of our way; tell us what you want to achieve. What is it that you want to accomplish? So the specs are a little bit like that. As the PM in this context, you are saying what you want to achieve, what is important to you, what emphasis you actually want, and then you let the AI labor make the decisions around how to implement it. That allows those systems to adapt it, but also, oftentimes they know better; they might indeed look for information elsewhere or make an assessment.

So I think it is a higher-level abstraction, but it is more than that. And this becomes especially tricky when you start saying: imagine software that you would compile four times, then pick the best compilation and have that run. Now you can do something like that, which is you generate. But similarly, how do you debug a system like that? Do you know what you get when you compile it next time? You see this a little bit when you build with Docker latest: you came along and just ran it, and it is and it isn't deterministic, because you pull down latest every time, but latest changed over time. That's true for any dependency. So it's partly a higher-level abstraction, and partly an attempt to define what matters and an embracing of the probabilistic reality, or the dynamic reality, of software.
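Two ideas from this answer fit in one small sketch: the spec pins down only what matters (here, outputs for given inputs), so a for-loop and a while-loop implementation are equally acceptable, and several generated candidates can be filtered by the spec before the best one is picked. Everything in it is an assumption for illustration: the example cases, the candidate functions, and the made-up cost numbers standing in for whatever "best" means in a given context.

```python
from typing import Callable, Sequence

Impl = Callable[[list], int]


def total_with_for(prices: list) -> int:
    total = 0
    for price in prices:
        total += price
    return total


def total_with_while(prices: list) -> int:
    total, index = 0, 0
    while index < len(prices):
        total += prices[index]
        index += 1
    return total


def total_broken(prices: list) -> int:
    return len(prices)  # a deliberately wrong candidate


def satisfies_spec(impl: Impl) -> bool:
    """The spec fixes only input/output behavior, not the loop style."""
    cases = [([], 0), ([5], 5), ([1, 2, 3], 6)]
    return all(impl(list(inputs)) == expected for inputs, expected in cases)


def pick_best(candidates: Sequence[Impl], cost: Callable[[Impl], float]) -> Impl:
    """Keep candidates that satisfy the spec, then choose the cheapest one."""
    passing = [candidate for candidate in candidates if satisfies_spec(candidate)]
    if not passing:
        raise ValueError("no candidate satisfies the spec")
    return min(passing, key=cost)


if __name__ == "__main__":
    # Hypothetical costs; in practice this could be latency, size, or budget.
    hypothetical_cost = {"total_with_for": 2.0, "total_with_while": 3.0,
                         "total_broken": 1.0}
    best = pick_best([total_with_for, total_with_while, total_broken],
                     cost=lambda impl: hypothetical_cost[impl.__name__])
    print(best.__name__)
```

The broken candidate is the cheapest by the made-up cost, but it never reaches the comparison because it fails the spec checks first.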

It's interesting, actually, and I have a similar background: developer-cum-PM. So yeah, I've been that nightmare PM for some folks. It makes me think, and we kind of come back to test-driven again: you're totally correct to push back on the analogy, in that it is nondeterministic. But you kind of need some determinism about some things, right? The system needs to always do X and Y the way the spec says. So how do you enforce that in spec-driven development? Does the agent write tests? Does the spec contain all the tests? I need determinism on some set of things, even if it's not whether you're using a for loop or a while loop. I need it on the outputs, right? How do you do that in this?

So I think this is very much, again, one of the journeys where we need to evolve what the best form factor is, but here's our current view. First of all, the spec is a very loose format. It is a Markdown document. It can refer to other things, anything the LLM can interpret, but it has a certain structure. In the context of Tessl, we use HTML-like attributes to identify specific areas in it, like if we define a dependency that we want to enforce, because dependencies are kind of delegations of decisions, or an API section: things that are harder contracts. So we tag those to be processed a bit more deterministically. That's one bit of determinism. Most of it is just words or assets, whatever it is that you want to pass to the LLM, but some of it is enforced a bit more deterministically. Very little.

Beyond that, we do think tests are critical for specs to become viable software drivers. The counter to that is that when you're prototyping, when you're still developing, you oftentimes don't write the tests right away. So we created what we call a loose-then-strict mode of operation. When you use the Tessl framework (we'll talk about the framework and the spec registry in a sec), first of all, by default your specs will be created without tests, and that's what we call an inspirational spec. It is spec-assisted. It is just information for you to remember. Then, whenever you're pleased with this reality, you can tell the agent, because you'd be using Tessl via the agent: generate tests and make sure they pass.

At that point, you might go grab a coffee, because this takes a moment. The previous creation, because it's just specs, you can iterate on; it's not a lot of overhead. But with this test part, you need to set up the test environment and generate the tests, and the LLMs are still limited, so they create bad tests at the beginning. Also, when at the very beginning you generate a test and you generate code and the test fails, you don't know who's wrong. Is the test wrong or is the code wrong? So you want to try and build a flow in which you know what is correct at any point in time. If you've generated some code, you've looked at it, and you now say, okay, the code is correct, then: now generate tests that this code passes, and if there's a conflict, if you can't do it, surface it back to me. So once you're ready, you can generate those tests and they'll be there.

Once you've generated tests, those tests remain as regression tests. Let's say the next day you come along and make some other changes. Your new functionality will be generated without tests, but the regression tests still have to pass. Those tests are now deemed correct, so the agent will attempt to keep the code in line with the regression tests. If it can't do that, it'll surface it to the user. So you don't need to worry about it breaking old functionality, and once you're happy with the new functionality, you would say, cool, generate the tests now for me.

I think a lot of these user flows, of how you balance latency with determinism and durability, whether you write the specs up front or after, are part of the learnings that we've had around not just the correct behavior, but what is the right usable way of building software in this fashion.

Interesting. Okay. I want to spend a little bit of time soon getting into more of the realities of Tessl and what it looks like. I think we stayed a little away from that; you got into it a little bit there. Before I do, how transformative is this going to be, and on what timeline? How do you think about the timeline of a developer almost never looking at code again? Have you imagined phases of this journey? You've described some already.

Yeah, I I uh I like the analogy to self-driving cars. Um, and so if you

self-driving cars. Um, and so if you think about self-driving cars, just for context quickly, there's sort of these uh well- definfined five levels of autonomy. And uh they go from level

autonomy. And uh they go from level zero, which is beyond the five, which is no assistance, to driver assistance, right? It might tell you if you're

right? It might tell you if you're getting off the lane or if you're getting close to a car to uh level two is really where the specific functionality or or situations in which

the car is autonomous, like automated uh parking or uh you know, keeping you in the lane or keeping you a distance in cruise control. So these are specific

cruise control. So these are specific behaviors in in which you're you're sort of handsoff eyes on you know you're you're running it. Uh and and those like levels one two three are the ones that

are u sorry two is is that sort of definition. Level three is when there

definition. Level three is when there are sort of specific uh areas in which it can drive autonomously. So Whimo

today is like kind of level three to four uh in which you know specifically in San Francisco in areas where the streets have been sufficiently photographed and you know everything is well known and there's enough space it

can drive autonomously and again you're supposed to be uh uh attentive a little bit but you know in the case of Whimo is a good example you know you can't actually control the car uh but in some cases it stops and it says I need a

human to to help me guide out of that.

Um, level four is is now starting to get abstract and it's debatable whether it's there, which is it can drive in most conditions. So, in almost all

conditions. So, in almost all conditions, but maybe it passes a construction site and it's problematic or there's a massive puddle or there's a and in those cases it needs to air. And

level five is a case in which the car has no wheel. Um, there's no steering wheel. Like there's no there's no way

wheel. Like there's no there's no way for you to control the car. It is

entirely autonomous. It's a bit fictional. And so I think code is interesting like that, right? We think about spec-assisted development as that level one: there's just additional information. It helps you a little bit, but you're in control. Level two is really spec-driven. There are specific domains in which you stay in control. For much of the functionality, the important bits, you might be hand-holding, but there's a lot of the work that you want to define in a spec-driven fashion and hand off. And I think when you get to level three, it's interesting: that's when you have spec-centric software, and you can say, well, as long as regression tests haven't failed, you can go ahead and make modifications, and so most of the system is defined by your spec. I think it gets a bit more dramatic at that level. And just like with self-driving cars, levels four and five get a little bit more abstract. Level four is really when the LLM has enough information to break regressions, right? If you know how the system is used, then you say, well, fine, it'll break it for these 3% of users, but these 97% are better off. And at level five you can't even look at the code.

So I think, in all of that context, hopefully that represents a bit of an evolution for developers. In 10 years' time I think almost all software will be at that level three, four, and some five. I think for very regular systems you will not be able to see the code. For applications like the ones you would build with Lovable and the others, there's no reason for you to see the code, and oftentimes you've lost that proficiency. But in the more imminent term, what we're seeing with agents is that you really need to be at least spec-assisted right away. So probably within the coming year, within 2026, everybody would be at least spec-assisted, and I think by the end of 2027, when you're working with agents, most of the time you wouldn't look at the code. You'd have the code available to you, but most of the time you wouldn't look at it any more than you look at a framework's piece of code.
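As a rough illustration of the levels described above, here is a minimal Python sketch. Everything in it is hypothetical: the names `SpecLevel`, `ChangeReport`, and `may_apply_unreviewed`, and the 5% threshold, are assumptions made for this example, not anything Tessl ships. It only encodes the idea that at level three an agent may change code freely while regression tests pass, and that at level four it may accept a regression when usage data says most users are better off.

```python
from dataclasses import dataclass
from enum import IntEnum


class SpecLevel(IntEnum):
    """Hypothetical labels for the five levels described in the talk."""
    ASSISTED = 1          # specs are extra context; the human stays in control
    DRIVEN = 2            # selected domains are defined spec-first and handed off
    CENTRIC = 3           # most of the system is defined by the spec
    REGRESSION_AWARE = 4  # the agent may accept a regression given usage data
    CODE_UNSEEN = 5       # nobody looks at the code


@dataclass
class ChangeReport:
    """Outcome of an agent's proposed change (illustrative fields only)."""
    regression_tests_pass: bool
    users_improved_pct: float = 0.0
    users_broken_pct: float = 0.0


def may_apply_unreviewed(level: SpecLevel, report: ChangeReport,
                         max_broken_pct: float = 5.0) -> bool:
    """Decide whether the agent may apply a change without human code review.

    The 5% default is an assumption for this sketch, not a recommendation.
    """
    if level < SpecLevel.CENTRIC:
        return False  # levels 1 and 2: a human still reviews the code
    if report.regression_tests_pass:
        return True   # level 3 and up: passing regressions is the guardrail
    if level >= SpecLevel.REGRESSION_AWARE:
        # Level 4 and up: a regression can be acceptable if few users are hit
        # and more users benefit than break.
        return (report.users_broken_pct <= max_broken_pct
                and report.users_improved_pct > report.users_broken_pct)
    return False
```

For example, `may_apply_unreviewed(SpecLevel.REGRESSION_AWARE, ChangeReport(False, 97.0, 3.0))` would allow the "3% broken, 97% better" change mentioned above, while the same report at `SpecLevel.CENTRIC` would be rejected.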

Makes sense. So I think it's a good point to start talking about Tessl in a bit more detail. Or actually, no, one more question on that. What is it that's setting that timeline? Is it just the advance of LLMs? Is it patterns like what you're talking about here?

Yeah, I think LLMs offer this massive opportunity for existing developers and development shops to be able to just produce a ton more. And so there the driver is: I want to use agents more, but today they're just so incredibly untrustworthy. Part of that is because we just don't give them the information they need to succeed. And part of it is because we don't have the guardrails, so they can just break things all the time. So for those, the motivation will be that there's just a lot of pull toward being able to tap AI labor to do this work, so you can do more. And that naturally implies that the human labor moves up a level of breadth, and they need these types of tools. The second category is creators that previously could not create code. So those would be Lovable, Bolt, v0, Base44. They serve primarily that audience, and those creators are now able to build things. This is like someone creating digital music with digital tools even though they can't play an instrument.

Yeah.

So for them there's a new world of creators, and they similarly crave an ability to build bigger and bigger, because they're creators just like developers, but they do not have the technical chops to go beyond the current limitations of the agents. So I think both of those move you towards being able to define your software either like a manager or like a non-technical person, so you can get the software, or the LLMs, to do more of the work.

Got it. Got it. Interesting. Okay. Makes sense. So let's talk about getting into spec-assisted development then. I think about myself these days: I don't get to write a lot of code, but I am quite good at annoying the engineering team by coming to them with demos of things I built via what I'd call prompt-driven development, using v0 and the usual tools that you mentioned. So I'm excited by spec-assisted now. I'd like to start trying it out. What are my options in the real world today to start going spec-assisted?

Yeah. So I think there are multiple ways to answer that. First of all, any development with any agent has some element of spec-assisted, because you're unlikely to get very far at all if you don't start by creating a CLAUDE.md, an AGENTS.md, a Cursor rule: some definition that says how you want things done. Otherwise it's just too open-ended, and so the base level is always a little bit spec-assisted.

The second thing that you would probably need to do today is start providing the LLM with context: what is the broader organizational context? Where are your docs? What are your internal tools or internal platforms? What are your APIs? And this is a domain, if you don't mind me talking a second about Tessl here, in which we think there's a revolution to be had. Today's solution is either that you handcraft all of this information into all of the CLAUDE.md files, and by the way often multiples: you might use Claude but also Cursor and also Devin, like we do here at Tessl.

And so you might even use an AI gateway to help with those transitions, you know, just perchance. But anyway, please continue.

That is a good point, but that is still the engine interpreting a lot of these things, and oftentimes it still needs some sort of consolidation. So you can handcraft and write those to each of these config files and areas, and that's clearly non-scalable.

The second option you have is that the agents somewhat randomly choose to browse different domains, and they need to do two things. One, they need to find the right information. And two, they need to successfully extract that information. It's also a little bit wasteful, because they need to do that again and again. And so there's a problem around discovery, there's a problem about efficiency, and there's also a problem about versioning.

Generally, when you have your docs available, what if you're using something that's a little bit older, or a little less popular, or you don't have a great resource? We believe that all of those are good tools when you are veering off the path, but much of the knowledge in a project really should be part of the project itself. As we think about becoming more and more defined by your specs as a journey, you want to say: if there are libraries that I use, if there are practices that I apply, if there are policies I need to conform to, those things should really be part of my project.

What we've done at Tessl is we've created the Tessl spec registry, which is basically a dependency system for knowledge. It's a registry just like npm or PyPI or a bunch of others that we're well familiar with. It has versioned packages, and those packages contain specs. Those specs are modular themselves: they are structured so that agents can consume them well. They are multiple files, linked or included in a very thoughtful fashion. They come with some steering for the agent to know when to load them. So they have some information at the top that says, "Hey, when you're about to do something that uses this feature of React, here's a chunk of information you might want to call." And they are versioned, and they relate to what you're consuming. So for instance, if you're using a slightly older version of, whatever, Spring, then you can pull down the relevant package with the relevant specs, the relevant spec pack as we call it, to inform you about how to use that library. So you have this dependency system. You define the dependencies in a manifest file, tessl.json, just like the ones we're very familiar with. Then you pull or you install, and you have these specs available, and they help the agent learn what it needs to know as it builds with those libraries.

We've pre-populated this registry with over 10,000 of what we call usage specs. These are spec packs that are built to help you consume open-source libraries. This is one of the common problems with agents today. The latest React they're actually pretty good at. But if it's a library that's a little less popular, if you're using a previous major version, if you're using something that's a bit more niche or a bit more complex, if it's too new, too old, too long-tail, then oftentimes the agents get into trouble. And of course, agents being agents, they don't accept that they don't know, and so they will start making things up and hallucinating APIs, and oftentimes fail. Even if they don't fail, they take a lot of cycles to get there. And so we've pre-created these. We spent an embarrassing amount of money, not quite six figures but a respectable sum, and a lot of person-time in creating great specs to make them useful, and then we populated them into the spec registry. You can install Tessl and start creating that manifest and give the agents the information they need.

Sorry, I'll ramble on with one more bit here; I know I'm going on a bit of a monologue. The other thing the registry lets you do, and again all these things might be intuitive as you think about a registry, is create your own packs. You can publish them and consume them. So using Tessl's spec registry will instantly make your agents better at using open source, right away.

And maybe more strategically, over time you can choose other bits of information to pull down, other practices. You can publish your own, privately or publicly, and start treating knowledge as a core piece of information that is a part of your project.
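To make the "dependency system for knowledge" idea concrete, here is a small Python sketch under stated assumptions. The manifest layout, the `SpecPack` fields, the in-memory `REGISTRY`, and the functions `install` and `specs_for_task` are all invented for illustration; the talk does not describe the actual tessl.json schema or the registry's API, so none of this should be read as Tessl's real format. The sketch only shows the two behaviours described above: pinning versioned spec packs in a manifest, and using the "steering" at the top of a pack to decide when an agent should load it.

```python
import json
from dataclasses import dataclass

# Hypothetical manifest contents: pack names and versions are made up.
MANIFEST_TEXT = """
{
  "dependencies": {
    "spring-usage": "5.3.0",
    "react-usage": "18.2.0"
  }
}
"""


@dataclass(frozen=True)
class SpecPack:
    """A versioned bundle of specs plus steering keywords (illustrative)."""
    name: str
    version: str
    load_when: tuple  # keywords that hint when the agent should load the pack
    body: str         # the spec text handed to the agent


# Stand-in for a remote registry, keyed by (name, version).
REGISTRY = {
    ("spring-usage", "5.3.0"): SpecPack(
        "spring-usage", "5.3.0", ("spring", "controller"),
        "How to use this older Spring version..."),
    ("react-usage", "18.2.0"): SpecPack(
        "react-usage", "18.2.0", ("react", "hook", "component"),
        "How to use this React version..."),
}


def install(manifest_text: str, registry: dict) -> list:
    """Resolve every pinned dependency in the manifest to a spec pack."""
    manifest = json.loads(manifest_text)
    installed = []
    for name, version in manifest["dependencies"].items():
        key = (name, version)
        if key not in registry:
            raise KeyError(f"no spec pack {name}@{version} in registry")
        installed.append(registry[key])
    return installed


def specs_for_task(task: str, installed: list) -> list:
    """Pick the packs whose steering keywords appear in the task description."""
    words = set(task.lower().split())
    return [pack for pack in installed if words & set(pack.load_when)]
```

With these assumptions, `specs_for_task("Add a Spring controller for orders", install(MANIFEST_TEXT, REGISTRY))` selects only the Spring pack, so the agent gets version-matched guidance instead of browsing for docs on every run. Keyword matching is the simplest possible stand-in for steering; a real system would likely be more sophisticated.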

Interesting. So it sounds like you've bootstrapped your marketplace, or this repository you've got of specs. And so is the hope then that when Stripe releases the next version of their API, they, and everyone, will basically publish their SDK spec packs into Tessl? Is that the vision? So then you have this huge library of knowledge. You mentioned people can publish their own. So, a long way down the path, might I have a spec that describes a particular business process at, say, contoso.com that can be used by other developers at contoso.com, and show some reuse of an understanding of how a back-end COBOL system works, or something like that?

Yeah, absolutely. So I think the agents today have intelligence but they don't have knowledge. If you ask them to tackle a task, they have some probability of being able to do it, but oftentimes you don't want them to reinvent the wheel. You don't want them to create a new way to, whatever, log information every time. You want them to use your analytics system that is inside. You want them to run a security scan conformant to your policy. And then sometimes you think they can, but you don't think they should invest the inference cycles in figuring something out if you already have a good way to do it. Using an open-source library is a good example of that. Or in general: I don't want you to think too hard, I want you to just follow this process. And again, there's a lot of analogy to your best developers on the team. You don't want them reinventing everything every time, and you also don't want them wasting their time thinking about solved problems. You want them to just take an off-the-shelf solution. There's a balancing act in terms of what is too trivial.

You don't want to create a spec for how to reverse a string; that's trivial enough that you want to leave it to them. But at the end of the day, that's the beauty of a community: people can publish. In the npm world, you do have left-pad, you do have libraries like that, and I don't think they're right or wrong. Someone might choose to publish that spec and others might choose to consume that spec, and then we all build on one another; just in this case what we're building on is definitions. I'm getting a little bit repetitive here, but there are a lot of subtleties that we forget about when we think about specs. Like when you say Markdown: hey, create something that converts Markdown to HTML. Oh, that's cool. Which Markdown do you mean? Markdown is not really one thing. Wouldn't it be nice if there's a spec somewhere that defines which Markdown you mean? Okay, that's what we mean by the standard. Which HTML did you mean? Okay, let's use that one. So again, in some cases you're fine to delegate to the agent, but in many cases, especially when you think about software, when you think about repeatability, when you think about situations in which getting it right 19 times out of 20, or even 99 times out of 100, is not good enough, you want that knowledge to be part of your project.

Interesting. Okay, this is super, super fascinating. I can't help but reach for analogies here, and they're all leaky of course, but there's this idea that now I have a spec, it's Markdown, and I have some equivalent of importing these packages into the spec that I can then say use in this way or in that way. It's really, really fascinating. But I guess I'm curious now: what is the product of Tessl? Because I get the npm part, or PyPI, as our analogy. A bit like TDD isn't really a product, is it? It's more like a framework or a sort of guidance. So you've invented this Markdown pattern of usage that can be used. What is the business for Tessl? How would this become a business for you?

So let me describe the two products for a sec first. The first one, which we just discussed, is the Tessl spec registry. Just to note, it's available in open beta. It's free to use. Go check it out. That one is indeed a registry, and there's actually a lot to think about in the life cycle of these packages and all that, so there's a lot to evolve over there. It's not just the registry and the npm-style consumption. There are elements of GitHub and such in there that we'll start to build, mostly driven by user feedback. GitHub might be overkill.

The second thing we released, which is in closed beta (you can join the wait list), is the Tessl framework. We call it a framework because it's not an agent, but it plugs into agents to modify their behavior to use spec-driven development. So it's a combination of powerful tools: to create specs, to build those specs, like we talked about, and to edit files while updating the spec first. All of those are tools that are made available via MCP to the agent, plus a bunch of steering to tell the agent, hey, use those tools, which is configurable. By the way, those configurations get pulled from the registry as a versioned prompt, so we can update them over time.

But they combine: there's this combo now of the agent and the Tessl tools to start creating software in the spec-driven fashion. Our tools use our own cloud services that invoke LLMs with our prompts, with our agentic processes, and some of them increasingly with our fine-tuned models, to make all of these different steps better. There's a bit of a dynamic between what it is that we want the agent to do versus what it is that Tessl can do very, very well.

For instance, the agent is very good at locally assessing why a test has failed in this context and running a test locally, while Tessl's core competency is about creating tests that are aligned to the spec. And so one aspect of our business is that we provide these cloud services. At the moment it's entirely free, and over time there'll be a generous freemium tier around invoking these cloud services and giving you some of that wisdom that is excellent for spec-driven development systems.

And I want to point out why it's a framework and not an agent. Why didn't we build our own agent? If you're in a team, you can have one person use Claude and another one use Cursor and another one use Codex, and you can be entirely happy with that, because at the end of the day they're just producing code and you throw that in. Now, even today that's a bit of a fallacy, because they need specs, they need some guidance in each one of their systems, and you have to start duplicating it, and that's a pain. Fortunately the spec registry is there to the rescue. But when you're talking about spec-driven development, you want it to be used across the team. You can recover if someone on the team doesn't use it, but you really want to drive it. We think over time there will be freedom of choice and preferences, and it's entirely legit that some developers on the team prefer one tool over another, that for certain cases you want to use Devin or an async agent and in some cases you want something that's inline. And we need them all to operate within the context of the framework. We can't have them each invent their own spec framework or spec formats. We can't have them create their own flows about what is okay and what is not okay. I don't think I know of other frameworks for agents, or definitely not SDD, spec-driven development, frameworks, but that's how we think about the framework we provide.

And I guess the last thing I would say is there will be a Tessl cloud service that helps you run these things in a collaborative fashion and all of that. That would also be an aspect of our business, and that will come after.

Makes sense. Actually makes sense, especially framed like that. Just out of curiosity, what does agent mean to you today? Is it like Claude Code is an agent? I think you mentioned Devin. What are some of the other agents you see people use? Is CrewAI an example?

CrewAI is actually an agentic framework, as in a way to orchestrate agents, so that's a slightly different type of framework. We integrate via MCP, and so in theory you can say any agentic behavior that supports MCP. There are about eight agents that we identify amongst our early users that are notable. Clearly at the top are the ones that you'd expect: Claude Code, Codex, Cursor, Gemini CLI. Devin is a little bit different just because it works out of band, and right now our framework focuses more on the inline, more interactive usage. So we haven't really invested in integrating the framework with Devin.

The registry would work with Devin just fine. It's basically information that's available to it. And Copilot, by the way, we also support. So the idea is to be able to expand it, but just like with any stack, you're a bit more invested per your user base.

Awesome. This is great. Guy, I learned a lot today. Actually, I wasn't sure; I'd done a bit of research on Tessl, but I've really learned a lot today about how you think about this stuff. So thank you for taking the time. People can sign up for the wait list right now. We'll add links underneath, of course, and they can actually go and play with the registry already, it sounds like. Go and explore that and take a look at it. Anything I didn't ask that you should mention about Tessl or spec-driven development?

No, I think we touched on the main things. I would say spec-driven development, in my view, and I think a lot of people see this today, is the future, and it's still a question about the right way to make this the present. And so we are very committed to building in the open. That's why we launched these betas. We have a big Discord community that really comments on the practice as a whole. It's called AI Native Dev. As you mentioned, I host the AI Native Dev podcast, but really it's part of a much bigger content portal. Check out ainativedev.io. A lot of this comes from a conviction that this is a new dev paradigm and it requires its own dev movement, right? It requires a lot of that formation, and it needs to have the community to be formed into the right shape. So I'd love to see people on our Discord, consuming the content on AI Native Dev. A lot of that is not Tessl content. In fact, almost all of that is not Tessl content. It's just about the latest and greatest in AI and development. We also have a conference coming up in mid-November, in person and virtual, in Brooklyn, called AI Native DevCon, as you might expect. So yeah, we perceive a new development practice, a new development paradigm, to be a community activity, and we'd love to have you join and help shape it with us.

Awesome. Cool. Well, hopefully people will take you up on that. Sounds like an obvious invite. I'm going to add one last question that maybe we cut, because it's a more personal question. I've seen this view of you a few times now over the years, this background, but there are some new things. We've got the Tessl HQ. I think there used to be a Snyk dog and a sneaker in view, but now there's an "out of time" license plate from Back to the Future. Tell us the story. What's that about?

So, first of all, it's a bit misleading. I'm actually in our office, in the Tessl offices. And I just set it up to be reminiscent of what I have in my home office.

It does look a little different, but yeah, there are some similarities.

Yeah. This was actually from a family vacation in LA. We went to Universal and thought about what to bring the team, and I saw this license plate from Back to the Future, and it felt so correct. It's both a good memento from a vacation, you know, back to the office, but also a reminder that startups as a whole are always crazy, but the AI space is moving so, so fast. And so "out of time" is a good reminder. You're always out of time. Which the Doc represents pretty well.

Awesome. Well, glad I asked. Guy, thanks for joining us again. Best of luck with Tessl, and I'm sure we'll speak again soon. Thank you.
