How Windsurf writes 90% of your code with an Agentic IDE - Kevin Hou, head of product eng
By AI Engineer
Summary
Topics Covered
- Agents Share Unified Timeline
- Meta-Learning Infers Codebase Style
- Scale by Scaling Intelligence
Full Transcript
[Music] all right how we doing New
York so my name is Kevin this is our first ever Windsurf presentation so you could say it's the first time we're kind of spilling the beans on uh what the idea is all about so thank you all for
coming I'm going to be talking about Windsurf the first AI agent powered editor so my name is Kevin Hou I lead our product engineering team we're a
team based out of uh San Francisco and thank you so much to swyx and Ben and the whole AI Engineer Summit team for inviting us here and letting us speak to you all um it's been
a pleasure talking to people in the audience uh at the booth and just generally talking about AI uh so let's dive into it Windsurf is an agentic editor and we're going to talk a little
bit about some of the principles that we use when we're building a product like this so we believe that agents are the future of software development and you all are here so you kind of understand
the power of what agents can do both for software engineering and otherwise um but to start I'm going to take you on a trip down memory lane right let's go back to 2022 Copilot was the state-of-the-art it just came out of
beta people were experiencing the ghost text they were seeing their completions and it was one of the first times that people really got to see the magic of what AI could do for developers it was making them more productive and we
at Codeium um decided we were going to be one of the first companies to also launch an autocomplete product so we garnered a couple million users on our VS Code JetBrains Vim Emacs extensions
um raise your hand if you were one of those Codeium users nice nice um but we always knew that intelligence was going to get better right back then we were
doing short completions maybe finishing your functions but we knew that there were going to be better models larger models better training paradigms completely new you know RL new tool use
all this stuff and so we knew that we wanted to build the best experience for devs possible so even back then we started looking at agents we started thinking about what could the future of software development be if models just got
bigger and so we built the best experience that we could at the time and that was a chat autocomplete product but we always knew that copy pasting from ChatGPT was going to be a thing of the past we also knew that people are going
to probably tab less we're going to have LLMs that would be able to generate more and more and who knows you know we always think all right agents are the best now but as a company we're always thinking about the future we're
technology optimists so in the future who knows we might not even be writing code inside of an IDE we'll just be there building the best product for
devs and so this year 2025 is finally the year where I feel like we are all recognizing the power of agents inside of software development agents are here to stay and Windsurf I'm proud to say is pushing the envelope of that
technology and we're going to talk about some of those features um and we're going to keep pushing that agentic future um because we believe that you know agents are going to move software
engineering in a direction that no other LLM has done in the past so this slide I guess is titled vibe coding with Windsurf or also just coding in Windsurf um so I'm going to
give you a quick demo this is the Windsurf product here you have a bar this is our agent and you can see that we're going to be building a Python web scraper so what this is going to do it's going to
build a Python web crawler give us some stats about the website um and you can see it's actually installing dependencies from pip um it's doing so inside of the terminal that you use so
you can interact with it um it's suggesting edits setting up your virtual environment and we give the user a very helpful accept and reject so that you can go through and have confidence that the code that it's generating works for
you and your codebase um of course there's a lot more features under the hood um some of the things that our users like to do they like to look up documentation we have web search enabled by default um it always looks at your codebase so you can grep through your
codebase uh we can generate commit messages you can drag and drop images right the possibilities are truly endless and I'm going to be talking about how these features are powered by a
handful of principles a handful of through lines that we as an engineering team hold true as we're building and as a team we always go back to the same mission which is to keep you in the
flow and unlock your limitless potential right we want to handle the grunt work for you we want to handle looking at your debug stack traces we want to handle modifying your original source code we want to pull the
correct version of documentation so that you never have to worry about pulling in the correct context these are problems that we are trying to solve um and we want you to spend time on things that you are good at right the things that make us all excited which is shipping
products building great features um and generally just shipping code and so with that goal in mind how do we tell what to work on um it's a
game of input and output so we want to allow users to give the least amount of explicit input possible to produce the most correct and production ready code
right we want you to contribute less and we want our agent to contribute more and we do this by reducing the amount of human in the loop required by doing things like background research we are
always trying to predict your next step and we'll make decisions on your behalf so that you can move faster and this might all seem like a fantasy but Windsurf launched three
months ago on November 13th um that date is forever branded in my memory um and these are the results that we're already seeing so in three months we've generated 4.5 billion lines of code it
is an absurd number and since the time I started this presentation users have actually probably sent thousands of messages to Cascade asking it to refactor code to write new features to build new pages on their
website um and also a fun statistic since we're all engineers here uh we've had 16 nights in the last you know 90 days where we've been woken up in the middle of the night by PagerDuty on call
because we've had some reliability issues um due to us exceeding our capacity right we've had immense success getting people onto the platform um and we've been very
fortunate to have the issue of being some of Anthropic and OpenAI's largest uh consumers and so with this mission and metric in mind let's walk through some of the principles that
we use when we're building this agentic editor and for those of you that have used Windsurf um you might learn about some of the new ways that you could use the product and also for my own curiosity how many of you have heard of Windsurf just so I know who we're
talking about oh let's go that's sick um how many of you use Windsurf okay everyone who put their
hand down door's over there um all right let's get into it so the first principle trajectories what is a trajectory we use trajectories to read your mind so unlike
other editors like Cursor the elephant in the room our agent is deeply integrated into the editor and we'll talk about what exactly that means but on one half you can imagine an agent has
to understand what you're doing and then on the other half it has to understand and be able to execute things on your behalf and this has led to features like one of my favorites quote continue my
work so we are building up an understanding of the user as you're writing code as you're executing terminal commands and then you can actually just go into the agent sidebar and just say continue my work and it'll actually continue executing
that and it might even give you a full PR or a full commit right we also have things like terminal execution mode right it can automatically use the LLM to decide what is safe and not safe so
that if you're running something like git it'll just work or it'll probably prompt you if there's an rm -rf somewhere you probably don't want to run that automatically and the LLM would be like oh we should probably flag to the user to confirm this these are just some
of the ways that we try and let the human be in the loop but as minimally as possible and then finally we also have you know a stellar UX a stellar design team that's been working on how to integrate these sort of cutting-edge
features into a product in a way that allows the user to feel like they're in control to be able to accept and reject changes into their code so they can have confidence in the code that they're pushing to production so here is how a trajectory
works um we have this notion of a unified timeline so an agent is working in the background behind the scenes to understand what the user is implicitly
doing so this includes things like viewing files navigating around your codebase um let's say you edit a file uh and then the agent will edit a file this all kind of goes into a shared timeline
of actions you can imagine this includes things like searching grepping um making edits making commits right the agent has this sort of holistic understanding of what you're doing and this entire experience is
unified by this shared timeline so you can contribute to it it can contribute to it and in this way you never run into the problem where you're talking to the agent and it undoes the change that you
just did or has some you know outdated notion of what the file state is so this is a first class principle of ours and when we decided we were going to build an editor we were going to build it around this notion of an agent in a
shared timeline and so here's an example of this feature in action here we're adding a new function and you're seeing the autocomplete and all the kind of like bells and whistles of that feature and
on the right side we just asked continue my work this is a new function we probably want our form handler to use this new function and you can see based on the context that we gave it by making edits it's guessing okay we probably
want to make this file change to this file maybe some others and then at the end it's actually just saying okay let's just run npm run dev and it can run terminal commands on your behalf in the background in your kind of like Command
J terminal popup um and in this way we're keeping you in the flow right something that would have taken minutes is now taking seconds and here's another example uh the terminal is now deeply integrated
into the agentic timeline so here if you're typing commands you know the classic example is like I npm install a new package or I pip install a new package the agent should know oh you just installed this package why don't we go ahead and implement it into your
project and based on context that it's able to pick up around the codebase it can continue that line of work so we very strongly believe in a future of no copy paste right you should never have a
situation where you're in a terminal or you're in a document or even on a website and you're copy pasting text into an agent that's just not the way the world works and in the same way we strongly believe the future is
not going to be @terminal here's another example of commands running inside of your terminal I've been talking about this for a little bit and this concept of a trajectory allows us to automatically
execute things inside of a sandbox that is as similar to the way you run commands as possible so instead of running some shell script in the background what we do is we put this right inside of the place that you would actually write terminal commands so if
you pip install something or it pip installs something it's going to the same environment you'll never have this instance of kind of weirdness and this is all part of our effort to bring these two sides the agentic side and the
human side as close together as possible and you do this through building a unified product we believe that developers are here to stay and if you want to work
seamlessly with a developer that means the agent has to understand what they are thinking Windsurf has to be ubiquitous and the agent will be reading more and more of your mind doing things
that you might not even know it's doing in the future we'll be looking not just one to five steps in the future but 10 20 30 steps into the future it'll be writing unit tests before you've even finished defining the function it'll be
performing codebase wide refactors on multiple files based on you just simply editing a variable name all this is part of this unified trajectory
concept now the second principle is meta learning so even if Windsurf understands what you're doing in the moment there is still an inferred understanding of your codebase and your
preferences and your organizational guidelines that let's just say senior engineers at your company have built up a notion of over time we call this concept meta learning so Windsurf we've
built from the ground up to adapt and remember these things about you and your company so if you think about a frontier LLM right the best LLMs that exist in the world they're very very smart
engineers definitely more capable than I probably more capable than most of you they can just write an enormous amount of code and do so correctly and it probably runs and compiles pretty well but what they do not have is the
exposure that you've had the education that you've had and the ability to kind of remember and know how you personally or your company writes code and so what does this mean for our
product we've implemented a concept called autogenerated memories so over time we build up a memory bank of what you are doing so you can say remember that I use Tailwind version 4 or remember that I
use React 19 instead of 18 and these things will be remembered you say them once and they'll be remembered forever we also allow people to implement things like custom MCP servers so you can plug in your favorite tools we can adapt to
your workflow we will also allow you to whitelist and blacklist commands going back to that same concept we want to keep you in the flow as much as possible but we can tell the agent
hey never run an rm command without my approval and so in this way it learns about your preferences over time and if you think about what makes a developer effective it's because they
remember things that you tell them and Windsurf must also model this behavior if we hope that AI should write and maintain projects for us so in the short term this means you don't need to prompt the
agent again and again to do the same thing over and over um but in the long term the AI should just feel like a seamless extension of yourself it's this idea of explicit versus inferred
context and we always have the saying at the company ideas are cheap so here's an example of autogenerated memories in action here we're not even explicitly telling it remember this thing we're just giving it an architecture overview
we're asking what does this project do and it's remembering based on a couple tool uses it's looking at a couple different files looking at the routes and now it's committed to memory hey this is the project that this person is working on here are the endpoints that
are available and we can reference that in the next message that we send right so in the next future conversation we can now one-shot things because we have a notion of a memory bank in the same
way documentation is auto learned we know what packages you're using because of your package.json because you've explicitly told us and we're able to look on the web for documentation that matches those
versions and we do so all implicitly and so the dream of meta learning is that you can have an entirely inferred sense of context based
on a code base or based on the usage of the product and autogenerated memories are a step in that direction um we strongly believe that having a rules file you know we do allow users to
add a rules file we strongly believe that a rules file is a crutch you know by the end of 2025 99% of the things that you're going to put in a rules file will be interpreted or inferred based on your code base or your usage so our
dream is that every single Windsurf instance every single user using Windsurf regardless of the company or the type of person the skill of the developer will be personalized to that
user and you'll only have to tell it one thing and finally my favorite principle which is scale with intelligence so what does this mean now that Windsurf understands what you're doing in the
moment right the first principle and can improve over time the second principle how do we actually build an agent that will scale with the rates at which LLMs are scaling and while we're always trying to give you the best tool today
we recognize that new models are coming out every other week right every day there's some new article about some new pattern and it's really really hard to keep up but we always think at Codeium how do we stay on top of this how do we
build the best product for not just today but three months six months 12 months out three years from now so in 2021 when ChatGPT came out you probably like me we all had our imaginations
running wild we're like okay we're going to solve you know AGI post economy whatever but obviously there's a lot of things that need to happen between then and that future and so models at that
time were quite frankly a little bit too dumb to be able to accomplish everything that we wanted them to do so we built up a lot of infrastructure and you and I have all probably done this we build out embedding indices we build
retrieval heuristics we have output validating systems to make sure that the code that it's generating is good right these are all things that were able to help at the margin but this is all predicated on the assumption that we
were operating with a fixed notion of intelligence 2021 2022 these models we were operating we were building all this infrastructure to compensate for areas and edge cases that models could not
handle and what's very different about the way we're approaching Windsurf is that we want our product to scale with the models so if the models get better our product gets better and I'll give
you one such example um it kind of surprised me I was you know when I landed in New York I tweeted that we deleted chat in Cascade I was like a very I don't know I was just I had
thoughts and weirdly a lot of you picked this up and this is an example of something that we feel very strongly about one example of this principle in practice is that we deleted chat so what
does this mean we only have an agent and it's called Cascade inside of Windsurf chat is a legacy paradigm and we completely replaced it and as you can see here users are enjoying it or in
fact they might not even know the difference but they're just enjoying the higher quality an example of this is @-mentions we built @-mentions and probably you all have used @-mentions because context was not very good a year
two years ago today Windsurf can dynamically infer the relationships between bits of code and documents 90% of the time you do not need to @-mention something all you need to do is
let the retrieval system in the agent kind of plan out what it needs to do and then reconstruct the context automatically for you so @file and @web these are very helpful patterns when
you're working at kind of the margin but these are eventually eking out basis points in the long term we believe that LLMs are going to improve and they already have improved to the point where you don't need to explicitly specify an
@-mention the LLM should be intelligent enough to pick it up and so in this example previously I was implementing Supabase inside of a Next.js app previously you'd be @web-ing you'd be @docs
@codebase @this @that no just add Supabase right and it's able to infer and plan out let's search the web let's behave like a human would and to get into this there's also web
search built into um Windsurf and what's very special about this is that it reads the web the way a human would read the web so instead of these hardcoded rules and you know we probably could have created an embedding index but
we would probably get very low quality results and so instead we said the LLMs are very very good let's let the model decide what it wants to do let's have it decide which search results to read what parts of the page to read and then
finally give us an answer and so we believe that as models continue to get better we're going to be continuing to do unsupervised work we're going to generate full PRs we're going to read complex documentation the
possibilities are truly endless and so here are some of the principles that we just talked about um where are we going with this there's a lot of ways we can take this right the engine underneath Windsurf is
really really the secret sauce and we believe that 2025 is going to be a whole new world no rules files generating PRs generating commits it's going to be crazy and we're
already seeing this 90% of our users or sorry all of our users 90% of the code that they're writing is generated with Cascade that's an astonishing number autocomplete was more in like the 20 30%
this is insane right people are using agents today to accomplish so much more than they could have in the past and we're all software engineers I want to make sure that every single person in this room is armed with the best tools
and those best tools are agents and like every good thing in the city I expect tips 25% of your ticket price which I heard was quite a lot um here's
the actual QR that you're probably curious about um this is Windsurf's download link we offer a free tier um so go ahead and scan that start using the magic
today and then finally we have some killer swag at our booth um you can also connect with me on Twitter I try and stay active with the community but thank you so much for watching I hope that you all learned something about how we're
building at Windsurf and enjoy the rest of the conference [Music]
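The unified timeline and the command allow and deny lists described in the talk can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Windsurf's implementation: the names `Event`, `Timeline`, `latest_edit`, `context_for_agent`, and `decide` are all hypothetical, and the fallback for unknown commands is a simple stand-in for the LLM safety judgment the talk mentions.

```python
import shlex
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One action on the shared timeline, by either the user or the agent."""
    actor: str   # "user" or "agent"
    kind: str    # e.g. "view", "edit", "terminal", "search"
    target: str  # file path, query, or command line
    detail: str = ""


@dataclass
class Timeline:
    """Single ordered log that both sides append to (hypothetical structure)."""
    events: List[Event] = field(default_factory=list)

    def record(self, actor: str, kind: str, target: str, detail: str = "") -> Event:
        event = Event(actor, kind, target, detail)
        self.events.append(event)
        return event

    def latest_edit(self, path: str) -> Optional[Event]:
        """Most recent edit to a file, so neither side acts on stale state."""
        for event in reversed(self.events):
            if event.kind == "edit" and event.target == path:
                return event
        return None

    def context_for_agent(self, limit: int = 20) -> List[str]:
        """Render recent events as lines for a prompt such as 'continue my work'."""
        return [f"{e.actor} {e.kind} {e.target} {e.detail}".strip()
                for e in self.events[-limit:]]


def decide(command: str, allow=("git",), deny=("rm",)) -> str:
    """Return "auto" to run without asking, or "confirm" to ask the user first."""
    tokens = shlex.split(command)
    program = tokens[0] if tokens else ""
    if program in deny:
        return "confirm"
    if program in allow:
        return "auto"
    # Assumption: unknown commands fall back to asking the user.
    # The talk describes an LLM making this call instead.
    return "confirm"


timeline = Timeline()
timeline.record("user", "edit", "utils.py", "add validate_email()")
timeline.record("user", "terminal", "pip install requests")
timeline.record("agent", "edit", "form_handler.py", "call validate_email()")
print(timeline.context_for_agent())
print(decide("git status"), decide("rm -rf build"))
```

Because user and agent actions land in the same ordered log, the agent can check `latest_edit` before touching a file rather than working from an outdated view, which is the failure mode the talk says the shared timeline avoids.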