
How Windsurf writes 90% of your code with an Agentic IDE - Kevin Hou, head of product eng

By AI Engineer

Summary

Topics Covered

  • Agents Share Unified Timeline
  • Meta-Learning Infers Codebase Style
  • Scale by Scaling Intelligence

Full Transcript

[Music] All right, how we doing, New York? My name is Kevin. This is our first ever Windsurf presentation, so you could say it's the first time we're spilling the beans on what the idea is all about. Thank you all for coming. I'm going to be talking about Windsurf, the first AI agent-powered editor. My name is Kevin Hou, and I lead our product engineering team. We're a team based out of San Francisco, and thank you so much to swyx and Ben and the whole AI Engineering Summit team for inviting us here and letting us speak to you all. It's been a pleasure talking to people in the audience, at the booth, and just generally talking about AI.

So let's dive into it. Windsurf is an agentic editor, and we're going to talk a little bit about some of the principles that we use when we're building a product like this. We believe that agents are the future of software development, and you all are here, so you understand the power of what agents can do, both for software engineering and otherwise. But to start, I'm going to take you on a trip down memory lane. Let's go back to 2022. Copilot was the state of the art. It had just come out of beta, people were experiencing the ghost text, they were seeing their completions, and it was one of the first times that people really got to see the magic of what AI could do for developers: it was making them more productive. And we at Codeium decided we were going to be one of the first companies to also launch an autocomplete product, so we garnered a couple million users on our VS Code, JetBrains, Vim, and Emacs extensions.

Raise your hand if you were one of those Codeium users. Nice, nice. But we always knew that intelligence was going to get better. Back then we were doing short completions, maybe finishing your functions, but we knew that there were going to be better models, larger models, better training paradigms, completely new RL, new tool use, all this stuff. And so we knew that we wanted to build the best experience for devs possible. Even back then we started looking at agents; we started thinking about what the future of software development could be if models just got bigger. So we built the best experience that we could at the time, and that was a chat and autocomplete product. But we always knew that copy-pasting from ChatGPT was going to be a thing of the past. We also knew that people are probably going to tab less: we're going to have LLMs that will be able to generate more and more. And who knows? We think, all right, agents are the best now, but as a company we're always thinking about the future. We're technology optimists, so in the future, who knows, we might not even be writing code inside of an IDE. We'll just be there building the best product for devs.

And so this year, 2025, is finally the year where I feel like we are all recognizing the power of agents inside of software development. Agents are here to stay, and Windsurf, I'm proud to say, is pushing the envelope of that technology. We're going to talk about some of those features, and we're going to keep pushing that agentic future, because we believe that agents are going to move software

engineering in a direction that no other LLM has in the past.

So this slide, I guess, is titled "Vibe coding with Windsurf," or also just coding in Windsurf. I'm going to give you a quick demo. This is the Windsurf product. Here you have a bar; this is our agent, and you can see that we're going to be building a Python web scraper. What this is going to do is build a Python web crawler and give us some stats about the website. You can see it's actually installing dependencies from pip, and it's doing so inside of the terminal that you use, so you can interact with it. It's suggesting edits and setting up your virtual environment, and we give the user a very helpful accept and reject so that you can go through and have confidence that the code that it's generating works for you and your codebase.

Of course, there are a lot more features under the hood. Some of the things that our users like to do: they like to look up documentation, and we have web search enabled by default. It always looks at your codebase, so you can grep through your codebase. We can generate commit messages. You can drag and drop images. The possibilities are truly endless.

These features are powered by a handful of principles, a handful of through lines that we as an engineering team hold true as we're building. As a team we always go back to the same mission, which is to keep you in the flow and unlock your limitless potential. We want to handle the grunt work for you. We want to handle looking at your debug stack traces. We want to handle modifying your original source code. We want to pull the correct version of documentation so that you never have to worry about pulling in the correct context. These are problems that we are trying to solve, and we want you to spend time on the things that you are good at, the things that make us all excited: shipping products, building great features, and generally just shipping code.

So with that goal in mind, how do we tell what to work on? It's a game of input and output. We want to allow users to give the least amount of explicit input possible to produce the most correct and production-ready code.

We want you to contribute less, and we want our agent to contribute more. We do this by reducing the amount of human-in-the-loop required, by doing things like background research. We are always trying to predict your next step, and we'll make decisions on your behalf so that you can move faster.

This might all seem like a fantasy, but Windsurf launched three months ago, on November 13th. That date is forever branded in my memory. And these are the results that we're already seeing: in three months we've generated 4.5 billion lines of code. It is an absurd number. Since the time I started this presentation, users have probably sent thousands of messages to Cascade asking it to refactor code, to write new features, to build new pages on their website. And also a fun statistic, since we're all engineers here: we've had 16 nights in the last 90 days where we've been woken up in the middle of the night by PagerDuty on call, because we've had some reliability issues due to us exceeding our capacity. We've had immense success getting people onto the platform, and we've been very fortunate to have the issue of being among Anthropic's and OpenAI's largest consumers.

So with this mission and metric in mind, let's walk through some of the principles that we use when we're building this agentic editor. For those of you that have used Windsurf, you might learn about some new ways that you could use the product. And also, for my own curiosity, how many of you have heard of Windsurf, just so I know who we're talking to? Oh, let's go, that's sick. How many of you use Windsurf? Okay, everyone who put their

hand down, do over there. All right, let's get into it.

The first principle: trajectories. What is a trajectory? We use trajectories to read your mind. Unlike other editors like Cursor, the elephant in the room, our agent is deeply integrated into the editor, and we'll talk about what exactly that means. On one half, you can imagine an agent has to understand what you're doing, and on the other half it has to be able to execute things on your behalf. This has led to features like one of my favorites, "continue my work." We are building up an understanding of the user as you're writing code and as you're executing terminal commands, and then you can just go into the agent sidebar and say "continue my work," and it'll actually continue executing that, and it might even give you a full PR or a full commit.

We also have things like terminal execution mode. It can automatically use the LLM to decide what is safe and not safe, so that if you're running something like git, it'll just work, but it'll probably prompt you if there's an rm -rf somewhere. You probably don't want to run that automatically, and the LLM would be like, "Oh, we should probably flag this for the user to confirm." These are just some of the ways that we try to keep the human in the loop, but as minimally as possible.

And then finally, we also have a stellar UX, a stellar design team that's been working on how to integrate these cutting-edge features into a product in a way that allows the user to feel like they're in control, to be able to accept and reject changes to their code, so they can have confidence in the code that they're pushing to production. So here is how a trajectory

works. We have this notion of a unified timeline. An agent is working in the background, behind the scenes, to understand what the user is implicitly doing. This includes things like viewing files and navigating around your codebase. Let's say you edit a file, and then the agent edits a file: this all goes into a shared timeline of actions. You can imagine this includes things like searching, grepping, making edits, and making commits. The agent has this holistic understanding of what you're doing, and this entire experience is unified by this shared timeline. You can contribute to it, and it can contribute to it. In this way you never run into the problem where you're talking to the agent and it undoes the change that you just made, or has some outdated notion of what the file state is. So this is a first-class principle of ours, and when we decided we were going to build an editor, we were going to build it around this notion of an agent in a

shared timeline.

Here's an example of this feature in action. Here we're adding a new function, and you're seeing the autocomplete and all the bells and whistles of that feature. On the right side we just asked it, "continue my work." This is a new function, and we probably want our form handler to use this new function. You can see that, based on the context that we gave it by making edits, it's guessing, "Okay, we probably want to make this change to this file, maybe some others," and then at the end it's saying, "Okay, let's just run npm run dev." It can run terminal commands on your behalf in the background, in your Command-J terminal popup. In this way we're keeping you in the flow: something that would have taken minutes is now taking seconds.

Here's another example. The terminal is now deeply integrated into the agentic timeline. If you're typing commands, the classic example is that I npm install a new package or I pip install a new package. The agent should know, "Oh, you just installed this package; why don't we go ahead and implement it in your project?" Based on context that it's able to pick up around the codebase, it can continue that line of work. So we very strongly believe in a future of no copy-paste. You should never have a situation where you're in a terminal, or in a document, or even on a website, and you're copy-pasting text into an agent. That's just not the way the world works. In the same way, we strongly believe the future is

not going to be @terminal.

Here's another example of commands running inside of your terminal. I've been talking about this for a little bit, and this concept of a trajectory allows us to automatically execute things inside of a sandbox that is as similar as possible to the way you run commands. So instead of running some shell script in the background, we put this right inside the place where you would actually write terminal commands. If you pip install something or it pip installs something, it goes to the same environment, and you'll never have any weirdness. This is all part of our effort to bring these two sides, the agentic side and the human side, as close together as possible, and you do this by building a unified product.

We believe that developers are here to stay, and if you want to work seamlessly with a developer, the agent has to understand what they are thinking. Windsurf has to be ubiquitous, and the agent will be reading more and more of your mind, doing things that you might not even know it's doing. In the future we'll be looking not just one to five steps ahead, but 10, 20, 30 steps into the future. It'll be writing unit tests before you've even finished defining the function. It'll be performing codebase-wide refactors across multiple files based on you simply editing a variable name. All this is part of this unified trajectory

concept.

Now, the second principle is meta-learning. Even if Windsurf understands what you're doing in the moment, there is still an inferred understanding of your codebase, your preferences, and your organizational guidelines that, let's just say, senior engineers at your company have built up a notion of over time. We call this concept meta-learning. We've built Windsurf from the ground up to adapt and remember these things about you and your company. If you think about a frontier LLM, the best LLMs that exist in the world, they're very, very smart engineers, definitely more capable than I, probably more capable than most of you. They can write an enormous amount of code and do so correctly, and it probably runs and compiles pretty well. But what they do not have is the exposure that you've had, the education that you've had, and the ability to remember and know how you personally, or your company, write code. So what does this mean for our

product? We've implemented a concept called autogenerated memories. Over time we build up a memory bank of what you are doing. You can say, "Remember that I use Tailwind version 4," or "Remember that I use React 19 instead of 18," and these things will be remembered. You say them once, and they'll be remembered forever. We also allow people to implement things like custom MCP servers, so you can plug in your favorite tools and we can adapt to your workflow. We also allow you to whitelist and blacklist commands. Going back to that same concept, we want to keep you in the flow as much as possible, but we can tell the agent, "Hey, never run an rm command without my approval." In this way it learns about your preferences over time.

If you think about what makes a developer effective, it's that they remember things that you tell them, and Windsurf must also model this behavior if we hope that AI will write and maintain projects for us. In the short term, this means you don't need to prompt the agent again and again to do the same thing. But in the long term, the AI should just feel like a seamless extension of yourself. It's this idea of explicit versus inferred

context, and we always have the saying at the company: ideas are cheap.

Here's an example of autogenerated memories in action. Here we're not even explicitly telling it, "Remember this thing." We're just giving it an architecture overview: we're asking, "What does this project do?" It's remembering based on a couple of tool uses. It's looking at a couple of different files, looking at the routes, and now it has committed to memory, "Hey, this is the project that this person is working on, and here are the endpoints that are available." We can reference that in the next message that we send, so in a future conversation we can now one-shot things because we have a notion of a memory bank.

In the same way, documentation is auto-learned. We know what packages you're using because of your package.json, because you've explicitly told us, and we're able to look on the web for documentation that matches those versions, and we do so all implicitly.

The dream of meta-learning is that you can have an entirely inferred sense of context based on a codebase or on the usage of the product, and autogenerated memories are a step in that direction. We do allow users to add a rules file, but we strongly believe that a rules file is a crutch. By the end of 2025, 99% of the things that you're going to put in a rules file will be interpreted or inferred based on your codebase or your usage. So our dream is that every single Windsurf instance, for every single user using Windsurf, regardless of the company, the type of person, or the skill of the developer, will be personalized to that user, and you'll only have to tell it things once.

And finally, my favorite principle, which is scale with intelligence. What does this mean? Now that Windsurf understands what you're doing in the

moment (the first principle) and can improve over time (the second principle), how do we actually build an agent that will scale with the rate at which LLMs are scaling? While we're always trying to give you the best tool today, we recognize that new models are coming out every other week. Every day there's some new article about some new pattern, and it's really, really hard to keep up. But at Codeium we always think: how do we stay on top of this? How do we build the best product for not just today, but three months, six months, 12 months out, three years from now?

In 2021, when ChatGPT came out, you, probably like me, had your imagination running wild. We were like, "Okay, we're going to solve AGI, post-economy, whatever." But obviously there are a lot of things that need to happen between then and that future. Models at that time were, quite frankly, a little bit too dumb to accomplish everything that we wanted them to do, so we built up a lot of infrastructure. You and I have all probably done this: we build out embedding indices, we build retrieval heuristics, we have output validation systems to make sure that the code being generated is good. These are all things that were able to help at the margin, but this was all predicated on the assumption that we were operating with a fixed notion of intelligence. In 2021 and 2022, we were building all this infrastructure to compensate for areas and edge cases that models could not handle. What's very different about the way we're approaching Windsurf is that we want our product to scale with the models: if the models get better, our product gets better. And I'll give

you one such example. It kind of surprised me: when I landed in New York, I tweeted that we deleted chat in Cascade. I don't know, I just had thoughts, and weirdly a lot of you picked this up. This is an example of something that we feel very strongly about. One example of this principle in practice is that we deleted chat. What does this mean? We only have an agent, and it's called Cascade, inside of Windsurf. Chat is a legacy paradigm, and we completely replaced it. As you can see here, users are enjoying it, or in fact they might not even know the difference, but they're just enjoying the higher quality.

Another example of this is @-mentions. We built @-mentions, and you have all probably used them, because context was not very good a year or two ago. Today Windsurf can dynamically infer the relationships between bits of code and documents. 90% of the time you do not need to @-mention something. All you need to do is let the retrieval system in the agent plan out what it needs to do and then reconstruct the context automatically for you. So @file and @web are very helpful patterns when you're working at the margin, but these are eking out basis points. In the long term, we believe that LLMs are going to improve, and they already have improved, to the point where you don't need to explicitly specify an

@-mention. The LLM should be intelligent enough to pick it up. In this example, I was implementing Supabase inside of a Next.js app. Previously you'd be @web-ing, you'd be @docs, @codebase, @this, @that. No: just "add Supabase," and it's able to infer and plan out: let's search the web, let's behave like a human would.

To get into this, there's also web search built into Windsurf, and what's very special about it is that it reads the web the way a human would read the web, instead of following hardcoded rules. We probably could have created an embedding index, but we would probably have gotten very low-quality results. So instead we said: the LLMs are very, very good, so let's let the model decide what it wants to do. Let's have it decide which search results to read and what parts of the page to read, and then finally give us an answer. We believe that as models continue to get better, we're going to be doing more and more unsupervised work. We're going to generate full PRs. We're going to read complex documentation. The possibilities are truly endless.

So here are some of the principles that we just talked about. Where are we going with this? There are a lot of ways we can take this. The engine underneath Windsurf is really the secret sauce, and we believe that 2025 is going to be a whole new world: no rules files, generating PRs, generating commits. It's going to be crazy. And we're already seeing this. Across all of our users, 90% of the code that they're writing is generated with Cascade. That's an astonishing number. Autocomplete was more like 20 to 30%.

This is insane. People are using agents today to accomplish so much more than they could have in the past. We're all software engineers, and I want to make sure that every single person in this room is armed with the best tools, and those best tools are agents.

And like every good thing in the city, I expect tips: 25% of your ticket price, which I heard was quite a lot. Here's the actual QR code that you're probably curious about. This is Windsurf's download link. We offer a free tier, so go ahead and scan that and start using the magic today. And then finally, we have some killer swag at our booth. You can also connect with me on Twitter; I try to stay active with the community. Thank you so much for watching. I hope that you all learned something about how we're building at Windsurf, and enjoy the rest of the conference. [Music]
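
Editor's sketch: the unified timeline described in the talk, where user and agent actions land in one ordered log, might look roughly like the following. This is a minimal illustration under stated assumptions, not Windsurf's implementation: the `Event` and `Timeline` names, their fields, and the `context_for_agent` helper are all hypothetical and do not come from the talk. The point it shows is that the agent plans from the same log the user writes to, so it picks up the newest edit instead of a stale copy.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One action in the shared timeline (hypothetical shape)."""
    actor: str        # "user" or "agent"
    kind: str         # e.g. "view", "edit", "terminal"
    target: str       # file path or shell command
    detail: str = ""  # e.g. the new file contents for an edit


@dataclass
class Timeline:
    """Single ordered log that both the user and the agent append to."""
    events: list = field(default_factory=list)

    def record(self, actor: str, kind: str, target: str, detail: str = "") -> None:
        self.events.append(Event(actor, kind, target, detail))

    def latest_edit(self, path: str) -> Optional[Event]:
        """Return the most recent edit to `path`, whoever made it."""
        for event in reversed(self.events):
            if event.kind == "edit" and event.target == path:
                return event
        return None

    def context_for_agent(self, limit: int = 20) -> list:
        """Render the most recent actions as plain lines for a prompt."""
        return [f"{e.actor} {e.kind} {e.target}" for e in self.events[-limit:]]


timeline = Timeline()
timeline.record("user", "terminal", "pip install requests")
timeline.record("agent", "edit", "scraper.py", "import requests")
timeline.record("user", "edit", "scraper.py", "import requests\nTIMEOUT = 5")

# The agent reads the shared log, so it sees the user's newer edit
# rather than its own earlier version of scraper.py.
current = timeline.latest_edit("scraper.py")
print(current.actor)
print(timeline.context_for_agent())
```

Keeping one append-only log, rather than separate user and agent histories, is the design choice that avoids the "agent undoes my change" problem mentioned in the talk: there is only one place to ask what the latest state is.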
