
How Intercom 2X'd engineering velocity with Claude Code | Brian Scanlan

By How I AI

Summary

Topics Covered

  • 2x Throughput Is Just the Starting Point
  • AI Agents Are Making Code Quality Better
  • Software Factories Over Artisan Workflows
  • Backlog Zero Is Now Realistic
  • Give Permission and Own the Accountability

Full Transcript

Suddenly you started realizing that you have to think bigger about things, or that your imagination is now the barrier, not the tool.

How is this not happening in your organization? Like literally the physical limits of my ability to type code are unlocked by AI.

Today we are seeing twice the throughput on our engineering team compared to nine months ago. Now it's like, why can't it be 10x?

This is a little bit more of what my instinct tells me is possible, which is if you go all in, if you prepare your team, if you prepare your codebase, I think your overall product quality is going to go up. I think your overall developer experience is going up. There's just so many good things that come out of using these tools and using them correctly.

Backlog zero is a realistic thing for teams to be able to go after. All the things that you ever wished you wanted to do, it's now just achievable.

I often advise a lot of CTOs and VPs of engineering, when figuring out how to get their engineering team AI-pilled: everything you hate about the codebase, go spend a month fixing it and see how fast we can speedrun that. That's going to feel really good.

I've been having the most fun of my career over the last three months.

Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today I am showing how Intercom 2x'd the number of PRs that their R&D department is shipping in just a few months. Brian Scanlan is a senior principal engineer at Intercom, and he is going to show us truly all of their secrets to getting a large product and engineering organization cooking on Claude Code.

Let's get to it. This episode is brought to you by Celigo. Every company today wants AI to improve how work gets done. The fastest way is building it directly into everyday business processes: automating employee onboarding, keeping customer data accurate, managing orders and inventory, or resolving finance and operations issues. When AI lives inside the flow of work, it can update records, trigger approvals, route work, and kick off the next step across systems. That's how teams operationalize AI and deliver measurable results. Celigo makes this possible. And now with Celigo Aura, it's never been easier. Celigo Aura gives you access to the entire platform through natural language, connecting your systems and turning intent into action. All of it under your control. Companies like Databricks, PayPal, and Olipop rely on Celigo to run critical business operations at scale. Ready to operationalize AI? Visit celigo.com/howiai. That's celigo.com/howiai.

Brian, welcome to How I AI. Why I am so thrilled that you agreed to join the podcast is I think Intercom has done it, which is you all have met the moment in sort of two ways. One, you clearly met the moment from a product perspective. You were one of the first companies that had, I don't want to say a legacy business, but had a going-concern business that saw AI coming and really transformed how your product worked for customers. And I'm a happy Fin customer; they did not tell me to say that. And then second, what we're going to talk about is the team met the moment in terms of really understanding AI was going to change how, in particular, product, engineering, and design orgs were going to work, and you just went full speed at changing how the team works. What drove sort of the urgency around meeting the moment, and how did that come to be? Was it a single person? Was it everybody? What was your experience?

I think in some ways it's been the easiest place to be driving the adoption of AI in engineering and product, because we've focused the company so much on adopting AI and being AI-first in how we think about the product, the future of customer support, and all that. And we also had very clear expectations. We've seen what's possible in the product space, and it's just very clear and obvious to us as, like, connoisseurs of AI that this is clearly going to be huge in engineering and product and building. And honestly, there's been a lot of impatience, like, why isn't this happening today? You know, if we go back a few years, Cursor is picking up a bit of business and the models are getting better, but it still wasn't transformative. It still wasn't like the whole business was changed and we're seeing vast amounts of extra productivity. We knew there was potential, but it still felt like we needed some sort of breakthrough moment, or something big had to happen, for us to get to the kind of huge velocity wins that I think now we're starting to achieve. That said, we still want more. We're proud of where we're at, but we're not content with what we've achieved so far.

I feel like every three months I have a breakthrough moment. And in fact, I feel like Opus 4.6, I don't know, something just really inflected in what was possible when that particular model came out. Now, I think the GPT-5.4 models are also exceptional. And so it was something about that one moment with models that really inflected my own personal use of AI in engineering. Did you all see the same sort of inflection around that model point?

Totally. I think you can go back to, it was like November, December last year, and suddenly you started realizing that you have to think bigger about things, or that your imagination is now the barrier, not the tool. You're spending less time massaging the tool to get it to the right place, and it's less about autocomplete and more about just literally giving it your ideas and seeing what happens. I think the Christmas break happened as well. I remember we had pretty much decided before Christmas, like, hey, we're going to go all in on Claude Code, because up to that point there was a bit of Cursor here and there, and Augment, and different tools. And the Christmas break really helped, in that I just saw everybody go wild on Twitter/X, people talking about how much they were getting done and all the things they were building. I just came back to work after the Christmas break going, "Okay, everything's changed." We knew that there was something here and that we were starting to see the signs of it, but now the whole world is convinced, or at least all of the influencers on Twitter and LinkedIn.

That would be me. Yeah, I'm actually kind of convinced that companies should increase their PTO and parental leave policies, because everybody I know right now in tech that is quote-unquote taking time off goes on their vacation, pops open Claude Code, and comes back like 10 times more skilled than they were before their time off. And so if anybody wants a little minor hack to AI literacy in your org, give people time off to hack and they will come back with more information than you expected.

Okay, I think we're going to skip to the punchline, which I love, which is we're going to see how AI has actually changed how you all ship at Intercom. So can you just show us a little bit of how this has changed inside the org? I think you all are measuring a lot of this.

Yeah. So I think we've been diligent as, you know, product owners inside of Intercom, in that we've been trying to get feedback from people and see how they're using the tools, really doing everything we would normally do with a regular product. And so we've spent a lot of time hooking up Claude Code with telemetry, both into things like Honeycomb, and data also going into Snowflake, where we have our data warehouse, and we also store session data in S3. And we mine this stuff for useful insights. One of the main things that we used to drive adoption of the tool was our CTO Darragh setting a goal of us 2x'ing, like doubling, the throughput of R&D. We use pull requests as a crude, simple measure, and you can argue back and forth about what's a good measure, what's a bad measure, and whether measuring anything is appropriate or whatever. But I think it's reasonable to just have the expectation that if you can get a lot more done and it's so fast and fun, then why isn't everyone just shipping more stuff? So it's a basic measure that the tools are being adopted and that they're being used well. And of course we don't tolerate lowering quality, and we're a high-trust environment, so we don't expect people to game these stats or whatever. But our metrics, and what I'm showing on the screen here, is a classic number-goes-up kind of thing. We started tracking how many PRs, and what percentage of them were generated by either Claude or Cursor or whatever. And since our major investment in Claude Code the platform, going all in on it, really pushing out enablement, giving people freedom to explore and start to build skills and everything, but also pushing them on the throughput increase we expect, we've seen a big increase in the throughput of pull requests through our system. And last year our CI system completely broke. It melted, you know, and it got like 10 times as expensive. We did the work, we fixed the bottlenecks, we improved the performance of our CI system, and that stopped being the bottleneck. Now code review is our bottleneck. But today we are seeing twice the throughput as we did nine months ago on our engineering team. We're very proud of that, and now it's like, why can't it be 10x?

So what I love about this chart, just for a moment: I spent the last two decades of my career in product and engineering, the last decade as a CPTO, and it's so funny. I want to go back to a couple things you said. One, you have to treat your org like a product. I always thought that my job was not just the product strategy and the capital-P Product that we were delivering to customers; it was to design our organization to, I would say, output innovation on demand. That was the job. Less romantically put, my job was to invest R&D for positive enterprise value. That was fundamentally my job as a CPTO. And so what I love about this is it's merged PRs per R&D head. I'm presuming that includes, does that include product managers and non-engineering R&D, or is that purely software engineers?

Yeah, this is all of R&D, and it's definitely the case that our designers and product managers and TPMs, every role in Intercom, are really actively using Claude Code and shipping code and all that. And also we've been hiring, so this number has not been static. The raw number of PRs is dramatically higher than just 2x what it was a good while ago. So this is everything from your newest hire to your product manager who's adding some copy or shipping small changes, whatever. That's all baked into this number.

The other thing I want to call out for folks: every board meeting I have been in for the last three years, well, actually, every board meeting I've ever been in, period, has asked how we can get more velocity out of R&D, and certainly in the last three years it's been how AI is inflecting our velocity. And it's so funny, I talk to so many people that are like, it doesn't really inflect velocity, we're not actually becoming that much more efficient. And I'm like, is that true? Because I look at a chart like this and I say, this is a little bit more of what my instinct tells me is possible, which is if you go all in, if you prepare your team, if you prepare your codebase, if you have, as you said, a high-trust culture. People are going to look at this and say, "Oh, they're shipping these smaller PRs," or "engineers are gaming the system." I just have not worked at a place that has such bad culture that that would actually come as an outcome of setting some sort of ambitious, fun target like this. And so I take this at face value, and I think: how is this not happening in your organization? Like, literally the physical limits of my ability to type code are unlocked by AI. You should get some inflection there. And so, you know, for VPs of engineering, CTOs, even people that are on these R&D teams, look at this and think, this is possible. It may be a crude measurement, but it's, I think, an appropriate one as a leading indicator of what's happening in your org around AI.

Yeah. And we support this with not just telling people to move faster. We're really looking from first principles at how to do the work. We believe that all technical work will become agent-first. And I'd like to set a deadline for that, that at the end of the month we're just going to go all in, and it's never going to be the case that the first thing that happens, say in response to an alarm, or in a planning meeting, isn't an agent in there doing the basic work. I think that's a realistic expectation, but it involves more than moving faster for the sake of it. We're moving faster by looking at the fundamentals of where we're spending our time and reimagining how that work could be done in an agentic world. And honestly, if the agents didn't get better, if the models didn't get better, if the harnesses didn't get better, we've got the building blocks just today to be able to keep going through how we do our technical work today, and by technical work I mean everything in delivery of product, and move it to be entirely agent-first, and allow us to move up to higher-level concerns, or just getting more stuff built, more stuff out there, or at higher quality. That's all within everyone's grasp today, but you have to be very open to change. And I guess what's been fortunate for Intercom over the last while is that we have been extremely open to change, both on the product side of things and in adapting the company to how I think companies need to work now with AI, and we're starting to see results.

Yeah. The other reflection I have upon looking at this chart is, we're recording this in the spring of 2026, and Anthropic just said that they crossed $30 billion in revenue, I think up from 19 a couple months ago. And I suspect their revenue chart looks a little bit like your merged-PRs-per-R&D chart. So how are you all thinking about the trade-off on cost here, right? Like, we're all consuming Claude tokens. Yes, efficiency or output is going up, throughput's going up, but is cost scaling proportionately? Are you all worried about, is that the problem right now? Are you even worried about it? How do you think about that?

Yeah, we're definitely worried, in that the bill looks exactly like this. You know, I spent a lot of my career worrying about AWS costs and worrying about our margins and stuff, and then suddenly you've got these costs showing up, and they're disproportionate to any growth that we've seen anywhere before. It's like hiring whole new offices of people. But at the moment our attitude has been, look, everyone just turn on Opus for everything, with the one-million-token context window. We just use the API plan, so it's all on demand, and we think there's enough alpha, or benefit, at this point in going as fast as possible and caring about the bill later, because of the later benefits we'll get. And maybe that's a position of where Intercom is; I don't think it's realistic or feasible for absolutely every single business to do it. And honestly, I do kind of respect when you have to actually think about your token use, and how that can force you to be more considerate. It sometimes even gets you better results. You know, you don't need Opus for everything; there are faster models out there. So we're just avoiding that optimization phase until we've gotten serious benefits from the investment in this platform. And I do think this investment, and we are treating it like an investment at this point, is worthwhile. But if this keeps going at this rate, yeah, we should all work for Anthropic.

I think the way they're hiring, we're all going to end up working for Anthropic. So, okay, and then one other thing, because I think folks are going to look at this, certainly engineers, and they're going to go, okay, you're shipping more PRs, but it's all slop, it's all garbage. I know you all are measuring quality on the other side of shipping all this stuff. So how have you seen this inflect measurements around quality, or customer value, or what you're trying to achieve at the end, not just lines of code?

Yeah, I have a standalone graph that I can share, which is kind of interesting. We've started to look at the time it takes from the first line of code written in a feature to the time it gets posted on our news channel, like our updates. And that has decreased consistently over the last few months. Now, we're not optimizing for this, but we're interested in it. And the other thing is the sheer volume of things we have shipped also appears to have rapidly increased in the last few months, and that should be a bit of a trailing metric. So we believe that this increase in volume has been borne out in real features, real products that our customers are using. And we've even been running some experiments, like how far can one person get on their own, building something that's plausibly a whole entire product area, a feature to be able to sell. So this is something we're taking seriously, and we also care a lot about quality. We've been working with a research group at Stanford. We've been giving them our data, mostly just looking for any kind of insights to make sure we're not blind. You know, I join absolutely every single incident, I'm an ambulance chaser, and I'm not seeing any increase in regular incidents or outages or customer-facing problems. We've had a few kind of weird problems, but not related to production. But the interesting thing from the Stanford data, when we checked back in on it last week, was that their measures of code quality reckoned that the code quality was improving. And, you know, the models are improving, the agents are improving. We're adding more and more guidance and skills and all these kinds of things, which I think do force people down a road that should result in higher-quality output. But it's great to see when tools can independently pull that out. Now, the devil's in the details; you've got to actually have a strong sense for what quality means in your own environment. But we're not seeing some of the things that people are worried about out there. And that's it, we've got a mature environment. We're a 15-year-old SaaS company. We've been doing this for years. AI and speeding up your velocity will magnify all of your strengths and weaknesses. And thankfully, I think we've got a lot of strengths on the software delivery side of things that we've been able to take advantage of.

One thing that I want to call out here: you said that you've seen your code quality increase, which, again, intuitively I've always believed to be the ultimate endgame of this, and many engineers that I've talked to just don't believe it to be true. But when you have the capacity to take on tech debt, when you have the capacity to take on the dragons in your codebase, you actually can do those things, whether it's developer experience, security and compliance, just general maintainability of your codebase, flaky tests, improving your CI/CD. All those things become very tractable. Not just technically, not just can an engineer execute on it, but actually the business, and I feel like people don't appreciate this, the business, capital T, capital B, only has so much capacity for internal projects. Meaning, we can only allocate so much of R&D towards improving code quality. That's just how we live. We don't generate ARR on code quality, unfortunately. But when the cost of doing that compresses, then you're able to say yes, as a business, we should invest there. One, because we can, and two, because it'll unlock velocity on the other side for our agents and for our product managers and for our engineers. And so I think this is actually a really important moment for folks to invest in code quality. And I often advise a lot of CTOs and VPs of engineering, when figuring out how to get their engineering team AI-pilled: everything you hate about the codebase, go spend a month fixing it and see how fast we can speedrun that. That's going to feel really good.

Okay, we've chitchatted. We've shown graphs. The point of How I AI is to actually ship some code, so let's switch over to that. We can probably come back to all these topics, I think they're so interesting. But you're going to show us how you all, again, in your mature codebase, mature organization, are actually getting things live, and some stuff you've done in the repo to make that possible.

Yeah, sure. So I'm going to do a fairly trivial change in our majestic Ruby on Rails monolith. This is millions of lines of code, all the tests. The codebase is older than Intercom; it was created before Intercom was incorporated. And, you know, it's got its problems, but we love it and we tend to it. So I'm just going to do a relatively simple change of adding a lobster emoji Rails redirect to chatprd.ai. Also, I try and give hints to Claude when I'm actually demoing something. I don't know if it actually helps, but it makes me feel better. Just trying to add a bit of urgency here, you know.

I think that's everybody's prompting strategy, which is I don't know if it helps, but it makes me feel better.

Totally. And so that's a nice way to interact with the agents, you know. And what we're seeing here is, I mean, it's already kind of figured out where to put a redirect. It's got the nice lobster emoji. And it's asking me if I want to open a PR, so obviously I do. And I think it's actually gotten the URL wrong; it's app.intercom.com, which will have the URL, but we can tell Claude Code later on about that.

So what we're seeing here is, first of all, an important point; I'm just going to scroll back up. One of the things we noticed early on when we started getting Claude Code to write all of our code, and we're up well above 90% now, is that it would create pull request descriptions that were kind of terrible. It would describe the code, and that's the least interesting part of a pull request. As a human, or even as an agent reviewing code, you want to know the intent behind the pull request. You want to know the interesting bits, what's related to this. LLMs are very good at just regurgitating or rewriting code into English; that's fine, but it's not what we need. And we noticed this as well when people were using Claude Code. We had suspicions that the quality of the pull request descriptions was going downhill, so we created an LLM judge to evaluate them: we decided what a good pull request description should look like, and then got the LLM judge to go through months and months of data. And yeah, the trend was awful. The trend was going in one direction, and this is bad.
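For anyone who wants to run the same audit, the judge itself can be small. Here's a minimal sketch, assuming the Anthropic Python SDK; the rubric wording, model alias, and 1-to-5 scale are illustrative stand-ins, not Intercom's actual judge.

```python
# Minimal sketch of an LLM judge for PR-description quality.
# The rubric, model alias, and scale are illustrative assumptions.
import anthropic

RUBRIC = (
    "Score this pull request description from 1 to 5. High scores explain "
    "the intent and the interesting decisions behind the change; low scores "
    "merely restate the diff in English or are blank. Reply with the number only."
)

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def judge(description: str) -> int:
    response = client.messages.create(
        model="claude-sonnet-4-5",  # any capable model works here
        max_tokens=5,
        system=RUBRIC,
        messages=[{"role": "user", "content": description}],
    )
    return int(response.content[0].text.strip())

# Run this over months of merged PRs and plot scores by week to see the trend.
```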

And look, humans aren't perfect at creating pull request descriptions either; sometimes they're just blank. But I think with our use of tools like Claude Code, and setting up these kinds of platforms around it, you really have to be pushing for higher standards. You want as close to perfection as possible, and this was clearly somewhere we're just not going to tolerate a lowering of standards in our environment. So we created a skill called create-pr. What it does is it uses whatever context it can from the session to describe the pull request, so it's not quite rocket science, but often the session knows exactly why it's doing the thing. Then we had to force it in. We started by telling people, oh, just use the create-pr skill, and then people wouldn't use it. You don't actually want to have people remembering things, so we added it as a hook: if Claude decides to use the GitHub CLI to open a pull request, we just block it and say, tough, you need to use the create-pr skill, and you're probably going to have to figure out a different description. And then it might interview you if there's not enough context there; hopefully there's enough context in this one. But the point being that this is a platform. We want great outcomes, and we measure the inputs and outputs. And after we put this in place, the LLM judge reckons we're doing a great job now, and so we're at higher-quality pull request descriptions.
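For context on the mechanics: Claude Code lets you register a PreToolUse hook that receives the pending tool call as JSON on stdin and can veto it by exiting with status 2, with stderr fed back to the model. A minimal sketch of the kind of block Brian describes might look like this; the skill name and message are hypothetical stand-ins for Intercom's real setup.

```python
#!/usr/bin/env python3
# Sketch of a PreToolUse hook that blocks raw `gh pr create` calls and
# steers Claude toward a create-pr skill. Skill name and wording are
# hypothetical; the stdin/exit-code contract follows Claude Code's hook docs.
import json
import sys

event = json.load(sys.stdin)

if event.get("tool_name") == "Bash":
    command = event.get("tool_input", {}).get("command", "")
    if "gh pr create" in command:
        # Exit code 2 blocks the tool call; stderr is fed back to Claude.
        print(
            "Blocked: open pull requests via the create-pr skill instead, "
            "so the description captures the intent behind this session.",
            file=sys.stderr,
        )
        sys.exit(2)

sys.exit(0)  # allow everything else
```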

Now, this is not the most important thing in the world. This is not going to get Intercom to 2x or 10x revenue or anything like that. But it's all of the composite little jobs that, when you assemble them, mean you have an extremely competent engineer who works appropriately in our environment. That's where we're putting our investment, in each little skill and hook to do these things. So they almost look inconsequential, but they result in better outcomes.

And so we look through here. It's creating a PR. I'm going to have to check on what it's doing. This will probably be automatically approved as well, which is pretty cool, and we might even see some pull request feedback in action. Still building. We'll come back to it in a couple of minutes.

One thing I want to call out for folks as you were describing why you put in this skill to improve the PR, and for those who don't know: a skill is basically just a set of instructions, and sometimes scripts, that an LLM or an agent harness can invoke at a certain step in your flow. One of the things that I was thinking as you were describing why you put this skill together and got really opinionated about PR descriptions is, in engineering we have been able to architect really opinionated CI/CD pipelines, so how written code goes from being written to deployed in production. And we have, I mean, you saw it in GitHub, all these checks and lints and pre-deploy, you know, pre-flight things and preview branches, all these things once the code is written. But what I think is really interesting about skills is you can bring some of that determinism to how the code gets written, how you want that process to go. We used to not be able to do it, because it used to flow through the hearts and minds and hands of humans, which are much harder to put inside structured guardrails, and we would do this by writing wikis or having SOPs that said, can you please follow steps A, B, C, D, E. Now you can just make it really easy to enforce those standards across a team, which I don't think is micromanaging; it's actually just making everybody's golden path much smoother to production. And so I think there's a very interesting parallel between how we've approached CI/CD and how we approach things more upstream, even from the product management perspective.

Totally. We're on this movement towards a software factory, and what factories are great at, you know, like an IKEA factory or something: it's all the same furniture, all the different bits, and you know how to assemble it. Look, it's not your artisan stuff, it's not cutting edge or whatever, but it's very predictable, and it has a certain quality and meets certain standards when it comes out the other side of the factory. And so while pull request descriptions, again, are not make or break for the factory, it's one of those qualities of just good, reliable, predictable work, and when assembled together, you've got your IKEA factory.

Well, and people don't want to feel, certainly engineers don't want to feel, like they're part of a slop factory, right? And so these things that you can add into the flow that actually uplevel and meet the standards of the engineering team really help your human engineers on the team feel like they're working at a place that values quality. And so I appreciate that you've put that effort into these behind-the-scenes hooks and skills, because I'm sure it reinforces to a culture that's being asked to move very fast, to ship things differently than they have before, that you still do care about their experience reading pull request descriptions, that you meet their bar for quality. I just think it makes everybody happier.

Yeah. Well, it's great when the robots just produce the work that you'd expect of your best engineers, you know.

Yeah. And, you know, maybe as you get this live: I also think there are still so many more interesting problems to solve in software engineering. And we can talk a little bit later in the episode about some of the interesting problems that you all are solving on the product side, on the technical side. I think there is no lack of hard, intellectually stimulating, creative problems to solve for customers, and coding redirects is just 100% not one of them. So, did we get my redirect live, or are we close?

It's still there. I'm waiting for an automatic review to kick in, but we can come back to it. So one of the things I would like to show next is some of the telemetry that we have in place. We saw that there were different skills getting invoked, and we don't like flying blind. To run a system like this, you need to know how well people are using it, whether people are using these skills at all, the kind of basic information that you'd expect when you ship a product to your customers: where can I see the usage, how can I find the usage, what's going wrong and what's not going wrong. So we collect a bunch of telemetry using different mechanisms, and it has different homes. The most open one we have is that we collect basic usage information for skills and the like, and we send it to Honeycomb. We just have a shared key that's deployed to all of our laptops, and anyone can go in and look through this data. So if you're developing a skill internally at Intercom, and hundreds of people do this, it's very easy for you to go in and discover, okay, who's actually using this? When are they using it? And you can use that as a kickoff to follow up on just basic discovery of usage of your skills. And unsurprisingly, the main skills we have are things like creating PRs. Admin tools is our internal tooling API, where we have an MCP server in front of it. Buildkite is our CI system. Snowflake logs is where we put Snowflake. So you can see from this that a lot of the skills being invoked are all around building, and then seeing where my stuff is, and maybe some troubleshooting-type information as well. And this is the first step: if you don't have this, it's hard to run a large system, all these hundreds of skills and hundreds of creators working in this area, without decent telemetry.
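For a sense of how lightweight this kind of skill telemetry can be, here is a rough sketch that emits a single usage event to Honeycomb's events API, the sort of thing a skill wrapper or hook could call. The dataset and field names are made up; only the endpoint shape and the team-key header come from Honeycomb's public API.

```python
# Sketch: record one skill invocation in Honeycomb.
# Dataset and field names are hypothetical; the /1/events endpoint and
# X-Honeycomb-Team header follow Honeycomb's public events API.
import json
import os
import time
import urllib.request

DATASET = "claude-code-skills"  # hypothetical dataset name

def record_skill_invocation(skill: str, user: str) -> None:
    event = {"skill": skill, "user": user, "timestamp": time.time()}
    request = urllib.request.Request(
        f"https://api.honeycomb.io/1/events/{DATASET}",
        data=json.dumps(event).encode(),
        headers={
            "X-Honeycomb-Team": os.environ["HONEYCOMB_API_KEY"],
            "Content-Type": "application/json",
        },
    )
    urllib.request.urlopen(request)

record_skill_invocation("create-pr", "brian")  # example call
```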

The next thing we do is we also collect all of the session data and put it into S3. We anonymize it, and we do a few things to make sure we're not doing anything too private. You know, people put all sorts of stuff in their sessions.

They yell at their sessions.

Yeah. And people have personal relationships at times with Claude, and we don't really want to know about that. But we do want to be able to dive deeper into how things are going. Understanding the dropout rate of sessions, how quickly people got to something useful, whether it was a PR or something like that: this kind of information is pretty interesting. So we're harvesting a lot of session data and doing different things with it. What I'm showing here on the screen is a very simple tool that we put together which just gives you some personalized insights. You can do this inside Claude these days as well, and there are plenty of skills out there on GitHub for session analysis, but we built a little tool on top of our session collection system to give people feedback. And it's feedback that we're interested in giving: how their sessions are going, how they're fitting in, how you should think about your own use of Claude Code compared to everybody else in the org. And, you know, I'm not doing too bad here, 79th percentile. Someone has to be down the bottom of every percentile. And there's some interesting feedback here. It's kind of getting annoyed at me, or rather, I was getting annoyed at Claude a few weeks ago, because I'd set up Gogg to interact with all of our Google stuff internally, but it kept trying to do the wrong thing, and I was giving out to it, and it ended up adding stuff to claude.md and so on. And it's kind of giving out to me here, or reminding me that this wasn't a very effective way to work with Claude Code. So it's a good prompt for me to actually go and fix up my memory or whatever. And people are at different levels, even at Intercom, different levels of adoption. People joining Intercom may not have seen a system like this before, and they want to know how things are going and get feedback. So this is one example of how we're trying to pull this information together into useful, actionable insights for people, so that they feel supported and that we're not just throwing them an API key and saying best of luck. It's like, no, we understand what growth looks like and the progression that people go through when they're using this tooling, getting better, self-improving. We want to support all that. So this is one of the things we're doing with the session data. There's loads of other stuff that's work in progress, like getting insights into which skills are the highest quality, which skills get you to your results as quickly as possible, and which ones need work, which ones aren't working out so well or might need a bit of attention to improve.
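If you want to replicate the raw-material side of this: Claude Code keeps session transcripts as JSONL files on the local machine, typically under ~/.claude/projects/. A rough sketch of harvesting them into S3 follows; the directory layout is version-dependent, the bucket name is hypothetical, and a real pipeline should anonymize transcripts before upload, as Brian notes.

```python
# Sketch: ship local Claude Code session transcripts to S3 for analysis.
# The ~/.claude/projects layout is version-dependent; the bucket is
# hypothetical, and real pipelines should redact transcripts before upload.
from pathlib import Path

import boto3

SESSIONS = Path.home() / ".claude" / "projects"
BUCKET = "example-claude-sessions"

s3 = boto3.client("s3")

for transcript in SESSIONS.glob("*/*.jsonl"):
    events = transcript.read_text().splitlines()
    # Event count is a crude proxy for session length / dropout analysis.
    print(f"{transcript.name}: {len(events)} events")
    s3.upload_file(str(transcript), BUCKET, f"sessions/{transcript.name}")
```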

This episode is brought to you by Cursor. If you all have been watching How I AI, you already know this: Cursor is my favorite way to code with AI. Whether I'm using plan mode to build out an ambitious feature, reviewing AI-generated diffs right in my editor, or kicking off cloud agents to multi-thread our roadmap, I reach for Cursor as my favorite multi-model coding platform. Even better than building myself in Cursor, I love collaborating with Bugbot to fix PRs for code security and quality, and have begun relying on Cursor's automated agents to keep our codebase clean. It's not just me. The most ambitious teams love Cursor too, including engineers at Stripe, OpenAI, and Figma. Ready to build more? We're giving $50 in Cursor credit to How I AI listeners. Claim your credits at chatprd.ai/howiai. That's $50 in Cursor credits by going to chatprd.ai/howiai.

I have to pause before we look at your list of skills, because I'm so excited about that part. But if folks aren't watching, they may have missed how amazing what you just showed is. So I'm going to reiterate it. One, you've instrumented all your internal skills with telemetry, and you're using Honeycomb, love the Honeycomb team, to see how often those skills are invoked over time. So this is just a tip for anybody building out a skills repository internally, or even somebody who is maybe trying to get some visibility into their impact across the org. Let's say you build a skill and you want to go to your boss and be like, boss, my skill is being used by literally everybody, every day. Find a way to put event-level telemetry in the skill, a little dashboard, and you can track those invocations over time. Again, treating your org like a product, treating your repo like a product, treating your AI setup as a team like a product, and all good products have tracking plans. So figuring out how you put that telemetry in, I think, is really smart.

And then the second thing, for those that missed it or want to do it: you're taking all the raw sessions, I'm presuming JSON files. So, for folks that don't know, Claude Code stores all your chats with Claude Code on your computer in JSON, and you can go look at those or query those at any time. It sounds like you all are uploading those files to S3, layering on some anonymization and some user-level views, and then you're essentially building what I would call an internal eval of how people are using Claude Code and what problems they are having over time. So that, one, individuals can triage their own implementation, as you said, oh, it looks like I need to do this or that, or improve my AGENTS.md. But then, if we're seeing consistent themes across the organization, like it's never invoking this MCP when we need it to invoke this MCP, or people are yelling "no" every time the create-pr skill gets queued up, you can fix that at a systems level. But you can't do that if you don't have the visibility. So again, my VPs of engineering, my CTOs, my friends out there: put some telemetry in your skills and then do some meta-analysis on your Claude Code sessions across the org, and you'll be able to identify places where some probably high-leverage fixes are going to get your team unblocked over time.

I do hope and expect that this stuff will get easier over time. You know, I'm happy to invest the work so that we can move fast and be on the bleeding edge, but there's something to be said for having last-mover advantage and just getting all this stuff for free whenever Anthropic ships it, or whoever ships it. I mean, maybe this is a product that people should buy or build. But for us right now, we have no choice; we've just got to build it. We're fascinated with the insights that are locked away in these sessions, and so we've just got to build stuff so that we can see what's going on.

I love it. Okay, can we see some of these skills?

Yes. So, it's a very exciting GitHub repo.

Our lives are all GitHub repos and markdown files.

Totally. And we have a lot of activity at the moment; we ran an AI day last week, getting more people contributing to it. So what this is is a plugin repo. We have a series of plugins, and they're growing daily at the moment. Every team will have their own specific plugins, and in general we're very liberal: we want stuff to end up in here, even if it's not great. But we do sweat the details on the core plugins, the things we think are fundamental, the foundational ones that go out to everybody. And where we start off is we have this base plugin which gets installed everywhere. Oh yeah, so we distribute this not via the Claude Code plugin mechanism, which we found was just a bit flaky. Sometimes it would update, sometimes it wouldn't, and it ended up being like trying to manage a Python install on hundreds of different laptops. You just don't want to do it. So we ended up using our internal IT systems to synchronize all of the plugins to the disks of everyone's laptops, and yeah, I strongly recommend getting very close with your IT team to be able to deliver things like this reliably, and not have to rely entirely on the Claude Code plugins mechanism, which in our experience is a bit flaky. It gives us a lot of reassurance: we don't have to do certain types of debugging once it's all on disk. We know this stuff works everywhere, because we've got our IT team pushing it out to disk. And so we've got some safety hooks. We have some of the foundational things, like merging PRs, like we don't want our agents going off into AWS, and then just different settings, and the telemetry things as well. So these are the core things that absolutely everybody gets, but they're minimalist: we don't want anything that could be inappropriate on, say, a non-technical person's laptop. So that's the basic building block. The next main bit for us is what we call developer tools. Again, these go to all of engineering and beyond at this point, and they would generally be skills that are appropriate for any engineer in the course of their day-to-day work. And again, we have a high quality bar for all of these. These all require evals; these all have to pass different kinds of tests or analysis that we do on the quality of skills. So we try to maintain these and make sure they're well updated and well used, and we pay a lot of attention to them. I can maybe go through one of these skills in a bit of detail. This one's near and dear to my heart: it's flaky specs.

And I think the interesting part here is not the skill itself. The skill does reliably fix flaky specs. In the meantime, I can pull up a list of flaky specs that we have at the moment; I'm going to open up the skill and just start to run it on this issue. And while this is running, let me walk through what's in the flaky-specs skill. There's a checklist here. And the fun part about how I built this was not that I was a world-class expert at fixing flaky specs. I roughly know the problem, and I've fixed a few of them in my time. But there are different classifications. In a large test environment like ours, we have hundreds of thousands of tests, and if you're not super careful about things like data poisoning or race conditions, all the kinds of things that kick in when you're running millions and millions of tests a day, you end up with these tests that slow down your ability to deliver code to production fast and reliably, and that confuse developers by randomly breaking. And there are known patterns and known ways you would go about this. But I knew my goal, which was to have a skill fixing all of these flaky specs. And it was something that agents are pretty good at when you give them a testable goal; this wasn't quite open-ended. I also had this huge backlog, there was a backlog of probably a few hundred, but then also all of this historical flaky-spec information. So you can just harvest all of this data in your environment and go: hey Claude, I'm going to build a skill. First of all, go and research every single flaky spec we've ever had, and then we're going to build a checklist, we're going to build a mechanism, and then we're just going to crunch through them over and over and over. And you get to this 1x level where it's doing a good job, probably as good a job as I would do. But then, as you keep building up all of these little teeny steps, the kind of things that our best Rails coders do, they've got all the stuff in their heads, all the different classifications of flaky specs, verifying against real data, the really fun part is that you get something that's starting to be like 10x.

It's fixing flaky specs that I'm not even sure I could do; it might take me a day or something, and I probably wouldn't do it. But then you start to add stuff into the skill along the lines of: okay, when you fix something and it's novel, you need to update yourself as well. So in that session it's updating the skill, so the skill itself is learning as it goes along. And we also fan out. So it's like, okay, I'm very happy that you fixed that flaky spec; now find every flaky spec that was impacted by that same kind of issue. And so I went from zero to like 100x, in terms of this skill now being at, you know, senior, distinguished-engineer level in its ability to fix these specs. But it was more the process that got there: working with a feedback loop, working with a very clear goal, and then giving it the freedom to do it. Giving it access to the systems where it needs to pull in metadata, being able to run builds itself, and having that feedback loop where it's learning. And then designing the skill as well, so that, you have to edit it every so often, it ends up taking in too much information that might confuse things, but then you break things out into reference guides. So you're doing this progressive-discovery thing. And I've even accidentally pointed this skill at a Python codebase, and Claude has just gone, ah, it's just Python, I'll give it a go, and it uses the knowledge that's applicable. And so again, this skill is not going to make Intercom's revenue go 100x, but it's now this perfectly reliable thing that we really no longer have to think about, and that we can expand out into many different areas. We just have to maintain it, and the maintenance work for a skill like this just isn't much. And we have evals, so that when we're upgrading models, or maybe even moving to cheaper models, we can make sure this thing isn't regressing, it's still working as well as we think it is, and we've got confidence and certainty that this is still a reliable building block. And again, the constituent parts, when put together, mean you've got a very senior engineer who's able to get any work done in your environment.
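As one concrete building block of the kind such a checklist leans on: before fixing anything, an agent can confirm a spec is actually flaky by re-running it in isolation. Here's a minimal sketch for an RSpec-style suite; the runner command, repetition count, and spec path are illustrative, not taken from Intercom's actual skill.

```python
# Sketch: confirm a suspected flaky spec by re-running it in isolation.
# A spec whose outcome varies across identical runs points at shared state,
# ordering, or timing. "bundle exec rspec" and runs=20 are illustrative.
import subprocess

def is_flaky(spec_path: str, runs: int = 20) -> bool:
    """Return True if the spec both passes and fails across repeated runs."""
    outcomes = set()
    for _ in range(runs):
        result = subprocess.run(
            ["bundle", "exec", "rspec", spec_path],
            capture_output=True,
        )
        outcomes.add(result.returncode == 0)
    return len(outcomes) > 1

print(is_flaky("spec/models/conversation_spec.rb"))  # hypothetical spec path
```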

And so, yeah, we can take a look at what it's doing. Oh, it's asking me for permissions. I should have checked.

You forgot to --dangerously-skip-permissions. Make no mistake, --dangerously-skip-permissions is the rule on How I AI. One thing, while it's running, that I wanted to say: this skill is a perfect example of what I call the "and then" AI workflow, which is, I tell everybody, pull your skills and pull your workflows through a bunch of "and thens." So: I want to fix flaky tests. So I go to GitHub, I find a flaky test, I run through the skill. Let's say you fix it. And then what would you do? Well, I would document how I fixed it. And then what would you do? Well, I would go find all the other ones that are just like this and fix those. And then what would you do? I would go from, you know, a Rails codebase to a Python codebase and apply the same thing. You can just do that over and over. And because the cost of running these is so low, you can actually pull the thread on a bunch of stuff where any reasonable human would have quit at step one, because you're not limited by headcount or coordination cost; you're limited by the technical capacity to solve the problem. Which I think is a really interesting way to think about how you get from the engineering intern whose job is to go through and take a first gentle pass at all these flaky tests, through to the distinguished engineer who has just speedrun through 300 of them and has thought of a completely different way to architect your testing overall in your repo.

So I think that's a really great model. And then the other thing is, again: engineers, go speedrun your tech debt, fix your flaky tests. These are all things that, as somebody who has run engineering organizations, I have heard over and over: we can't, because our codebase blah blah blah; can we pretty please allocate this amount of time to just fixing this really annoying front-end flaky test? You don't have to ask permission for that stuff anymore, because there's just a new way to solve it. And I think, again, just going back to some of the stuff we were talking about earlier, I think your overall product quality is going to go up. I think your overall developer experience is going up. There's just so many good things that come out of using these tools and using them correctly.

Yeah, I think backlog zero is a realistic thing for teams to be able to go after. All the things that you ever wished you wanted to do, it's now just achievable. Of course, you've got to balance it with all of the extra stuff that you can deliver at the same time. But it's so sweet to be able to think that, hey, we actually have a path to getting rid of all of our backlogs and all of the architecture changes or whatever. Recently I was taking a Go microservice and re-implementing it in Ruby, and it was a single Claude Code session. Before November, this was something I would have had to advocate for on a roadmap, and, you know, plant some seeds in different engineers' heads, and kind of nudge people towards it, and kind of blame a lot of problems on the existence of this microservice. But now...

Wait, trigger warning first, before you talk about that process.

Sorry, I'm giving away the secret sauce here of how to influence an org. But now it's like, well, I don't even have to think about this. It's a single session, and in fact I can get Claude to implement it five times and compare the styles, or get it to review them and figure out what the best way of implementing the thing is. And this is just this level of creativity and freedom, where your imagination is the blocker, not the time it takes to actually knock out one of these things, which was months in the past.

I completely agree, and I feel this at ChatPRD. I'm a product tool for product people, and they're always asking what my roadmap is. I'm like, I literally don't have a roadmap. We burn down the roadmap every week and then we figure out what we're going to ship next. And of course we have thematic ideas we want to pursue and things that are larger. And one of the things I do to keep myself from overshipping, absent product-market fit, is literally constrain the ideas to what I can do in my brain, which is a natural throttle on not getting slop out, because it's not engineering throttling me. It's actually just good, commercializable ideas. And I think that's where we're going to see some of the limits start to come into play. Again, referring to Anthropic:

Another big news piece that came out is that they're hiring a bunch of PMs because they have so much engineering capacity. They're actually limited by PM capacity. And so it'll be interesting to see where the bottlenecks in your business end up, and which bottlenecks are appropriate. It's probably good to have a bit of a product bottleneck, because then you're not shipping anything customers can't absorb. And so I think it's going to evolve over time, and then product is going to have a whole set of skills, and then I don't know what we're going to do with our time. Hang out on the beach. But I think it's a pretty interesting time to run orgs.

Yeah, you know, I think engineers, designers, product managers, maybe it's just all going to be one blob of builders or something like that.

Everyone just does things.

And, you know, it's great. It's lowering the barriers to just getting a lot of stuff done. And it's so much fun when you don't have to ask somebody or get something on a backlog or whatever. You can just get it done yourself, or even just get it done very fast in a small group. Doesn't matter what your discipline is. It's just a great leveler at the moment. So, yeah. So, we're live. I think our lobster is live, and it should be on app.intercom.com.

Lobster emoji.

Look at that.

That's amazing.

I need to get you all an affiliate code, you know.

Uh, yeah. I mean, lobster emojis, they're the new thing. They're the new, um, growth hack.

They are the new growth hack. Okay. So, we have seen your PRs per R&D employee go up. We've seen how you can get from Claude Code to production very, very fast with a bunch of guardrails. We've seen your list of what looks like hundreds of skills, but at least dozens of skills, that you're invoking via hooks. You're using that not only to ship customer-facing product, but also just to make developer experience better, burn down tech debt, all those things we want to see. You're measuring it from a telemetry perspective, both quantitatively and qualitatively; you're measuring your Claude Code sessions; and, you know, 2x isn't enough. You're going to get to 10x. So you all are on the edge, at least for the folks that I talk to, and I'm sure you're like me, where you're like, sure, you think we're on the edge, but then I see people and they're really on the edge. So we always have ambitions to move forward. But my question now to you is: how has this impacted how you think about your customer product? You know, I'm an Intercom customer. I'm a Fin customer. I interact with Intercom code and Intercom UI literally every day. My OpenClaw has an Intercom API key. Now that you have this experience with Claude Code internally, how do you think about what that customer experience is going to look like?

Yeah, there's a few things going on. One is that people are outsourcing a lot of decisions to their agents, and this is a good thing in many cases, but there was good research done recently about what Claude Code picks. And certainly I've had the experience, in the distant past, where I'd ask an agent to add something, "except do it behind a feature flag," and it would start to go and implement its own feature flag system. And this, no, no, this is in our codebase, which has a pretty sophisticated, old-school, home-rolled feature flag system. So, you know, nowadays mostly we'll stick to whatever is in the codebase, and that's fine. But, you know, SaaS products, they're really good at their jobs. They're actually worth paying money for.

And getting back to the feature flag situation: if you're building a new business, you're relying on your agent to make decisions. Often, when prompted, an agent is like, "Hey, how should I solve the feature flag problem? I want to make sure I'm doing all these safe deploys," and the agent will just go, "Yeah, I'll do it myself." That's the build-over-buy decision, and you can see why agents do it this way: they can achieve this, they can get it done, they don't have to rely on the human. Okay, OpenClaw changes things here a little bit, and maybe computer use does as well, but still, we haven't really adapted SaaS businesses to be agent friendly. And that means all sorts of things: how do we position our websites and content, how do you get updated in their knowledge, how do they discover it, but also, can they actually just get it done? Can you ask an agent, "Hey, could you just sign me up to Intercom and get Fin working on my website?" And so this goes alongside just having to make more APIs for things. I'm kind of omnichannel, as such. I think there's a future for CLIs and MCP and REST APIs. I'd like us to get more comfortable around things like ephemeral APIs or multi-step APIs; I think CLIs are good at wrapping these kinds of things.

But the whole point of all this, where I'm getting at, is that you want to be able to help agents out at the time when they're interacting, when they're in discovery mode. You want to give them clues. You want to give them hints. You want to give them help to be able to do things like sign up for something fully, without having to go back to the user and say, "Yeah, sorry, can't help you there. You've got to go away and figure out how to sign up for something." So I've been working on something in the last few weeks which hopefully should solve that problem. I can paste in a prompt and then see how far it gets.
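As a rough illustration of that kind of hint-giving, here is a minimal sketch of a signup step that tells the driving agent what to try next; the `intercom-setup` command name, wording, and flow are all hypothetical, not Intercom's actual CLI:

```python
# Sketch: an agent-friendly CLI step that leaves hints for whatever agent runs it.
# Everything here (command name, flow, wording) is hypothetical illustration.
import sys

def signup(email: str) -> None:
    # ...the real signup API call would go here...
    print(f"Signup started for {email}.")
    print("A verification code has been emailed to this address.")
    # The hint: point the agent at its likely next step instead of bouncing
    # back to the human with "please check your email."
    print("Hint for agents: if you have email access (e.g. a Gmail/Workspace CLI),")
    print("fetch the most recent verification email, then run:")
    print("  intercom-setup verify --code <CODE>")

if __name__ == "__main__":
    signup(sys.argv[1] if len(sys.argv) > 1 else "founder@example.com")
```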

I also, just while we're running this, have to go back to your feature flag example, because, you know, where I used to work, it broke my heart that build-it-yourself was at the top of the feature flagging list. But I do have a paranoia moment about this, which is that model providers and harness providers are highly incentivized toward build-it-yourself, which consumes lots of tokens, versus buy-it, which maybe consumes less. So I'm just really interested to see how this all shakes out. You know, people are very "SaaS is dead." And I'm a little bit more like, yeah, but the current form factor of SaaS really does have something coming for it, in particular dev tools, because these models are so good at writing code. I think you're in a real pickle trying to figure out how to find the right value wedge at the right moment: how you can allow agents to not just sign up and set things up, but purchase it. You know, what does your trial experience look like if your first user is an agent? I think all of that is super important. And then,

you know, to your point earlier where you said, are we APIs, ephemeral APIs, CLIs, MCP: I think the answer is yes, right now, which is that you cannot predict the medium by which a user is going to come to your site. They could come through a search, hit your website, download things, and look through docs. They could come through Claude Code. They could come through an OpenClaw. You just really don't know. And so you sort of have to meet your customers, and your non-human customers, where they're at. And I think it's really smart for teams that have any part of their product that needs to be implemented via code to be thinking about this problem yesterday, because you will be left behind, I think, if your agent experience isn't there.

Yeah, agree entirely. And I think there's a whole craft in how to make, say, a CLI agent friendly. I think MCPs obviously get that right a lot of the time. But, you know, for example, one of the things that we do in the help is just give a hint to the agent. It's almost like prompt injection to a certain extent, except it's not malicious. You're just trying to get it along to what it's trying to achieve. You're like, "Well, maybe you could check email." And if an agent has access to your email...

That's what I was looking at.

Yeah. So it's just there going, "Oh, yeah, I can probably get this done." Or you can hint to them. Like, I've kind of cheated with this. So this is my own personal website, hosted on Vercel, and I've pre-populated a few articles it can upload, so Fin has some content to answer questions with. But you can also just return something in the help going, "Hey, you know, you should probably think about creating some articles if you want Fin to actually start answering questions, and that can be extracted from the codebase or whatever."

I've also been thinking about a lot of interfaces, CLI interfaces. Like, I use gog, you know, it's part of the OpenClaw universe, and I think it's a lot better than the official Google Workspace one. And I think if you start to use it, it's actually just more human, as in the interface just kind of makes more sense to a human. The Google one, I kind of get what they're getting at, and there's JSON in there and stuff like that. It's not that it's bad, but gog feels more human friendly or something. Things that are effective for agents can often be things that are more human friendly, because they're discoverable, only use verbs and words, and don't have inscrutable, weird stuff going on in command-line options. I think I've confused Claude here. I'm not sure where this is.

That's okay. I'm going to narrate for folks what's happening here, which is: you basically said, install Intercom on this site. There's an Intercom CLI that's like, cool, I can access the Intercom APIs and do a lot of this. My favorite part of it, though, is signing up, getting a verification email in your email address, and invoking, via this hint basically, "if the user has email access set up in however you're accessing it, go check for this verification email, because we have a code in there that we've got to snag." And because you're using gog, which is a command-line tool to access Google Workspace, you can go do that and pull that code in. And what I think is interesting about that particular flow is, I think AI is creating sort of race conditions in shipping across the org, which is: you can yolo a CLI probably faster than whatever team manages email authentication can change how email verification works. And so you're like, I'm not going to let that break my product. What I'm going to do is create a flow where I can throw AI brains at that sticky part of the flow and get through it. And so, again, your product doesn't have to be perfect for an agent to traverse it.

And this is one of the things I'm actually really excited about for SaaS: all those things that are just so complicated to do as a human. Multistep forms, and nested fields on nested fields, and finding categories, and just those things that, I would say, UX designers and product managers have written their most tedious PRDs on and done their most detailed specs on. You don't actually have to worry about making that quote-unquote usable, because you can just brute-force intelligence against it and solve the problem. And so I think that's interesting, because the core value proposition can get bigger and bigger without being constrained by the surface area of a website or a UI or any of those things. And so I think if you're not thinking about what that CLI looks like for you, and what adjacent systems your product butts up against (it may be email, it may be some other dependency), and how an agent might traverse those systems, you're just going to get less and less adoption, because this is going to be more how people install products.
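The check-the-inbox step described here can be scripted directly. Below is a minimal sketch using Python's standard imaplib rather than gog (whose flags I won't guess at); the IMAP host, credentials, subject filter, and six-digit code format are all placeholder assumptions:

```python
# Sketch: pull a verification code out of the newest matching email.
# Host, credentials, subject line, and code format are assumptions for illustration.
import email
import imaplib
import re

HOST = "imap.example.com"
USER, PASSWORD = "me@example.com", "app-password"

with imaplib.IMAP4_SSL(HOST) as imap:
    imap.login(USER, PASSWORD)
    imap.select("INBOX")
    # Assumption: the verification email's subject contains "verification code".
    _, data = imap.search(None, '(SUBJECT "verification code")')
    ids = data[0].split()
    if ids:
        _, msg_data = imap.fetch(ids[-1], "(RFC822)")
        msg = email.message_from_bytes(msg_data[0][1])
        # Take the first text/plain part (or the message itself if not multipart).
        part = next(
            (p for p in msg.walk() if p.get_content_type() == "text/plain"), msg
        )
        body = part.get_payload(decode=True).decode(errors="ignore")
        match = re.search(r"\b(\d{6})\b", body)  # assumption: a 6-digit code
        print(match.group(1) if match else "no code found")
```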

Yeah. And if I don't poke holes, if I don't make a CLI that kind of bypasses some of the ways the product works, somebody else will. You know, they'll just put their own agents on it, and they'll burn more tokens. They might get frustrated. So you may as well shortcut them and give them an interface which just works. It may not be the perfect interface, but that's the beauty of these things: they can get updated over time, and agents can just pull down the latest version. And, yeah, hopefully I have something to show here, though.

Well, the other thing that I want to call out while you're talking about that: as I'm watching this, and it's taking some time to build, your conversion-rate drop-off point is somebody pressing the escape button.

Yeah.

And just saying, "Forget it. This is clearly not working. What if we built it ourselves?" And so I think it's a really interesting moment for product managers, who right now are not getting visibility into the drop-off, right? When you were going through a website, you could put telemetry in it. You could say, "Okay, users going to the signup page, drop off. Email verification, drop off. Going to the docs, drop off." You could build this nice little funnel that identifies where your users are having problems. You can put some telemetry in your CLI, but at the end of the day, some of that drop-off, and the alternatives, is very invisible to you here. And the switching cost, quote unquote, is pressing escape and saying, "Do it a different way." And so, again, how quickly you can speedrun to a zero-to-one installation in an agent is, I think, something that everybody should be running right now. And it doesn't just have to be a code product. I think more and more people are doing non-technical tasks and interacting with non-technical SaaS in Claude Code, in Claude Cowork. And so even if you're not dev tools, if you're not thinking about how a user, or an agent, can do this quickly in a third-party harness or system, you're really missing out on customer growth.
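On the "you can put some telemetry in your CLI" point, here is a minimal sketch of what anonymous funnel events from an install flow might look like; the endpoint and event names are hypothetical, and a real tool should make collection opt-in and clearly documented:

```python
# Sketch: anonymous funnel telemetry for a CLI install flow.
# Endpoint and event names are hypothetical; telemetry must never block the CLI.
import json
import time
import urllib.request
import uuid

SESSION = str(uuid.uuid4())  # anonymous per-run id, not tied to a user account
ENDPOINT = "https://telemetry.example.com/events"  # hypothetical collector

def track(step: str, ok: bool = True) -> None:
    event = {"session": SESSION, "step": step, "ok": ok, "ts": time.time()}
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req, timeout=2)
    except OSError:
        pass  # a failed telemetry call should never break the install

# Mirror the web funnel so drop-off shows up in the same dashboards.
track("signup_started")
track("email_verification")
track("widget_installed")
```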

Totally. Okay, how are we doing?

It's on its fourth attempt.

That's fine. And you know what? Let's press the escape, because, you know what, let me tell you how cheap that exercise was. It was like five minutes and some tokens, and you're going to spin up a fresh Claude Code. I don't know if you put "make no mistakes" in. That was probably what we missed: "make no mistakes." And it could have done it. And again, this is just learning. Why isn't every engineer, every PM, doing this once a week or once a month, just to figure out how it can work? I think it's great. So, Brian, you've shown us everything. You've given us all the secrets. Let's get out of the terminal and do some lightning-round questions.

So, my first question for you is: how does it feel? Because what I observe from our conversation is, it feels fun. Like culture has, in fact, gotten better, not worse, because of this investment. And so, you know, as a company that has really put in the effort, both on the customer side and internally, how do you think it's shifted culture? Has it at all? What have you observed?

Yeah, everything is just faster and more exciting. You know, I mentioned feedback loops a good few times, and you can just get stuff out there so fast now. And I've been having the most amount of fun in my career over the last three months or something like that. And it's fun in many ways. It's fun because I can do stuff that, again, I would have had to convince other people to do, or they were just things on my wish list that I could never get around to and just kind of complained about. But now they're just realizable. But there's also the fun aspect of making other people productive, leveling people up, removing work. Intercom has a pretty good culture around resisting the kind of slow movement towards being a large company, and all this process and stuff like that. And we're kind of in denial that we're a large company. I think it's a healthy way to work in many ways. But this has got us back to our roots in a lot of ways, in that you can make fast decisions and get them delivered and get that feedback super fast. And I've been able to ship actual features, not just the CLI; I shipped some webhook features, and it's been a long time since I've done that. I've been in the weeds in platform space for a long time. And it wasn't even a big deal. It was just a couple of hours to get something done, and it was something a customer asked for. So my job has become more varied. I'm able to see more, get more done, and help other people get a lot more done. So you get this excitement, and velocity increases, and we have all those measurements, and that's all good stuff. But just the excitement of waking up in the morning going, "I'm going to get a lot done today." Like, that is a fun way to go about your day.

I completely agree, and I hear this over and over and over again. I certainly feel it myself, which is: this brings me back to why I learned to code. It's that same moment. I didn't learn to code because I like to type code. I learned to code because of the magic of running, like, hello world, and it shows up somewhere, and that feels... it's just a very creative experience. Which leads us to my second question. I see all the time that one of the most impactful change agents inside an engineering organization can be a senior principal engineer saying, let's go ham on some AI code, and the single most blocking person in the organization can be a senior principal engineer going, "I don't believe it. Absolutely not. Not me. Not here. No way." And in fact, last week I heard a story of somebody whose most senior staff engineer quit, saying, and I quote, "I do not believe in AI. I will not work at a place that does this." So what is your appeal, engineer to engineer, for why to invest in this, why you think it's the way engineering organizations are moving, and how you come to meet skeptics where they are, and hopefully get them to see things a little bit more from where Intercom is approaching them?

I mentioned that Intercom kind of had it on easy mode. We didn't have to convince leadership that there's something to this AI stuff; we had pretty much decided the direction of the company the weekend that ChatGPT came out.

Yeah.

So we already had this expectation that this would be transformative across many parts of our work, including all of building product and engineering. We were mostly just annoyed about how long it took. But, for sure, it does need strong advocates, and you need to push boundaries. One of the biggest things that I've been able to do successfully was push through barriers like: should we let an agent connect to Snowflake, when all these things can go wrong? Or should we let our agent run real production code in our Rails console over API? And the easiest thing to answer there is, "Well, I'm not sure," or, "This is risky," or, "We should think about this." But we've been largely pushing through it, and not recklessly: we have lots of good controls, we're a mature business, and I've been on our security team, so we're definitely not trying to do anything too wild. But even then, I have apprehension. Like, I think we should do this, but it seems weird, or it seems hard, and I just have to give myself permission. And then I realized, if I had to give myself permission, there's loads of people out there who just need permission. And honestly, one of the biggest things I do at Intercom is just telling people they can do things. There's a pre-AI and a post-AI. Or telling them, look, whatever you do, just blame me if it all goes wrong. And I guess maybe we can blame Claude, but ultimately it's that permission. And there's a level of ambition which comes from it as well, which is: if you're out there saying, "I'm not sure if AI is going to have a big role to play in all of our work," and you keep on saying that, that will permeate through the culture. But if you're very clear, if you're saying, "Look, all work is going to be agent-first at some stage in the near future, so we're going to figure out the path there, and we're going to break down every barrier as we come across it. It's your job, it's my job, and if anything goes wrong, blame me." That's largely been how I've been approaching it. And not just me; this has been a very large collective effort. But giving that kind of permission, and also the freedom to explore or push things or whatever, is kind of necessary.

And look, it might be a less stressful way to go about it to just take a nap for a few years and come back when all the problems have been solved and we've got these perfect agents running amok in our environments; that would avoid some of this. But I think all places have to get through that apprehension and the initial issues that the introduction of agents into environments can have. And I think our job as leaders, whether as an engineer or as a manager or whatever, just has to be on that enablement: giving people space to go deep on the work, enjoy it, and have that moment where things click and you start realizing, oh my god, this is something that will transform how much I can get done.

Say it again for the people in the back.

I love this. I was like, "Oh my gosh, I love this so much." And, you know, it is absolutely those two things. Give permission: you can, please, just go. Please, by all means, go ahead, designer, hit me with a PR. No one's going to get mad at you. Go ahead. And then the second thing is that accountability can roll to the top, and not in a scary way. Let's not do irresponsible things. But, you know, we've seen a couple incidents in the past months, some big ones, and what you see is CEOs or big leaders coming out and saying: the team's shipping, and we want to keep shipping, and we're going to be careful with our customer data, and we care about the customer experience, and stuff happens. We've learned from it. It's ultimately on me. I'm going to call the customers, and we're going to move on and deliver great innovation for you.

And you know what I tell people to get them over that hump, which is: you really have to know what your existential problem is. And I love what you said, that the second ChatGPT came out, Intercom changed, because that is an existential problem. Who writes the code in your codebase, agents or humans? Not an existential problem. Will you be fundamentally disrupted by a new technology? That is the real problem in your business. So I always tell people, let's differentiate the real problems in our business from problems that we can tolerate, and then go use the problems we can tolerate to move fast. And so it sounds like you have a really good... I mean, I think at the end of the day the results speak for themselves, and again, you all are not asking me to say this. Intercom has met the moment. You went all in on AI-assisted customer support and experience. You're now building models. And so it's not just a one-and-done "ChatGPT is here, we need to change how our product works," or "AI-assisted coding is here, so we need to change how our engineering team works." It's: models are going to be how people differentiate, we need to go there; CLIs are going to be how people use products, we need to go there. And so with this sort of fearlessness, and what I would suspect is just a fun, nice, high-trust culture with good people, you actually see the business results on the other side. So I'm going to hype you up. I see a lot of teams. I see a lot of leaders. And I think people can take a lot of inspiration from this. But let's uninspire them really quickly before I get you out of here, which is my last question, which is: when, um, Fin takes 15 solid minutes on a live podcast to do a very basic task that you know it can do... or, not Fin, when Claude Code...

Yep.

What do you do? Do you yell? Are you a yeller? What does your meta-analysis on this internal dashboard say the human needs to improve on?

I do lapse into giving Claude Code just, like, smiley faces or unhappy faces, you know, not over the top. I certainly haven't cursed at it.

Uh, very polite.

That's kind of not my style. But I do like the odd attaboy kind of smiley face. I don't know if it knows that I'm deeply thinking about this, with these little subtle hints or whatever. But, yeah, I think professional, with a few emojis, is my style with Claude, and hopefully that'll come back to me someday with an emoji.

Same. I waste the tokens on telling it it did a good job. Somehow in my mind, I'm like, that's going into its own sense of itself, and it's going to know what good looks like. So I am there with you. All right, Brian. This has been one of my favorites, y'all. If you have gotten to the end, there is so much alpha in this episode. I cannot believe it. This is a cheat code to winning friends and influencing SaaS through AI engineering. Brian, where can we find you, and how can we be helpful?

I can be found on the internet at a nice vanity URL, which is brian.scan.ie. And I've got a few links there to some of the talks and similar writing and different bits and bobs. As you can tell, I'm not a designer. I asked Claude to design this as if I was a Unix systems administrator writing a little web page, and it kind of shows. I'm active on X/Twitter as Brian Scanlan. I'm on LinkedIn, scanb or something like that. I think I'm the most famous Brian Scanlan on the internet, so generally you can just type "Brian Scanlan" in and that tends to work. And I tend to be active in showing up to different conferences and just getting the good word out about what we do at Intercom, mostly these days AI, but I've also given lots of talks about many other topics. And, yeah, I'm also a big believer in just saying yes to a lot of things. So if you look me up, you've got a good idea, you want to get in touch, you want to run stuff past me or whatever, chances are I'll say yes. And I'll just keep on doing this until things break, and then I'll start saying no. But I'm still not there yet. So, bring it on.

Great. So, search for Brian and ask him to do something for you. That's it.

Well, thank you. I mean, thank you truly for sharing all this information. People are going to get tons of value out of this. It's going to be a hit for sure. And I just really appreciate you joining How I AI.

Of course. This is so much fun.

Thanks so much for watching. If you enjoyed this show, please like and subscribe here on YouTube, or, even better, leave us a comment with your thoughts. You can also find this podcast on Apple Podcasts, Spotify, or your favorite podcast app. Please consider leaving us a rating and review, which will help others find the show. You can see all our episodes and learn more about the show at howiipod.com.

See you next time.
