The Rise of Agentic Commerce — Emily Glassberg Sands (Stripe)
By The MAD Podcast with Matt Turck
Summary
Topics Covered
- Payments Data Enables Custom Foundation Models
- Foundation Model Detects Invisible Card Testing
- AI Automates Dispute Wins Effortlessly
- Agents Demand Intent-Based Commerce
- AI Startups Monetize 3x Faster Globally
Full Transcript
Stripe's network handles on average about 50,000 new transactions every minute. To put that in perspective, because it's a lot of zeros, that's about 1.3% of global GDP.
Welcome back to the MAD Podcast. Today I'm sitting down with Emily Glassberg Sands, Head of Information at Stripe. Once a payments API startup, Stripe has become one of the most legendary companies of this generation and a full financial infrastructure platform that moves 1.3% of the world's GDP online. We talked about why Stripe decided to build its own AI foundation model and what it learned in the process.
Stripe is a little bit different. We have really differentiated data. OpenAI doesn't have that data. Anthropic doesn't have that data. Our first instinct was actually full-on wrong.
We also discussed the brave new world of agentic commerce, where agents will buy and sell on our behalf, and what it means for payments and new infrastructure like MCP servers. Who's doing the buying is different, and where they're doing the buying is different. It's pretty clear that MCP is becoming the default way that any single service, Stripe or GitHub or Notion, talks to an LLM.
We close the conversation covering fun Stripe data about the incredible rise of this generation of AI startups. They are monetizing faster than any previous generation of startups that we've seen. Those that already hit $30 million in annualized revenue got there in about a year and a half. For comparison, the fastest-growing SaaS startups on Stripe took five and a half years to hit that same mark.
We're living in an era where AI is increasingly rewriting commerce, money movement, and risk, and this episode is a great way to make sense of where the world is going. Please enjoy this terrific conversation with Emily Glassberg Sands.
Emily, welcome. Thanks for spending time with us.
Delighted to be here. Thanks for having me.

All right. So, everyone in tech obviously knows Stripe, which is a monster of a company, but maybe for context, what is the latest and greatest way of describing the full breadth of what the company does, and maybe the latest stats?
Well, Stripe builds programmable financial infrastructure. Put less buzzword-y: we are giving any business, whether it's a 20-year-old selling a Figma template or, now, more than half of the Fortune 100, the rails and the intelligence to move money online and to grow faster. You asked about the numbers: last year companies processed about $1.4 trillion on Stripe. To put that in perspective, because it's a lot of zeros, that's about 1.3% of global GDP. And that number grew 38% year-over-year, in what many experienced as a rocky macro climate. Stripe's network handles on average about 50,000 new transactions every minute. Those are the transactions that add up to $1.4 trillion in payments volume processed annually. And every one of those transactions is training data for some of the AI systems that we'll talk about today. I'll just say, because of that flywheel, Stripe is no longer the payments API. If we were talking 10 years ago, we'd be talking about a payments company, but in practice we're now optimizing the entire payments life cycle: the checkout user experience, fraud prevention, bank routing, automatic card-update retries, even how you handle disputes as a business. And that's all in service of merchants' profits, right? Growing their revenue and reducing their costs. So I think of the tools we're creating as generating a structural tailwind for the internet economy, for growth in any environment. And we're already seeing it: businesses on Stripe grew seven times faster last year than the S&P 500. That infrastructure creating a structural tailwind for growth is our primary focus.
Amazing. All right, so we're going to unpack some of this. Before we do that: you are Head of Information at Stripe. What does that mean? What does your remit cover?
Yeah, our information org is really focused on three things. One is how we use data effectively, and that's end to end: how we do the data engineering and analytics and internal science, and how we build ML-powered applications for our users. The second thing the information org works on is growth and the self-serve business. Millions of businesses run on Stripe, and the vast, vast majority of them, and almost all of the SMBs and startups, get going directly in our product. So building that product-led growth front-door experience for users is our second focus area. And the third thing we work on is experimental projects, which, you know, I have mixed feelings about as a name, because I think innovation and experimentation are so important and can and should and do happen everywhere. But the concept of an experimental projects team is really just having a couple dozen standout engineers and PMs who can run ahead at really big, perishable, meaty opportunities that we couldn't easily staff from within any of our current product verticals. So information is data, self-serve, and experimental projects.
Very cool. Experimental projects sounds like a very fun job for the right person. And you came from the data science world, right? You were at Coursera before this, and Harvard. Maybe walk us through your journey and why you chose Stripe.
I think I've kind of always chased puzzles where better data, better understanding, unlocks outsized social impact. That's what drew me into academia. At Harvard, I was an econ PhD and ran a bunch of field experiments that exposed hidden frictions, right? Like why do referrals dominate hiring? Why are female playwrights so underproduced? And I got a lot of pleasure from seeing policy shift, decision-making shift, incentives shift once the evidence was clear. Going to Coursera for me was really about translating that impulse into product. It was 2014. I was in my fourth year of the PhD program, I graduated a little bit early, so I was coming up on graduation. And I said, hey, where do I think this obsession with better data unlocking outsized social impact is most going to matter? Is it going to be in writing papers, or is it going to be in diving into, in this case, edtech? Coursera was super small at the time, less than 40 folks, but what it turned into was AI-driven learning paths and skills-based hiring tools that opened opportunity for tens of millions, eventually hundreds of millions, of learners around the globe. I was there about eight years, and the transition to Stripe is really the same mission at economic scale. Stripe is about equalizing access to creating a company and reaching customers globally for businesses everywhere. And then, you know, I'm an economist by training, so I care a lot about incentives, and a thing that struck me from my first conversation with Patrick was how aligned incentives are between what Stripe wants and what the businesses running on Stripe want. If a coffee roaster in Berlin sells more, Stripe grows and so does the internet GDP. And so the ability to build and ship any product that makes a business more successful, without really needing to worry about first-order monetization of that product, because in most cases we already sit on monetization of the payments infrastructure, was just really exciting for me, kind of kid in a candy shop, and that's all manifested over the last almost four years. The only other thing I'll add about the Stripe pull is that the data set here is kind of like a macro MRI. It's a real-time image of the global economy that we can then actually action and improve. And, you know, that's a little bit of economist catnip.
Awesome. All right, so the big news that you announced a few weeks ago now is the launch of your own foundation model, which I find fascinating in so many ways, including, for starters, the fact that if you listen to the general zeitgeist on Twitter or on AI panels, a lot of people say it's a silly idea to create your own foundation model these days, because the large general foundation models will do all things for all people. So it's interesting to start from that perspective. Maybe walk us through the thinking of experimenting with the idea of a foundation model and then launching it.
We've all seen, and are all experiencing, this explosion of impact from foundation models that are trained on broad data and that can then be adapted for a bunch of downstream tasks, right? So GPT for language, or diffusion for images, or TimeGPT for time series. And in each case the trick is kind of the same: there's a transformer, it soaks up incredibly diverse data, it learns a dense embedding space, and then later you fine-tune or prompt it for whatever job you need. To your push earlier, I think if you're doing a pretty standard image thing or a pretty standard language thing, you should for sure use out-of-the-box LLMs with some prompting or some fine-tuning. And maybe we'll talk later about the AI economy that we're seeing, but there is just a wealth of really cool applied AI companies solving vertical problems that start out as pretty simple wrappers. And "wrappers" is sometimes said in a derogatory way, which I think actually misses the point: these businesses are bringing real context and real relationships and real incremental data to build that differentiated wrapper product experience. But I totally agree with the general sentiment that for many or most businesses, and certainly many or most startups, who don't have access to any kind of proprietary data, you should start with out-of-the-box LLMs. Stripe is a little bit different, right? We have really differentiated data, data at the scale I was talking about earlier, $1.4 trillion a year in payments volume flowing through us. OpenAI doesn't have that data. Anthropic doesn't have that data. And it's a pretty different problem, in some ways, not in all ways, than a language problem, and certainly quite different than an image problem.
And this isn't our first time putting that data to use. It's been well over a decade at Stripe that we've relied on specialized ML systems, right? We have Radar for fraud, we have Adaptive Acceptance for soft declines. But each of those models is a narrow, single-task model, and each of those models historically only saw kind of a sliver of reality. So last year we were stepping back and looking at what foundation models can do, and recognizing that we're logging tens of billions of transactions. And at that density, payments, while a different problem than language, start to look like language in some ways. There's an agreed-upon syntax, right? There's the BIN and the MCC and the amount. There are some longer-range semantics, like: is this device reuse? What's the merchant history? Where is it in the card life cycle? In a similar way to how language transformers learn an embedding space where words with similar meanings cluster together, we thought, hey, intuitively, at our scale and given how payments data is structured, we could probably learn payments embeddings as well. Or it's at least worth a shot.
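To make that idea concrete, here is a minimal, hypothetical sketch, not Stripe's actual tokenizer or architecture: treat each charge as a token, group charges into short histories, and run them through a small transformer encoder so that every charge ends up with a dense vector shaped by the charges around it.

```python
# Toy sketch of "payments embeddings" (illustrative only, not Stripe's model).
import torch
import torch.nn as nn

VOCAB_SIZE = 50_000   # hypothetical tokenizer over fields like BIN, MCC, amount bucket
EMBED_DIM = 128
MAX_CHARGES = 32      # a short charge history, e.g., everything the same card did recently

class ChargeSequenceEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        # In reality each charge would expand to many tokens; here one token stands in for one charge.
        self.token_embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        layer = nn.TransformerEncoderLayer(d_model=EMBED_DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, charge_tokens):
        # charge_tokens: (batch, seq_len) of tokenized charges
        x = self.token_embed(charge_tokens)
        return self.encoder(x)   # (batch, seq_len, EMBED_DIM): one dense vector per charge

model = ChargeSequenceEncoder()
fake_histories = torch.randint(0, VOCAB_SIZE, (4, MAX_CHARGES))  # 4 made-up charge histories
charge_vectors = model(fake_histories)
print(charge_vectors.shape)  # torch.Size([4, 32, 128])
```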
Yeah, and just to double-click on this since you're on the topic: that's one of the things I find particularly interesting about the idea of creating this foundation model. As you said, in credit card data there is a lot that looks like language, but equally there is a lot that looks very different, right? The data is presumably sparser, and there's no grammar to it the way you would find in language or code. So I'm curious how you thought about those two sides, that heterogeneity of the data.
I would say the thing that's most interesting to me about the analogy between language and payments is that in language, words have a meaning in relation to the other words around them. And in much the same way, a payment has a meaning in relation to the other payments around it. So with our foundation model, what we're really asking is: what if every charge got its own vector in a shared embedding space? And then as each new charge comes in, you place it in that many-dimensional space and understand where it sits in relation to, for example, a known card-testing attack, or known fraud, or a known merchant issue. The other thing I'll note about learning these embeddings is that it doesn't require any labels, right? It's fully unsupervised. Jumping back to the specialized models, fraud, disputes, those work because of the labels, but being able to do a fully unsupervised approach means you can actually use all of the tens of billions of transactions. You can adopt it at very large scale; you don't have to constrain to the subsets of data where you have relevant labels. So I guess the simple description of why a payments foundation model has turned out to work is: how much data can we learn from, so literally all of Stripe's history, not just some task-specific subset; how richly we learn, so these very dense embeddings capture subtle interactions and similarities among charges that manual features or counter features will totally miss; and then the third, which is more operational but I think it matters given the pace of AI, is how efficiently we can build. We now have these shared embeddings. They're available in Shepherd, which is our shared feature store that we co-built with Airbnb and have open sourced under the name Chronon. It makes spinning up a new model a weekend project, not a quarter project, because you get these embeddings out of the box.
One aspect that I find particularly fascinating is that tension between traditional machine learning and generative AI slash foundation models. My takeaway from spending a lot of time in the space is that the end result of the current phase we're in is more of an ensemble approach, where you have foundation models for certain things and traditional machine learning models for other things, typically stuff that fits a bit more precisely into rows and columns. What I'm getting a sense of in this discussion is that effectively the foundation model just outperformed what traditional machine learning models were supposed to be best at, to the point that the foundation model would replace the machine learning models. Is that the right impression, or am I jumping to conclusions?

So yes, and I think we will get to a point where it fully replaces them. Today it is, as you put it, an ensemble, but it's an even more nuanced ensemble, in that it's an ensemble within a problem space. Take the example of card testing. Card testing is when a fraudster is trying to find cards that work, either so they can use them later for fraudulent purchases or so they can sell them to other fraudsters to use. There are labeled examples of card testing. There are traditional machine learning models that Stripe has, and has invested in substantially, to identify and block card testing. But there are important slices of card testing that traditional methods just literally can't see. If you think about a global online retailer, they might see hundreds of thousands of legitimate purchases in an hour. Fraudsters might slip in a few hundred 37-cent authorizations, way too dilute for any of your traditional models to catch. The foundation model is basically watching the sequences the way you watch frames in a movie, right? So it sees 200 near-identical requests, same low-entropy user agent, maybe rotating the proxy IPs, maybe spaced 40 seconds apart, and they light up this red island that denotes card testing and can get blocked. What's unique about that is that the number of clusters can be very large, there are a lot of different card-testing attacks that can be happening, but the number of labels needed to correctly classify a cluster is actually quite small. If the cluster is tight enough, you really just have to know that there's some evidence of card testing in it to know that the whole cluster is card testing. And given the size of the Stripe network, we can find labels for even very small clusters, which is what boosts our recall. So in this case, we ensembled the existing traditional card-testing models together with this classifier classifying sequences of these foundation model embeddings, and our detection rate on large merchants went from 59% to 97%. So will we move to a world where eventually all card testing is detected by the foundation model? Maybe. But what's more interesting to us right now is solving the problems that couldn't previously be solved.
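A toy illustration of that few-labels, many-clusters idea, with made-up numbers and synthetic vectors rather than real Stripe embeddings: cluster the charge embeddings, then let a handful of confirmed card-testing labels flag every member of a tight cluster.

```python
# Illustrative only: propagate a few known card-testing labels to whole tight clusters.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)

# Stand-ins for foundation-model embeddings: 500 ordinary charges spread out,
# plus 200 near-identical 37-cent authorizations (the dilute attack).
normal = rng.normal(0.0, 1.0, size=(500, 128))
attack = rng.normal(5.0, 0.05, size=(200, 128))
embeddings = np.vstack([normal, attack])

# A tiny label set: only 3 of the attack charges were ever confirmed as card testing.
known_card_testing = {510, 600, 650}

clusters = DBSCAN(eps=1.0, min_samples=10).fit_predict(embeddings)

flagged = set()
for cluster_id in set(clusters) - {-1}:          # -1 is DBSCAN's noise label
    members = np.where(clusters == cluster_id)[0]
    if known_card_testing & set(members):        # any confirmed label inside the cluster...
        flagged.update(members)                  # ...flags the whole tight cluster

print(f"flagged {len(flagged)} charges from only {len(known_card_testing)} labels")
```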
So how does one go about building a foundation model? Walk us through the history of this: when you started thinking about it, what you did next, and what team does it.
Yeah. Well, first of all, our first instinct was actually full-on wrong, which was: let's just throw bigger transformers at single payments. I said earlier that what's interesting about payments, similar to language, is that words only matter in relation to the words around them, and payments only matter in relation to the payments around them. But that actually wasn't ex ante obvious to us. A lone payment record is, as you mentioned, kind of sparse. It's also kind of boilerplate, and after something like a billion tokens the loss curve kind of flattened, right? Scaling wider wasn't going to be the answer. So we had to change the question, and instead of treating a payment as an isolated atom, we stitched charges together into these short histories, represented as sequences. There are lots of different types of sequences, but: everything the same card did in the past few minutes, everything that flowed through the same device on a Friday night, everything that this merchant's new BIN saw during some pre-sale frenzy. And the moment we trained on sequences, the model had fresh signal to learn from and the curve started dropping again. The backbone we ended up with is a BERT encoder. By the way, we also tried decoder-only model architectures like GPT, but BERT is just better for understanding tasks, right? What we're really trying to generate is the embedding, the understanding of the payment, which we then put in relation to other payments. GPT is better for generation, but we're not actually trying to generate in the first stage. So it's all based on BERT versus GPT.

Fascinating.

Yeah, it's a BERT encoder. And you asked who did the work: we originally had just three MLEs who we put in a little bubble. They'd worked on risk-related problems in previous instantiations of their careers at Stripe, but we put them in a little bubble and said: think about the broad set of problems Stripe faces that might be solved by a foundation model, choose a couple of steel threads, and go see how much progress you can make against those steel threads. These folks were protected from day-to-day operational load, protected from incidents, weren't running any production-grade systems at the time, and it really operated more like a research team.
Are they part of that experimental group that you mentioned up front?
It actually wasn't, because the experimental group has only been around about a year and a half now, and we started this shortly before that. But it's the same concept, right? They just don't happen to report into that; they report into our ML foundations team. Structurally it's the same idea, and it was actually part of the motivation for then scaling up experimental projects.

And because it's BERT-based, was that less of a massive compute, data-crunching effort, or was it still intense?
Less of one, yes, and still intense, yes. It definitely wasn't all smooth on the infrastructure side. We had to build a custom tokenizer and optimize it for Stripe events. We had to scale our data pipelines to the very large data sizes I mentioned earlier; previous models just hadn't trained on such large amounts of unstructured data all at once. We also had to build custom data loaders to make sure that GPU utilization was high. Earlier versions actually resulted in pretty low GPU utilization; the data loaders became the bottleneck, and yes, that made training more expensive, but it also made it slower. This was something bigger than we'd trained before. We had to add a bunch of checkpoints to make our runs more robust to intermittent failure, the kind of stuff you would be doing anyway if you were an AI lab, but we are not first and foremost an AI lab. So those were all progressive builds for us.
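The fixes she describes are the standard remedies for a data-loader bottleneck. A generic PyTorch sketch, assuming a pre-tokenized dataset and not reflecting Stripe's actual pipeline, looks roughly like this:

```python
# Generic illustration of keeping GPUs fed: parallel workers, pinned memory, prefetching.
import torch
from torch.utils.data import DataLoader, Dataset

class TokenizedChargeDataset(Dataset):
    """Stand-in dataset: pretend each item is a pre-tokenized charge sequence."""
    def __init__(self, num_sequences=10_000, seq_len=32):
        self.num_sequences = num_sequences
        self.seq_len = seq_len

    def __len__(self):
        return self.num_sequences

    def __getitem__(self, idx):
        return torch.randint(0, 50_000, (self.seq_len,))

if __name__ == "__main__":
    loader = DataLoader(
        TokenizedChargeDataset(),
        batch_size=256,
        shuffle=True,
        num_workers=8,           # overlap IO/tokenization with training
        pin_memory=True,         # faster host-to-GPU copies
        prefetch_factor=4,       # each worker keeps batches queued ahead of the GPU
        persistent_workers=True,
    )
    for batch in loader:
        pass  # training step would go here, e.g., batch.to("cuda", non_blocking=True)
```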
Any other bottlenecks, or parts that felt harder than they should have been, whether that was data quality or any other part?
When it came time to actually run the model in shadow (we run all of our ML in shadow before we roll it out), running in shadow was relatively straightforward. The first experiment we ran in production, though, had a bunch of latency and reliability requirements that put pressure on some of our systems. As you can imagine, these decisions have to be made in the charge path, so in real time you have maybe dozens of milliseconds to make the decision. And part of the reason we were totally happy to start with this kind of ensemble model is that we had a full fallback to the existing model in cases where we couldn't meet the latency requirements. But yes, plenty learned in the journey.
How do you think about transparency? In the world of financial data, given the absolute mission criticality of what you do, and also from a regulatory standpoint, the concept of quote-unquote black-box AI may be something that people raise an eyebrow about. How do you think about transparency and explainability?
My first reaction to that is that LLMs are actually getting quite good at explainability, right? So to the extent that the model is seeing patterns, even patterns that humans couldn't enumerate, an LLM on top can say something like: high-velocity CVC mismatches on a new device are the explainable reason, the summary of this cluster. But I really do think of all of these defenses as a two-step dance. There will always be room for rules. Rules provide speed; rules provide clarity. We ultimately put our users in the driver's seat. Users can write Radar rules. They can say: never accept first-time cards from this country over $1,000. And about a year and a half ago we released a tool called Radar Assistant that lets them type that in plain English and test it and ship it instantly, without even having to write code. But then the models are really needed for nuance, right? For seeing the patterns that humans can't. And when they conflict, historically the rule won: merchants keep ultimate veto power. But a few weeks ago we actually updated our systems to blend the two even better. We call it dynamic risk-based rules, and how it works is that instead of the user writing a brute-force rule, like block every CVC mismatch or every postal mismatch, the rule can be blended with the model. So: block every CVC mismatch if the real-time model or the issuer score calls it risky beyond some threshold. What that allows is the best of both worlds, right? There's always some good customer who fat-fingered, and they should be able to get through, but the sketchy traffic is still stopped. So I don't think transparency or explainability is yet 100% there. I think we will continue to use rules and models in parallel. And then there are of course just engineering and logging best practices around making sure you're storing the features that were used by the model and the model output, so that ex post, if a user or a regulator comes and wants to understand what drove the decision beyond what you've logged, you can always reconstruct that cleanly.
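A toy version of a dynamic risk-based rule, with invented field names rather than Radar's real rule syntax, might look like this:

```python
# Illustrative only: a brute-force rule versus one blended with a real-time model score.
RISK_THRESHOLD = 0.85

def brute_force_rule(charge):
    # the old style: block every CVC mismatch, full stop
    return charge["cvc_check"] == "mismatch"

def dynamic_risk_based_rule(charge, model_score):
    # the blended style: block a CVC mismatch only if the model also scores it as risky
    return charge["cvc_check"] == "mismatch" and model_score >= RISK_THRESHOLD

fat_fingered_customer = {"cvc_check": "mismatch", "amount_cents": 4200}
print(brute_force_rule(fat_fingered_customer))               # True  -> good customer blocked
print(dynamic_risk_based_rule(fat_fingered_customer, 0.12))  # False -> the fat-finger gets through
print(dynamic_risk_based_rule(fat_fingered_customer, 0.97))  # True  -> sketchy traffic still stopped
```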
You mentioned Radar and the long history Stripe has had building it. I'm curious how you think about where to deploy machine learning and AI across products. Obviously we are in that moment in tech where everybody wants to do problem X plus AI equals magic. But I think it would be very interesting for people to hear how somebody like you, at the very edge of the space, thinks about: okay, this is a problem for AI, and this is a problem where AI should actually not be included at all.
You know, there's so much enthusiasm about the latest models and the latest methods, and I think it's really easy to start with what the models and the methods can do and then try to come up with a product from that. We like to start at the opposite end of the spectrum, which is the simple business test: what is the user pain that we are hearing or seeing? What metric best captures that user pain? And if we were to build an AI or ML solution that nudges that metric by even a single percentage point, and at Stripe scale a single percentage point of improvement is a lot of money back to the businesses that run on us and to the internet economy, does moving that metric matter? So it starts with the user pain and the business need. Then we look at the data. It has to be plentiful. It has to be already flowing through Stripe's pipes. That doesn't mean we can't think expansively about what other data we'd like to be collecting over time, but you're not going to turn on an AI solution today if you don't have the data. And it has to either be amenable to unsupervised approaches, or we have to be able to label it well enough that the model can learn. Is the data there, and is it structured in a way that's useful? And then finally, we like to ask whether Stripe has a built-in advantage. Is this something we can do uniquely well because of our network? And that usually comes down to the shape of the data that enables it, and the fact that we have that data in a way that other people may not.
A recent example that might bring that to life a little more is our Smart Disputes product, which we announced just a few weeks back. So, chargebacks. Let's start with the user pain, the business need. Chargebacks are really painful. Merchants lose about $55 billion a year to chargebacks, and fighting disputes is also really costly for the business. Fighting a single dispute can mean putting together a 12-page evidence packet: digging up receipts, looking at IP logs, tracking down delivery confirmations, pasting everything into this dozen-page PDF. Most businesses only bother with the biggest-ticket items, and lean teams, which includes basically all of the startups out there, rarely bother at all. It's just not worth their time, and they don't have the expertise in house. I was talking to a friend of mine the other day who runs a jobs marketplace, and she's one of the few marketplaces that monetizes off of the job seeker instead of monetizing off of the employer. She's just getting crushed by disputes, and she told me, "Hey, Emily, it's crazy that these people are disputing, because they're saying that they never used my service, but they've literally uploaded their resume. Nobody else has their resume. Nobody else benefits from uploading their resume." It's called friendly fraud, but that's kind of a misnomer because it's not friendly. And she literally doesn't fight them. She has all the evidence, but she doesn't fight them, and if you ask her, she's like, "It's just not worth my time to put together these crazy packets." Okay, so a small improvement in dispute win rates would translate into hundreds of millions of dollars across the Stripe network. That satisfies the first bar: there's a real user pain and a real business opportunity here. Then the second is: do we have the data? Well, we already see which disputes are being won and lost, of those that are being fought. We already store most of the data an issuer would want to see when it decides a chargeback. So it's a great candidate, which is why we launched Smart Disputes. It's basically just a classifier that grades every incoming chargeback as it comes through on its likelihood of success. And if the model thinks the merchant can win, we overlay this LLM-powered agent that goes out and gathers the right proof, like IP address matches for digital services, screenshots of the usage, whatever the issuer historically prefers, and then it bundles that evidence into the format the bank expects and files the response without any human having to touch the case. And then of course it watches the ruling and feeds the outcome back into training, so it keeps getting smarter. Vimeo and Squarespace were our two first adopters, and they're recovering 13% more revenue on disputed charges from adopting it. And they're doing that with zero extra labor. You literally don't even have to click a button; you just toggle it on once. And the impact is even greater for tiny merchants who never used to contest chargebacks at all; they now have this AI paralegal working for them. So you weren't asking about Smart Disputes, you were asking about the mental model, but it's basically: big user pain, abundant Stripe-only data, a clear model-driven fix. That's how we decide where the next place Stripe AI should go is.
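A hypothetical sketch of that flow, with placeholder helpers rather than Stripe's implementation: grade the incoming chargeback, and only for likely wins assemble and file the evidence packet.

```python
# Illustrative pipeline for the Smart Disputes idea (invented helpers, not Stripe's code).
WIN_THRESHOLD = 0.6

def predict_win_probability(dispute):
    # placeholder for the classifier trained on past won/lost disputes
    return 0.9 if dispute.get("evidence_available") else 0.2

def gather_evidence(dispute):
    # placeholder for the LLM-powered agent that pulls the proof the issuer prefers
    return {"ip_match": True, "usage_log": "resume uploaded 2024-03-01", "receipt": "..."}

def submit_response(dispute_id, evidence):
    # placeholder: bundle evidence into the issuer's expected format and file it
    print(f"filed response for {dispute_id} with {len(evidence)} pieces of evidence")

def handle_chargeback(dispute):
    if predict_win_probability(dispute) < WIN_THRESHOLD:
        return "not_contested"                       # not worth auto-filing
    submit_response(dispute["id"], gather_evidence(dispute))
    return "contested"                               # the outcome later feeds back into training

print(handle_chargeback({"id": "dp_123", "evidence_available": True}))
```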
A lot of what we've talked about so far has had to do with stopping bad things from happening: fraud, card testing, illegal chargebacks. Are there examples where you use AI to generate revenue? I guess the example you just mentioned does generate increased revenue, but whether that's smarter routing or faster checkout, any of those things?
For sure. And by the way, fraud done well also generates revenue, in the sense that the alternative is usually doing fraud prevention poorly, which means a bunch of false positives, which means you're blocking some good users. The way we think about it is that we use AI across every stage of the payments life cycle, from the second a customer lands on the checkout page all the way through to handling refunds and disputes. And if you think about that life cycle, there are five meaty steps: there's checkout, there's authentication, there's fraud detection, which is where we've spent most of our time talking, there's authorization, and then there's the downstream of events like refunds and disputes. Checkout is the easiest for you, or pre-Stripe me, to reason about, because we all experience it as consumers. And I think we can all agree that checkout experiences feel pretty staid and inefficient. No matter who you are, no matter where you're shopping from, no matter how you like to pay, you usually get roughly the same old form. It doesn't adapt. It doesn't know you. And a lot of times that's all it takes for a customer to drop off at the finish line. Some of that is little stuff, but some of that is big stuff. Like, if I only have an Amex on me and Amex isn't shown, I literally would have to text my husband to get a Visa card. And if I'm in another country and have no access to any of the payment methods that are listed, then you've basically shut off my market entirely. So we've been working a lot on fixing that.
In checkout, AI is our magic wand. We call it Stripe's optimized checkout suite, and it's about making the checkout experience increasingly personalized for our users' customers, dynamically tailoring that experience to each of the end users, again our users' users, in real time. So Turo, maybe you've used it, the world's largest car-sharing marketplace: they moved over to our checkout suite and saw a 5% increase in recaptured revenue, which for them was, I think, a hundred-some million dollars a year. Payment methods are a really interesting subcomponent of checkout. There has been a proliferation of payment methods in the world, which from a market-efficiency perspective is probably a great thing. Stripe now supports well over a hundred payment methods: Apple Pay, iDEAL, buy now pay later, and so on. In the optimized checkout suite, more payment methods is better for businesses, because it comes out of the box and they can reach more customers with what they need. But actually showing more payment methods to their customers is suboptimal, because people get choice anxiety: if they don't see what they need in the first three, they give up. So we provide all these payment methods, but then we automatically surface the most relevant ones based on who the customer is and what they're buying. And it works: businesses that show at least one relevant payment method beyond just cards see a 12% increase in revenue and more than a 7% lift in conversion. So conversion goes up and the size of the transaction goes up, and that's a really big deal for something as small as the order of buttons on a screen. So that's checkout.
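As a toy illustration of that ordering problem, with made-up relevance scores rather than Stripe's optimized checkout suite: support many methods, but surface only the few most relevant for this particular buyer.

```python
# Illustrative only: rank supported payment methods by per-buyer relevance and show the top few.
def rank_payment_methods(supported, relevance_scores, top_n=3):
    """Return the top_n methods for this buyer, most relevant first."""
    return sorted(supported, key=lambda m: relevance_scores.get(m, 0.0), reverse=True)[:top_n]

supported_methods = ["card", "apple_pay", "ideal", "klarna", "alipay", "sepa_debit"]

# Hypothetical per-buyer scores, e.g., derived from geography, device, and purchase history.
dutch_shopper = {"ideal": 0.92, "card": 0.60, "klarna": 0.41, "apple_pay": 0.33}

print(rank_payment_methods(supported_methods, dutch_shopper))
# ['ideal', 'card', 'klarna']  -> the relevant method shows up in the first three
```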
Can we nerd out on data infra for a few minutes? I'd love to talk about lessons learned operating data infrastructure specifically for data science, machine learning, and AI at this scale. What tools do you use, what worked, what didn't, any lessons around scaling and operating at that level?
We use ML infrastructure that we've developed over time at Stripe, and it relies on open source where available and on third-party buy solutions where it's not differentiated for us and where there's a third party that meets our reliability, latency, and cost considerations. So for example, the data scientists and MLEs, and even some of the software engineers here, use notebooks for experimentation; we use Databricks notebooks. We use Flyte for orchestrating our training runs. We use NVIDIA GPUs and PyTorch for model training. Feature computation, including those LLM embeddings, and feature serving are done in Shepherd, which again we built in partnership with Airbnb and have since open sourced under the name Chronon. Shepherd is new for us; we actually just completed the full migration to Shepherd a month and a half ago, and that migration took on the order of about six months. One lesson learned is to really make sure we're investing sufficiently in the horizontal infrastructure layer, so that individual product teams snap to the same infrastructure, versus allowing their golden workflows to diverge and everyone spinning up their own. Transparently, the original feature computation and serving system we built, which was called Semblance, had a number of limitations. It was pretty hard to develop on, and as a result one of our largest machine learning groups at Stripe decided to fork off and buy Tecton, a third-party solution. We couldn't adopt Tecton across Stripe because Tecton was only useful for batch solutions and didn't meet the latency and reliability requirements of the charge path. So you could use it, for example, to score merchant risk at onboarding, because you have a couple of minutes to make that decision, but you couldn't use it to score a charge, because you have tens of milliseconds to make that decision. We ended up in this fractured world, which led to all sorts of issues. For instance, one of the most valuable signals for understanding whether a merchant is fraudulent is looking at the transactions happening on that merchant, because there are certain patterns of transactions: oh, many of your buyers are from the same IP, or there's a big jump in prices, you used to be selling everything at $2 and suddenly you're selling everything at $2,000. That in and of itself indicates that the merchant is fraudulent. And those features couldn't be shared because we were bifurcated. Plus, just from an investment perspective, you basically have mini ML infra teams within the applied teams operating inefficiently. So we brought all that together under Shepherd. It was a bit of a long journey, but it was definitely worth doing. And then we put enough work into it that we decided we should just open source it and make sure other people can build on it as well.
You obviously have a key real-time aspect to what you do; you need to detect fraud in real time. Is there a specific way that translates into the infrastructure and tools you use for that real-time component?
It's a combination of the latency requirement and the reliability requirement, right? We run on five nines of reliability. You can't have downtime, and that's not just downtime of the core payments APIs; downtime of the Radar API is super costly to the businesses that run on us. So the SLAs needed for us to be able to buy are quite high. There are also pretty stringent security requirements. There are often new startups, less so on the infrastructure side and more so on the applied side, who we would love to buy from or partner with, but they don't have the security protocols and controls in place for us to feel comfortable operating in their stacks. So I do think the nature of what we are doing, yes the timeliness requirements, but also the reliability and security requirements, pushes us, and I'm a big proponent of only building where you have a core competitive advantage, but on the margin for ML infra it does push us a little bit more toward build than we would have gone in other contexts.
All right, let's switch to the rise of agentic commerce. Obviously agentic is one of the big words of the last year or so. How do you all envision this? Do you view autonomous shopping agents as a part of that future? And where do you fit?
Well, reasoning models are on the rise, and with that, AI is no longer just about getting answers to your questions, right? It's starting to do things for you. I think most individuals first felt that, our individual aha moment, maybe with the shift from ChatGPT to Operator: from answering questions to going out and executing tasks in a browser. That shift from knowing to doing is a big deal, and I think one of the earliest places we're seeing it change things is commerce. We've all seen those cool demos of agents buying stuff for people. At Stripe, we started leaning into this about a year ago, and back in November we launched a toolkit that makes it easy for agents to transact on someone's behalf. So, I like coffee. I drink a lot of coffee; you might be able to tell by the pace of speaking. There's this barista agent out there today: you tell it what kind of coffee you like, and it just scours the internet for the best beans and buys them for you. What's interesting about the barista agent is that it's not a traditional coffee shop. It doesn't own any of the inventory. It is literally just doing the discovery and matching, plus, I'll talk a little bit about the payments flows, and that is the entirety of the app. I think that's just a glimpse of how who is doing the buying is starting to shift: agents are buying on behalf of humans. And then there's another big shift happening in parallel, and by the way, both of these are early, but given the pace at which we're seeing things change, we'll probably move pretty quickly here, which is where the buying happens. More people and more businesses are spending time inside AI tools, and with that, product discovery and browsing and now even buying are starting to happen in those tools. So Perplexity: you may have seen that they recently launched hotel discovery and booking in the app, and it's powered by Stripe. But unlike most hotel discovery and booking surfaces you might think of, you're not linked out to a merchant website. You aren't taken to separate checkouts. You stay within the Perplexity app. And I think that kind of in-situ commerce is really interesting. We're also working with Hipcamp; it's summer season, so maybe a good time to mention this. They use agents to book campsites at state or national parks on the camper's behalf, even off platform; the agent goes and completes the booking. They do it really safely, with these virtual cards in terms of the money flow, and it gives campers access to sites that aren't normally all bookable in one place.
So behind the scenes, how does that translate into requirements, whether that's speed, data formats, authentication, the checkout experience, that require you, or not, to change the way Stripe works?

In the early days, the biggest change is around the money flows. But I would caveat, because people get really jumpy about "an agent buying for me, that sounds super scary." I'd argue that in practice, agents have actually been buying for us for years; they were just human agents, right? When I order my salad from DoorDash, DoorDash charges my credit card and then issues a single-use virtual card to the driver. The driver is my human agent who goes and buys the salad on my behalf, and they can only buy at Sweetgreen, and they can only buy for $25, and they can only buy in this two-hour window in my town. It is very controlled, and that single-use virtual card in the DoorDash case happens to be powered by Stripe. So what we're doing here, in the first, most simple iteration: your mental model should be to swap out the human agent for an AI agent. That's how the barista agent works, right? It's just using a single-use card from Stripe Issuing to make the purchase, just like the DoorDash driver does. So the transaction is controlled and your data stays safe. Now, I don't think that will be the only mechanism for agentic commerce or the limit to what gets done, but it is the first instantiation that we're seeing.
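A plain-Python sketch of the constraints such a single-use virtual card carries, not the Stripe Issuing API itself: one merchant, a spend cap, a short window, and no replay after use.

```python
# Illustrative only: the shape of a scoped, single-use virtual card and its authorization check.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class SingleUseVirtualCard:
    allowed_merchant: str
    max_amount_cents: int
    expires_at: datetime
    used: bool = False

    def authorize(self, merchant, amount_cents, now):
        ok = (
            not self.used
            and merchant == self.allowed_merchant
            and amount_cents <= self.max_amount_cents
            and now <= self.expires_at
        )
        if ok:
            self.used = True   # single use: the credential evaporates after one purchase
        return ok

# The salad run: $25 cap, one merchant, two-hour window. Same shape whether the agent is human or AI.
card = SingleUseVirtualCard("sweetgreen", 2500, datetime.now() + timedelta(hours=2))
print(card.authorize("sweetgreen", 1875, datetime.now()))   # True  -> the agent buys the salad
print(card.authorize("sweetgreen", 1875, datetime.now()))   # False -> can't be replayed
```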
Is this just a difference in money movement, or replicating human-agent money movement with machine agents?

The other thing that's interesting: those were all B2C, consumer examples. But just as you and I are spending a bunch more time in ChatGPT or Perplexity or whatever we like to use, developers are spending a lot more time in Cursor and various AI dev tools to code faster. So another example of agentic commerce, which maybe isn't the first thing that comes to mind for people: you're in Cursor, you're building your product, and you want to set up some front-end thing, bot protection, whatever, something like Vercel. Normally, what do you do? You stop coding, open a new tab, go to Vercel, sign up, get your API keys, bring them back to Cursor. Total context switch. But now you can just buy Vercel from inside Cursor, right there in the code editor, so you don't break your flow. It saves time for the user, and it also creates a whole new channel for Vercel and Cursor to sell software directly, right where the work is happening. I mentioned in-situ commerce for the consumer, but this is in-situ commerce for the developer, or B2B, and Stripe enables those transactions too. So I just think there's a brave new world of agentic commerce, and who's doing the buying is different and where they're doing the buying is different. But there's a bunch of other stuff that's going to need to evolve too.

And as you push the reasoning further and think of multiple agents that need to coordinate, and everything happening through code, do you then get into a different world where Stripe needs to behave differently?
I don't know if you know Daphne Koller, but she hired me at Coursera; she co-founded Coursera back in the day. And I remember talking to her in the parking lot one night. She was notorious for staying very late, so it was always dark when we talked in the parking lot, and also for driving very quickly, so you really had to get out of the parking lot before she got in her car. But she was saying to me late one night (everyone was so young, half those people ended up married to each other, we were there at all hours), because we were talking about, this was literally 2014, where we needed to evolve the learning and teaching platform to be. I think her words were something like: the first movie was just filming a play on stage, and then you think about whatever today's latest Hollywood release is, and it's this whole set of experiences that are only possible because it's on film. The analogy she was drawing is that the very first MOOCs, literally what we had in 2014, massive open online courseware, were like recording Andrew Ng up at the front of his Stanford classroom, right? But today, companies like Coursera and Khan Academy have actually built learning experiences that are only possible because of the data, because of the technology, because of what you can do through this new medium. And bear with me, but I kind of think it's going to be the same for commerce. The earliest versions of agentic commerce have looked a lot like swapping an AI agent in for a human agent: instead of the DoorDash driver doing it, the barista agent is doing it. And actually, we didn't talk about order intents, but one of the things we're also enabling goes right down to the agent navigating a web browser and filling out the human-optimized checkout form. That feels like a very reasonable place to start, but that's not what agentic commerce will be, right? Imagine you're no longer selling to a person who's scrolling through your site; you're selling to a piece of software that has already read all the reviews, has price-compared the market, and is now in a hurry to tick payments off its list. That's how the AI agent is going to feel, and it's going to buy very differently from you and me. And so, you know, we're still working through a ton of this. A ton of this is yet to be built.
this. A ton of this is yet to be built, but you asked like what's it going to demand of Stripe? And I think sort of some highlevel design principles like the first is just that intent is the
interface, right? like humans click
interface, right? like humans click around. Agents just declare what they
around. Agents just declare what they want. So in Perplexity, a traveler will
want. So in Perplexity, a traveler will type, you know, find me a flight to New York under $300. But Perplexity is going to turn that sentence into like a single JSON blob. It's like origin,
JSON blob. It's like origin, destination, and budget cap. It's going
to like fire that at the seller. And so
every merchant API is probably going to need one canonical kind of intent endpoint that accepts those structured desires um instead of sort of this UI
click uh world that we live in today.
Second, I think it's pretty clear that product data is going to have to be machine readable, right? Like I don't know if you've ever played around with like United's fair database. It is not
perfect for humans. Sometimes it's like intentionally opaque for humans, but it is definitely useless for code. And so I think early adopters who want to sell through agentic channels are going to
need to expose kind of an open product schema like the skew and the inventory and the price and the constraints and you know maybe even the wedge that you're willing to give to the facilitator agent who's facilitating the
commerce like that's not CSS like that's not JavaScript. Um, and then the agent's
not JavaScript. Um, and then the agent's going to be able to run kind of a skew level search and know with like cryptographic certainty, right? Like
flight UA263 for $250 is still available, right? So,
I think that'll change. I think latency budgets are going to shrink to machine time. We talked about latency budgets in
time. We talked about latency budgets in the context of the charge path, but like, you know, people will wait 3 seconds for a spinner. I think an agent's just going to retry somewhere else after a couple hundred
milliseconds. And so, um, it's all going
milliseconds. And so, um, it's all going to have to be like pretty fast. And then
we touched on this briefly, but a ton is going to have to evolve in the risk space. You know, today I think human buyers think of themselves as owning their credentials: I own my card numbers. Credentials are going to have to move from being possessed, being owned, to being permissioned, right? Someone gets a one-time, scope-limited token to spend that $250 on United Airlines before midnight. And that token, you know, can't be replayed at another provider, and it evaporates after use, and it has all sorts of limits, whatever.
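A minimal sketch, assuming a hypothetical token format, of what a scope-limited, single-use credential like that might carry and how a seller could check it; this is not Stripe's actual token design.

```typescript
// Hypothetical scoped payment token: permissioned rather than possessed.
interface ScopedPaymentToken {
  token: string;           // opaque value handed to the agent
  maxAmountUsd: number;    // hard spend ceiling, e.g. 250
  allowedMerchant: string; // e.g. "united-airlines"
  expiresAt: string;       // e.g. tonight at midnight, then it evaporates
  singleUse: boolean;      // replaying it elsewhere should simply fail
}

// The issuer, not the agent, holds the underlying card; the agent only
// ever sees this narrow grant. Verification happens server-side.
function canSpend(t: ScopedPaymentToken, merchant: string, amountUsd: number): boolean {
  return (
    merchant === t.allowedMerchant &&
    amountUsd <= t.maxAmountUsd &&
    new Date() < new Date(t.expiresAt)
  );
}
```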
Trust is going to have to be super programmable. Like, some developer IDE is going to be buying GPUs on behalf of 50 different startups, and we'll want it to attach a verifiable business profile, including a risk score, so the downstream sellers can accept or refuse the purchase. We're going to need a lot more observability. If you take the Hipcamp example, right, its camping bot should be able to book federal park campsites, but it also needs to be able to expose real-time logs so that hosts can reverse anything that looks odd. And then it's probably obvious, but good bots need to be very distinguishable from bad bots. And a lot of the classic fraud tools might mistake a good bot for a bad bot, or just consider any bot to be bad. Speed, data format, auth are all going to change when the buyer's a bot. And I think it's just going to require designing for intent, publishing those structured catalogs, signing and scoping every credential, instrumenting everything, and then we're all going to have to teach the risk stack to tell the good from the bad.
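To make the programmable-trust and observability points concrete, a small sketch of the kind of verifiable agent profile and real-time action log being described; all of these type and field names are hypothetical, not an existing spec.

```typescript
// Hypothetical verifiable profile an agent attaches to a purchase so
// downstream sellers can accept or refuse it programmatically.
interface AgentBusinessProfile {
  agentId: string;   // stable identity for the buying agent
  onBehalfOf: string;// the business (or one of 50 startups) it represents
  riskScore: number; // 0..1, lower is safer, computed by the issuer
  signature: string; // issuer's signature over the fields above
}

// Hypothetical real-time log entry so a host can reverse anything odd.
interface AgentActionLog {
  agentId: string;
  action: "search" | "hold" | "purchase" | "refund";
  sku: string;
  amountUsd: number;
  at: string; // ISO timestamp
}
```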
Fascinating. Where does MCP fit in that picture? MCP being the emerging agent-to-agent protocol, and Stripe was early in setting up your own MCP server. Where does that fit, and any lessons learned from your experimentation with MCP so far?
I think there's two bits. One is our own MCP server, and the other is how we enable MCP payments. They're different, but I think they're both kind of interesting in their own right. On the former: we talked a bunch about commerce-related examples, but there are AI agents out there now, probably a greater number actually than commerce agents, that are helping you run your business. So not the transactional commerce part, but the running-your-business part, doing the boring admin stuff you hate: generating the invoices off of messy spreadsheets, updating cards on file, changing billing plans, analyzing business metrics, doing support stuff. And they're doing that without needing a human, and MCP, the Model Context Protocol, is a critical enabler here, right? So yes, it can be agent to agent, but MCP can also be a translator between LLMs and, you know, SaaS APIs, more deterministic SaaS APIs. And you can think of it as, like, the simplest version: the LLM reads a menu of tools, so for example a menu of Stripe tools, and then when you ask a question, the model picks the right tool and fills in the JSON. And then the MCP server fires, in this case, the actual Stripe call. It's the same principle as a browser hitting a REST endpoint, but the client is a bot instead of a person.
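As a sketch of that menu-of-tools pattern: in MCP a server advertises each tool as a name, a description, and a JSON Schema for its inputs; the model picks a tool and fills in the arguments, and the server executes the underlying call. The tool name and fields below are illustrative, not Stripe's actual MCP server definitions.

```typescript
// Illustrative MCP-style tool definition: name + description + JSON Schema.
const createPaymentLinkTool = {
  name: "create_payment_link", // hypothetical name, not Stripe's actual tool list
  description: "Create a payment link for a given price and quantity.",
  inputSchema: {
    type: "object",
    properties: {
      price: { type: "string", description: "Price ID to charge" },
      quantity: { type: "number" },
    },
    required: ["price", "quantity"],
  },
};

// What a model-initiated call might look like once the tool is chosen.
// The MCP server receives this, fires the real API call (here, a Stripe
// call), and returns the result to the model as structured content.
const toolCall = {
  name: "create_payment_link",
  arguments: { price: "price_123", quantity: 1 }, // placeholder values
};
```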
So what does this actually let you do today? Stripe's MCP server lets you do all the most common, kind of low-risk tasks that you can do on Stripe or through our API, right? To list customers or create customers or find your product prices or spin up a payment link or issue a refund or pull up your balance. All of the kind of boring but essential stuff you do 100 times, right, while you're wiring up your Stripe integration or trying to serve your customers.
One of the most interesting use cases I saw recently was actually Decagon. Are you familiar with them?
Yep.
Yeah. So,
the customer AI company, right?
Customer AI, yes. And in less than one week, one engineer at Decagon built an integration with Stripe through our MCP server that just lets Decagon's customer support agents securely access all of the info for their users, right? So their users' invoicing info and subscription cancellations and whatever.
So now, Decagon's customer support agent can, on behalf of the business, find the business's customers' invoicing info or cancel their subscription or deliver their refund directly from their customers' Stripe accounts. And the first Decagon customer that they released this to reported a 65% drop in support costs. It's kind of striking how much of support is cancel my subscription, give me a refund, explain my invoice, right? Stuff that actually can be done in a fully automated way if you have clean access to your Stripe systems. So, where do we think this is going to go? I mean, it's pretty clear that MCP is becoming the default way that any single service, Stripe or GitHub or Notion, talks to an LLM. And so, naturally, I think MCP also needs to support monetization, which is why we've enabled MCP payments. So you can seamlessly monetize your MCP server using Stripe as well.
We previewed talking about the new AI economy earlier in the conversation. Stripe, for the reasons that you described, has a very unique vantage point into what companies do and their growth and all the things. And you release from time to time really interesting stats, and perhaps we'll put some of those as a link in the show notes. To start at a high level, what do you see that's different in this generation of AI companies from your vantage point?
One of the things, you know, with my economist hat, that I love about working here is just this front-row seat to, hey, what's the growth trajectory of each successive wave of startups? And, you know, the current wave of course is AI. We work with AI companies across the stack. So when I talk about the AI companies on Stripe, you should think of this as everything from infrastructure and modeling to full-blown applications: OpenAI, Anthropic, Perplexity, Cognition, ElevenLabs, Decagon, Sierra, right, and a long tail of others. We recently looked at the Forbes AI 50, and 78% of them are Stripe users. That 78% reflects 100% of the Forbes AI 50 that accept online payments. And, you know, I think there's a lot of hype around AI tech and I think fair
questions around the monetization. And so we took a look at, hey, with this current wave of AI startups, what do we see in their monetization trends and in their growth trajectories? The
long and short of it is that they are monetizing super fast. They are
monetizing faster than any previous generation of startups that we've seen.
We focused, just for concreteness, on the top 100 highest-grossing AI companies on Stripe. And
we asked, okay, for the median in that cohort, how long did it take them to hit various revenue milestones? Um, what did their customer base look like? What did
their monetization strategy look like?
And, you know, those that already hit 30 million in annualized revenue got there in about a year and a half. For comparison, you know, many of us were around 5 years ago. The fastest-growing SaaS startups on Stripe took, you know, 5 and a half years to hit that same mark. So this AI wave is scaling revenue at, you can think of it as, 3x the speed of the SaaS boom. And it's not just the big players. If you look at the newest AI startups, the ones just getting going, they're ramping even faster. You know, of the ones that hit a million, the median gets there in 5 months.
They're earning 4x more in their first year than peers who launched just a couple years earlier. I was at Stripe Tour Paris a week and a half ago and was looking at some of the European breakouts. Lovable, out of Stockholm, hit 50 million ARR in 6 months and is now for sure the fastest-growing startup in Europe. Cursor, which of course we mentioned earlier, helps developers code with AI; they only launched two years ago. They recently announced that they're over 300 million in ARR. So just really astounding growth rates. And it doesn't mean it comes without cost, including inference costs, but this is a real wave of businesses building real value in the market, else they wouldn't be able to monetize it. And they're doing that way faster than we've seen in any previous tech cycle.
You mentioned Paris and Europe. Do
you find that those companies are global earlier in their life as well?
For sure. These AI companies are going global way faster than their predecessors. If you look at that AI 100 group and you ask about the median, the median is in 55 countries in their first year and 80 countries by their second year. And that is twice the internationalization of equally promising earlier SaaS companies at the same stage of their evolution. And it's
real money that they're getting cross-border. Like, today these companies generate the majority of their revenue, I think the median is 56% of revenues, from international customers. Back to France: Photoroom, it's very cool, one of the darlings of France, an AI photo editor. It helps you clean up images. They went from, I think, 0 to 50 million ARR in three years. They already sell into 184 markets. I mean, you don't have to go that deep into your geography background to know there aren't that many more markets to sell into, right? And, well, some of it is
that they're selling, you know, infrastructure and models and digital art and music and stuff that just works across borders. Some of it is that LLMs are good at translation. But some of it, honestly, to our conversation earlier on the optimized checkout suite, is just that the bar to going global has gone down, right? So, almost all of these guys adopt our optimized checkout suite. It comes with over 100 payment methods out of the box, which gives you global reach and conversion. But there's also all the hassle of going global, like managing tax and regulations, and a bunch of our solutions, like Stripe Tax, help these businesses scale up globally with very lean teams. Because, you know, that's another trend we didn't talk about yet, but these folks are building very real businesses with like 10, 20, 30 people, in a way that's quite striking and actually never been seen before.
And look, 100%. I mean, not that you need, you know, praise from me, but you guys should absolutely take a victory lap for enabling a whole generation of startups around the world. You know, with the combination of AWS, Stripe, companies like Deel and others, you can just launch your business globally in, you know, a few days after you incorporate the company, which is insane, and which is, you know, partly the reason why you see this generation of companies growing so fast. Yes, AI is hot, but the enabling layer now exists in a way it didn't before. And I would add that on top of that, you've got the global communication layer where everybody, you know, at least in tech, is on X.
And, you know, this whole world of problems was just abstracted away in a way that was completely unimaginable 15 years ago.
So I have a hypothesis about a second-order effect from that, which I haven't robustly validated, but I'm going to say it anyway because I'd love for you to chew on it: an interesting corollary of being so global from day one is that today's vast internet markets enable and reward specialization.
The markets today are so much bigger than they were a decade ago, and correspondingly, what people are building with AI is starting to look a lot like what we saw with SaaS: first horizontal and now vertical, right? SaaS was first Salesforce and then Toast. AI was, you know, first broad tools like ChatGPT and then highly specialized, industry-specific applications in healthcare, in real estate, in architecture, in restaurants. But the switch from horizontal to vertical, which definitely happened in SaaS, happened so much faster with AI. And I think part of that is for sure that the models, we talked earlier about wrappers, enable these sort of specialized products to spin up quickly and find product-market fit without having to invest a bunch in upfront research. But also, I think the fact that they are global provides additional tailwinds to that, which is: when stuff is truly borderless, specialization is rewarded, because the markets are bigger, and so even a very specialized niche is a very large business.
Yeah. No vertical is too narrow when you can do it globally. Really interesting. Yep, I love that thought. Yes, I think part of it is also seeing the LLMs go up the stack and go from being foundation models to increasingly application companies, covering a lot of the broadly horizontal stuff, which pushes people to the vertical aspect of things. But I love that thought that being international in a given vertical makes your vertical market very, very big. What are you seeing, in your world, that's different with AI companies in terms of billing, pricing, business models?
Okay, so selling software used to be: you build it once, you incur a fixed cost of building it once, I'm slightly oversimplifying, obviously you continue to do R&D, but it is a high fixed cost of building, and then you sell it by seat over and over and over again at very high margins, because the marginal cost of providing the software is low.
Okay, that is not true with AI. As products get more AI-centric, at least today, inference costs are, you know, more meaningful, and so companies are shifting from this sort of per-seat billing. And by the way, if the AI does really well, there might also be fewer human users who need such seats. So it's not clear that per-seat billing was going to get you the revenue anyway, right? But companies are shifting to usage-based billing, first, to align pricing with costs. And then second, a trend, and this one's earlier, but I think it's where the market equilibrium, where clearing will actually happen, you know, 2, 3, 5 years from now, is experimenting with new pricing models like outcome-based pricing, and actually increasingly using outcome-based pricing, which, you know, provides flexibility and really
only charges you for the stuff that works, as a competitive differentiator. It can be hard to evaluate whether AI is going to work or not. And if you can go in and say, look, we're only going to charge you for what works, that is a much lower-risk proposition for the business than saying we're going to charge you per seat or we're going to charge you for usage, right? It's like, well, what if I use it but it doesn't work well enough, and so I'm paying the inference cost but it's not moving the needle for my business?
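A toy comparison of the three pricing shapes being contrasted here, with made-up numbers, just to show where the customer's risk sits under each model.

```typescript
// Toy comparison of per-seat, usage-based, and outcome-based billing.
// All rates are invented; the point is who bears the risk of the AI not working.
const perSeat = (seats: number, pricePerSeat: number) =>
  seats * pricePerSeat; // static; assumes near-zero marginal cost

const usageBased = (tokensUsed: number, pricePerMillionTokens: number) =>
  (tokensUsed / 1_000_000) * pricePerMillionTokens; // tracks inference cost

const outcomeBased = (resolvedCases: number, pricePerResolution: number) =>
  resolvedCases * pricePerResolution; // customer only pays for what worked

// Same month, three very different bills:
console.log(perSeat(20, 50));            // 1000, even if the AI quietly fails
console.log(usageBased(30_000_000, 10)); // 300, paid whether or not it moved the needle
console.log(outcomeBased(400, 0.99));    // 396, and zero if nothing was resolved
```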
And so Intercom is an Irish-founded company, and they're also reinventing customer service. There's a lot of interesting stuff in the customer service space, but they're moving their support product from charging per seat, right, the olden-days model, which is how most SaaS is built, to charging per resolved case, which aligns incentives with their customers. And it is actually an outcome-based version of pricing. And, you know, so I just think, stepping back, AI is changing everything.
It's increasing productivity. We think you don't want a pricing model that is static. You don't want a pricing model that depends on your customers hiring ever more people. You also don't want a pricing model that assumes near-zero marginal costs, given inference costs. And so we do see these businesses iterating very
quickly to figure out kind of where supply and demand intersect. And correspondingly, we're arm-in-arm with them working on our billing solutions, including usage-based billing and outcome-based billing, and really partnering with this current wave of AI startups to make sure that their pricing and monetization approaches (a) work for the market and (b) can be very fast-evolving and highly unconstrained.
So maybe as a last theme to close the conversation, you know, a topic du jour is how companies use AI internally. And it's a little bit of, you know, AI coding, vibe coding on the one hand, and then on the other hand, you know, the Tobi memo about AI literacy, and then you saw Aaron at Box do the same, and the CEO of Zapier, and so on and so forth. How do you all think about this in terms of building or governing, you know, AI literacy inside Stripe?
For us, I think it really starts with a culture of experimentation. And I
actually like to tell the story of how, back, like two years ago now, right, a couple of engineers hacked together a little internal beta for an LLM Explorer. And the basic idea was like, hey, let's get a ChatGPT-like interface in the hands of thousands of talented Stripe employees and just have them figure out how to apply it to their work. And, you know, Stripe is coming into this from a long-running culture of bottoms-up experimentation, all the way up to Patrick and John; leaders here have very intentionally crafted that, and we think a lot about sustaining experimentation and innovation internally as we grow. And so in the case of LLMs, for us this was like, hey, let's just quickly unlock internal experimentation. And obviously that needs to be done safely, right? People are going to experiment, the enthusiasm was palpable, but they'd better not be in their personal ChatGPT accounts, especially given the sensitivity of Stripe data. So, you know, we decided fairly early on to organize cross-functionally and just set up the tools
and policies so that any Stripe employee could safely play with LLM capabilities. We also decided early on to decouple from any one model, because we saw the models evolving quickly. So, you know, the first version of this LLM Explorer had just touched GPT-3.5 and GPT-4, but today we serve dozens of models through the tool. We assumed collaboration, so people are very social, and the returns you get from building something are almost never worth it if that thing only works for you. And so we enabled these things called presets, which are basically shareable prompts. And basically overnight, the Stripe community developed hundreds of these kind of reusable LLM interaction patterns. And I think from there we were
kind of off to the races. And we had a bunch more to do. Like, hey, let's make sure that, well, so we built LLM proxy: any engineer should be able to hit a standard API to get access to LLMs and build their production-grade applications. We actually only relatively recently GA'd an agent builder internally that hooks up to what we call tool shed, so it has access to the MCP servers for Google Cloud and Jira and Slack and whatever else. But it started with just a small number of engineers saying everyone in Stripe should have access to LLMs. They should be able to share what they build with LLMs. Then they should be able to access those LLMs programmatically, and then they should be able to build agents on top.
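A sketch of that standard-API, model-agnostic proxy idea; the URL, parameters, and model names below are hypothetical and not Stripe's internal LLM proxy.

```typescript
// Hypothetical model-agnostic proxy: callers name a model, the proxy
// handles provider credentials, logging, and data-handling policy.
async function complete(model: string, prompt: string): Promise<string> {
  const res = await fetch("https://llm-proxy.internal.example/v1/complete", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt }),
  });
  const data = await res.json();
  return data.text;
}

async function demo() {
  // Swapping models is a string change, not a new integration:
  const a = await complete("gpt-4o", "Summarize this support thread...");
  const b = await complete("claude-sonnet", "Summarize this support thread...");
  console.log(a, b);
}
```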
Zooming out, anything you can talk about in terms of, like, roadmap? What you're currently working on, what should we expect in the next 12 to 18 months, anything you can share?
I mean, it's a lot of going big on what we talked about today: deploying our foundation model across applications, building really robust risk-as-a-service, helping our users prepare for commerce in an AI era. I can't share any super specifics, but I think you can kind of see where we're headed, with foundation models, with MCP, with order intents, with sort of the Perplexity shopping example. And you can expect to see more of that from us in the coming months.
Brave new world. All right, thank you so much. This was fantastic. Love the conversation. Thank you so much for spending time with us.
Super. Thanks for having me, Matt.
Hi, it's Matt Turck again. Thanks for listening to this episode of the MAD Podcast. If you enjoyed it, we'd be very grateful if you would consider subscribing if you haven't already, or leaving a positive review or comment on whichever platform you're watching or listening to this episode from. This really helps us build the podcast and get great guests. Thanks, and see you at the next episode.