$6.6B AI CEO: How to Make Your First $10,000 with AI
By Silicon Valley Girl
Summary
## Key takeaways
- **Voice is the future interface for AI**: Voice will be a key interface for interacting with technology, transferring more information than text by conveying emotionality, inflection, and imperfections. [01:47]
- **AI voice agents boost business efficiency**: AI voice agents can handle customer support, guide users through products, and accelerate sales pipelines by providing instant information and even converting leads for self-serve business tiers. [02:37], [03:56]
- **Monetize your voice on a marketplace**: Creators can earn passive income by cloning their voice, recording about 30 minutes of audio, and sharing it on a voice marketplace where others can use it, with over $5 million paid out to the community. [12:17], [13:36]
- **AI safeguards needed for voice authentication**: As voice cloning advances, a three-layer safeguard model is proposed: device authentication, watermarked AI content, and defaulting to distrusting unverified content to combat deepfakes and impersonation. [21:36], [23:03]
- **Adapt to AI: Use it to enhance expertise**: Jobs at risk are those replaced by AI, but individuals can adapt by learning AI tools, enhancing their domain expertise, and combining it with AI for higher value and output. [27:54], [30:54]
- **Focus on problems to build a business**: When starting a company, obsess over the user's problem and validate if it's a burning issue; 11 Labs pivoted from dubbing to voice cloning after discovering users prioritized the latter. [36:53], [38:38]
Topics Covered
- Will Voice Become Our Main AI Interface?
- How Voice AI Drives Business Growth and Sales?
- Earn Passive Income by Sharing Your Voice.
- How to Trust Voices in an AI-Driven World?
- Combine Expertise with AI to Thrive in the New Economy.
Full Transcript
We paid about $5 million to the entire
community.
>> Meet Mati, CEO and co-founder of 11
Labs, a company that has grown into a
$6.6 billion leader in the voice AI
space, shaping how we talk, work, and
even earn money. They've created an
entire voice marketplace. Now, anyone
can clone their voice and earn passive
income. Can you name some opportunities
that you see that can make people this
amount of money so they can make a
living like 10k a month? Something
that's immediate
>> business and you just want to make good
money. I would try to take those voice
agents and go to let's say local
doctor's office and
>> 11 Labs built the world's most realistic
voice tech. The question is can they
control what happens next?
>> Most of those companies just don't know
this is possible. You don't have to be
the coder. You just need to
>> if my voice is authorized to use my
credit card to buy anything and then
somebody just uses the resemblance of
it. I
>> think it's it's it's going to happen.
But uh
>> Hey guys, welcome to Silicon Valley
Girl. We have one of the guests today
whose product I've been using for a
while now. So I'm going to ask a little
technical questions as well, but please
welcome Mati from 11 Labs. Thank you so
much.
>> Thank you so much, Marina. Great to see
you again and thanks for thanks for
having me.
>> Yeah, thank you. So I feel like you're
one of the pioneers of this AI industry
because when I ask people like what apps
they're using or when I'm talking about
apps that I'm using, I always mention 11
Labs because it's been a lifesaver. I
wanted to start with a question about
the role of voice in AI. So what it
feels like to me is that in 2023, you know, we
started adopting ChatGPT. It was all
text and then these voice capabilities
became more and more powerful. It
understands what I'm saying now. It
understands my accent. If I mispronounce
something it still gets me. Do you feel
like we're moving into the era where
voice is our main tool to interact with
AI?
>> I mean, 100%. I do think that voice
will be one of the key interfaces to
the technology around us and um and that
shift is happening like you said it's
like few years back you wouldn't even
dream of this being possible and now I
think it's it's becoming a reality where
it it allows you to transfer so much
information more than the text you can
you can get the emotionality the
inflection pattern the imperfections
reflected in the voice which of course
makes it easier for the um if it's an
input for the for the technology to
understand a lot more about the the
setup that or what you are trying to
achieve and then if you hear it back as
well I think it's a lot better and more
pleasurable um experience as well
>> how do you see voice transforming
businesses do you have any cases where
people are using voice to generate leads
or convert leads
>> there's definitely a few different areas
whether it's on the more classic uh
customer support use cases where you
instead of having an old IVR system or no
system, you can now deploy a voice agent
that will take the calls instead and and
will both delight the customers on the
other side because it understands you.
It's quick, it's good um but then also
just performs better. And then outside
of customer support, we are seeing that
across the entire life cycle of
the user journey, in some places where
it adds an experience that
wasn't possible before. A simple case
is uh inside of the product or even
outside of the product um and you might
have seen back in the day there was
those widgets for chat. Now you could
have a voice agent that helps you
navigate through the product experience.
So it becomes like your partner
programmer, a product person that helps you
navigate through that life cycle.
And as you also mentioned, of course, some
of the big pieces are inbound and
outbound. We actually use it
ourselves at 11 Labs too,
where of course we do have a standard
flow. We have people that will answer
the reply and take a phone
call too. But if you want to go quicker,
you can speak straight directly with our
agent to understand our product
offering, understand our pricing,
understand what you what you can do with
the product, which helps you accelerate
through the pipeline, and
sometimes self-disqualify if you are
not the right fit for our product
offering, and sometimes helps you
accelerate: okay, this is exactly the
set of use cases I can do, this is how I
can deploy. And then it routes it to other
people.
>> So it doesn't actually convert
>> it uh in some cases it does. In some
cases it's um as a quick step back we
have a few different tiers. We have like
a business tier and an enterprise tier.
So it does convert immediately sometimes
to the business tier program.
>> It's a preset
>> Because it's preset, it's self-serve.
On the enterprise side we still
run KYC checks. So it doesn't do that
immediately. Uh but uh but on the
business one it it it does and and then
we've seen some of those voice um agents
also um from from a lot of the
technology and platform we built help in
a completely different non-commercial
aspects too.
>> Quick follow-up question for like about
the the sales process. Have you measured
the conversion uh percentage into sales
with the AI voice salesperson?
>> We did, and I don't remember
the number off the top of my head, but given
the alternative before would have
been just waiting, it was just a net
new amount of leads, and we received so
much inbound of of using a lot of the
products which we are lucky to to have
that it helped us just convert so many
more leads that we would have otherwise
taken weeks months or or maybe never
gotten into.
>> How can I set this up for my company?
>> The easiest one would be to register on
our platform. Uh so that that part of
offering and we have two key offerings
is our agent platform offering. You jump
into the platform and we help you
abstract two elements. The first one is
all the research or experience
complexity. So we help you connect the
speech, the LLM elements, the text-to-speech
elements, so the agent speaks in a
smooth and quick way. So it's a very
low-latency, reliable part on that
side. And then there's a second part
where you will need to spend a little
bit more time on bringing your business
logic in place. So example could be
what's the knowledge base of how your
business operates or what are the
questions you want to be asked. What are
the materials you want to surface? So
you would bring that into the platform.
Then we have a set of workflows that you
can set up effectively. Imagine like if
this happens this happens or if this
happens I want this function to trigger.
This could be if someone is calling
me to schedule an
appointment. We have a predefined
workflow for you to be able to do this.
So, it can look into your calendar
appointment.
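
The "if this happens, I want this function to trigger" idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the 11 Labs platform API: `Workflow`, `on`, `handle`, `schedule_appointment`, and the intent and field names are all hypothetical, and the calendar lookup is replaced by a plain list of busy slots.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical names throughout; this does not mirror any real platform API.
Handler = Callable[[dict], str]


@dataclass
class Workflow:
    """Maps a detected caller intent to the function that should run."""
    handlers: Dict[str, Handler] = field(default_factory=dict)

    def on(self, intent: str, handler: Handler) -> None:
        # "If this happens..." registers "...I want this function to trigger."
        self.handlers[intent] = handler

    def handle(self, intent: str, details: dict) -> str:
        handler = self.handlers.get(intent)
        if handler is None:
            # No rule matched, so fall back to a person.
            return "handoff_to_human"
        return handler(details)


def schedule_appointment(details: dict) -> str:
    # Stand-in for a calendar lookup: the busy slots are passed in directly.
    slot = details["requested_slot"]
    if slot in details.get("busy_slots", []):
        return "slot_taken"
    return f"booked:{slot}"


workflow = Workflow()
workflow.on("schedule_appointment", schedule_appointment)
```

Calling `workflow.handle("schedule_appointment", {...})` runs the booking function, while an intent with no registered rule falls through to a human handoff.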
>> Selling a course basically like what
language does, we're we sell courses.
So, basically one simpler
>> to sell courses to people. Can I do it
in different languages using my voice?
>> You can.
>> Wow.
>> So you could. It's selling the courses, and the people
would call in, buy the course, and off
they go, and maybe they onboard with
the agent later on to help.
>> How do they how do they buy over the
phone? Do you send them a link ask for
their email or they just Yeah,
>> it depends. Uh but the simplest would be
what you suggest which is we do have an
omni channel solution where you
effectively get a link as part of that
and you can leave additional details or
you have a follow-up on the email of
like a checkout subscription for the
course. So both of those would be
possible. Or you could, depending on how
that website is set up, you could
effectively embed the agent on your
website. So it helps you redirect to the
subscription page. It guides you through
it and they check out themselves live
with the agent that helps them
>> wow fill in the form. But like you said,
one of the great things on the function
side is that you can you can you can
switch languages. You can hand over like
>> that's fascinating for my business. So
it's I mean you've been pioneering a lot
of that language learning work and I
think this would be amazing because both
it would switch the language and to
switch it with your own voice if that
was your own voice so it continues
speaking in that same manner and then of
course the last piece is all the
integrations so we support integrations
>> where you headed congratulations
>> thank you it's one of the big so maybe
that's a good cue for me as well but
because when we started the company we
of course started from pioneering the
research on the speech side, so text-to-speech
voices, and then we expanded to
speech-to-text, the orchestration models,
now music but as we think about the
research it's always how we can push the
audio frontier forward
>> I love how you found this new
opportunity and now it's bigger chunk of
your business as far as I understand how
much would it cost for a business like
mine small business to have AI answer
the calls and sell
>> I think the and and of course depends on
the volume but I think what hopefully
will happen is that both you will see
more people coming through and if we set
it up in the right way Maybe this will
mean even opening up the channel which
over time hopefully means even more
calls but I think to start it would be
in order of hundreds of dollars per
month.
>> Mhm. It's also IP calling, right? Is that
integrated in that?
>> Yes. So we integrate with Twilio or
telephone systems.
>> Okay. So whatever
works.
>> So you can bring Yeah. You can bring any
phone number that you already have and
it works. I don't know, currently
do you already accept any of the calls
coming through the telephone too, or is it
all on the website?
>> We mostly try to
navigate them to WhatsApp because a lot
of people who are calling they don't
speak English so they don't feel
comfortable. But if we advertise that
it's, you know, Marina's voice AI,
nobody's judging your accent because I
feel like when people even talk to me,
if they're non-native speaker, they
first the first thing they do, they're
like, "I'm sorry, my English is not as
good as you." I'm like, "It doesn't
matter." But I feel like even like using
English to make a phone call is such a
huge barrier for non-native speakers.
And I feel like if you understand that
you're talking to AI, it just makes it
so much easier.
>> That's true. It doesn't judge. You can
make little mistakes. And, you know, there's a completely
other aspect: you have of course
been helping people learn languages for
a long time. But maybe there's even an
aspect where they could practice
speaking the language with you,
which would be, you know,
a slightly different deployment of course,
but completely possible, where you can
give them tips to improve and
effectively create a Marina's Duolingo
that people have a dynamic experience with,
which is another kind of incredible area
that's growing in the edtech space.
>> Yeah, let's let's talk about that part.
So we talked about deploying 11 Labs to
work as a sales agent. Let's talk about
this: I have this number here where you
paid $2 million in royalties to people
who share their voices with 11
Labs. Can you talk about that? How can
people start making money by sharing their
voice with 11 Labs?
>> So it's one of the efforts we
launched in the early days, where we
effectively created a voice marketplace,
voice ecosystem where every person can
create their own voice go through
authentication flow. You need to record
roughly 30 minutes or more of you
speaking. Then you have a perfect
replica of your own voice that speaks
in the language you recorded plus all
the languages we support. So you have
usually 30 or so different
variations. Now with the new model we
are releasing it will be 70. So you
have the voice that's now available
for your own use, and then if you decide,
you can share it to our marketplace. And
if you share it to our
marketplace, for a specific period of time, under
specific conditions of what you are
sharing it for, then other people can use
it across the 11 Labs ecosystem, and when
your voice is being used you get paid
back as a result. This way we have now
almost 10,000 voices that people shared
and created. What is incredible is it
spans so many different languages,
accents, different styles. So
now if you log in to
the platform, you just have this
incredible plethora of voices, and we pay
the voice owners back. So
it was I think $2 million at the
beginning of the year that we paid back,
and now, I think last time I checked,
which was a few months ago, we had paid back $5
million to the entire community.
>> How much does an average voice
creator make?
>> It depends, of course.
You know, it's
probably in total approaching close to
$10 million and we have close to
10,000 voices. So that would be like,
you know, if you take the
average. But I think it's,
especially given a lot of the voices
are kind of new, and it takes a little
bit of time before they get attention.
You also to actually make it successful
ideally you try to engage some of the
community around that they can see the
voice whether it's the discord the
Reddit some of the other forums it
definitely helps break through that
initial and if not over time we also try
to surface new voices and and and get
them out in the audiences so it really
depends I think it'll be a lot of people
in like a few hundred per month category
and that's probably what you could
expect if you do a little bit
of that effort, and what you could
earn. However, you
know, I think it's true
that if your voice
sounds very similar to other voices,
it's very much
>> Yeah, it's interesting how many voices
like in general do you have
>> and how many can you distinguish
>> But if you have a unique voice,
if you have a new... Exactly. Then it
can be incredible. One of
our first voices
that got shared was a Spanish
voice that had a very deep way of
speaking the deep proided and uh that
voice became one of the most popular not
in Spanish but in English-speaking
countries and became like our top 10
voice, where it was just
such a unique and different
>> interesting let's talk about the nuances
of cloning your voice because for
example so what happens sometimes in my
team we clone my voice using all the
different mics that I have. But
sometimes we insert it and it's still
slightly different from the video
because the way we use it is that you
know we recorded something here. I
recorded some brand deal or whatever and
then I start traveling and they're like
could you re-record this phrase? So we
just take a piece from the video uh redo
it with a phrase that the brand asked
for. But then we insert in the video and
it's slightly different like the it
sounds in a different way. Are there any
ways to fix it? Yes, of course. So, we
re like ask it to uh remake it again,
but it's still like not exactly what we
recorded.
>> No, it's it's it's um it's of course a
tricky problem where when you create a
voice, you most likely take the voice
throughout the entire video and then you
create that voice and then in a it it it
is the effectively the average of how
you spoke around that video. But in a
given scene, you will have maybe changed
the intonation pattern a little bit or
the emotional pattern slightly off that
average. The ideal way would be
for us to do more of the
conditioning on what you do pre
and post in the video. So we take that
more as an input and we try to morph
it in a slightly better way. And
then there's a second thing: sometimes,
even though I know you'll try to clean
up the voice and then add the
background sounds, background effects,
they might, just by the
process, be mixed in and then it doesn't
smooth entirely. So from our side what
we hope to do over time is that, as
you insert those videos, we can
precondition it after 3 seconds and
after and it will sound better. So
that's something we
>> have that feature. So upload the video.
>> So we are working on that. Not yet. It's
not applied, but it's it's going to be
the big piece. We definitely need to
bring it there. I think in the in the
short term,
>> what you mentioned is is what we see as
the most common pattern, which is
redoing and and and regenerating. But
the other thing you could try is,
instead of taking a longer audio
sample across the video, just take
even a few seconds, which I know sounds
like maybe it will be a worse result,
but if you just take a few seconds from
that fragment and create that
lower-quality version, it actually could
sound pretty good.
>> Okay thank you. So where where do you
see all of this going with people
recreating their voices? Will everybody
have a clone in two or three years?
Because, you know, with 11 Labs, when I
heard about it two or three years
ago, I couldn't think about a
salesperson using my voice. Now we have
it. What do you think is going to happen
in two years what is this new use case
that this all is going to unlock
>> interesting question of course we are
seeing like kind of entirely new ways of
of of of interacting with voices so I do
think yes you will have your digital AI
voice and I think even step further you
will have your own digital voice agent
that does things for you, but you want
to make sure it's authenticated, people
know you operate. So, you know, like we
spoke about the example of how people
can call in, you can configure a voice
agent, but I think the other side will
be also true. I will have my voice agent
>> because they use voice authentication,
right? It's going to
>> I think that's not the best mechanism
for for the future
>> Anymore. Not anymore. But
say you want to book a restaurant
or follow up about an appointment
in healthcare, and you
want to make sure that they know your
most recent details or that it's
confirmed. I think you will want an
authenticated version of voice agent.
I'm saying the authenticated because
like you say most of the verification if
they don't will will fail and you want
to know that it's a permissioned voice.
Um so you will need to start embedding
watermarks and and and metadata around
that. Um but I think the the to kind of
go back to your question of like where
it all evolves. I think there will be
like an interesting pattern where and I
think it will happen on both sides as a
user but also as a business. You will be
able to serve so many different voices
to your customers or you as a customer
can decide what voice speaks to you. So
to speak to specific examples, we are
working with a company in Korea
and Japan. It's a multinational
company there which has very different
age groups calling in: a set of older
patients and then a much younger set
of people, and they want to serve,
depending on the data the number that is
calling in serve different voice to that
group both in terms of how it speaks um
how it sounds but also the style in
which it speaks. Um, of course it's a
it's a you know it's a generalization
but roughly they wanted that if you if
an older person is calling in the voice
speaks much slower much calmer less
emotionality it's a younger person much
quicker a lot of higher amplitude of
emotions and I think this same pattern
will start happening across everything
where if you are calling in a specific
region you might have an accent of that
region if you are calling a restaurant
that's maybe representing a specific
cuisine you get a voice of that cuisine
speaking with you um and And maybe there
are like variations of all those
different types which
can work. And then separately, as a
person calling in to any of those
services, you could preselect that
too. So if you are calling a bank and
you enjoy speaking always with the voice
of this specific style then you can
select it and that voice will be the
voice of your preference. We've seen
this happen at a company,
also in Asia, where they created
effectively a travel agent, or like a
Google Maps competitor product, where
you can select a voice that narrates
your directions, and one of the voices
they selected became viral and
everybody wants to use it now in
the travel directions
because it just made for such a better
experience. So if I extrapolate in the
future, I think there will be a lot more
both personalization but also selection
that you can choose into. I think 100%
true. You will have your own
authenticated voice that you can use for
your voice agent for your content
>> that has all the information.
>> Has all the information that you can
>> that's very interesting. I like that
part like having my voice call and be
authorized to use my data. How do you
talk about impersonation with voice?
like if there's if my voice is
authorized to use my credit card to buy
anything and then somebody just uses the
resemblance of it uh will there be any
metadata that could be detected by other
systems, and what would it
look like?
>> Yeah, so first of all,
I think it's
going to happen. I think the
assumption we should be going with is
that, you know, you will
have good actors, good technology trying
to avoid it, but then there will also be
more permissive technology
and bad actors trying to abuse it,
as with any technology shift. And already
now there is a lot of open- source
technology other commercial technology
which doesn't have the same safeguards
that could clone your voice and create a
mimic that sounds like you.
So I think any system that we think
about devising in the future kind of
needs to assume that you can create a
clone of a voice and
make it a perfect replica. Now of course,
as I think about 11 Labs, we
can and we do add safeguards as you
create a voice. So you cannot do that or
if you do we detect it and moderate and
can flag it internally if we are not
sure. Um so whether it's it's being able
to trace everything back to to the
account or moderate what text was used
whether it was trying to do a scam. Um
But to the core of your question: as you
think about the future, the ideal system,
and it would require cooperation from a
number of parties, would have three
different layers. The first
layer is instead of trying to check for
AI you actually check for human. That's
easy for me to say. Of course, there's
like how do you check for humanness? But
a simpler step or original step could
be on the devices that you use.
So on my telephone or on my laptop, I
am encoding that this is my phone, my
laptop. When I'm calling from it, it's
being decoded on the other side. They
know that this is a device I use. So most
likely this is me. That's the first
layer. The second layer is actually what we
spoke about earlier, and that's
possible: you watermark
authenticated AI. So if I'm using
specific tooling, the tools
that can add this watermark are known
and I watermark that within the content.
It's not super straightforward,
especially in audio, because if you
add a watermark in content it can affect
the quality of the content itself, but
it's roughly good, and
that's the second layer. So you check
for authenticated AI. And then the third
layer is: by default it is AI, and you assume
it's AI. So if it didn't pass the first
or second layer and you see content that
hasn't been authenticated or proofed for
being a human, it's AI by default and
you don't trust it. And then you can add
more mechanisms on top of that third
layer where you like try to explicitly
check or add additional signal like ah
this is real. But that would be a
mindset shift where today if you look
for content you're like oh maybe this is
AI. It should be opposite where it's
like oh no this is definitely AI. Is it
maybe human or is it maybe AI that was
created with creators permission? And
then you have those cases in between
that will be interesting as as you of
course create the content. You mentioned
that sometimes if you need to re-record
you might create an AI voice, of
course with your
permission, but then do you
do that across the clip, and maybe you do
that 1% or 5% of the content is AI
voice. Maybe in the future it will be 30
or 50%.
And at what stage would you say this is
like your AI delivery or or human
delivery?
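
The three-layer model described in this answer (check for a known human device first, then for watermarked, permissioned AI, and distrust everything else by default) can be sketched as a simple decision function. This is a minimal illustration, not an existing system: `IncomingAudio`, `device_id`, `watermark_issuer`, and the registry sets are all hypothetical, and real device attestation or watermark decoding would be far more involved.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Trust(Enum):
    KNOWN_DEVICE = "layer 1: known human device"
    AUTHENTICATED_AI = "layer 2: watermarked, permissioned AI"
    UNTRUSTED = "layer 3: assume unverified AI"


@dataclass(frozen=True)
class IncomingAudio:
    # Both fields are hypothetical stand-ins for real verification results.
    device_id: Optional[str]         # identifier decoded from the calling device
    watermark_issuer: Optional[str]  # tool identified by a watermark in the audio


def classify(
    audio: IncomingAudio,
    known_devices: FrozenSet[str],
    trusted_issuers: FrozenSet[str],
) -> Trust:
    # Layer 1: check for the human first, via a device they are known to use.
    if audio.device_id is not None and audio.device_id in known_devices:
        return Trust.KNOWN_DEVICE
    # Layer 2: accept AI audio only if a known tool watermarked it.
    if audio.watermark_issuer is not None and audio.watermark_issuer in trusted_issuers:
        return Trust.AUTHENTICATED_AI
    # Layer 3: anything that passed neither check is treated as AI and not trusted.
    return Trust.UNTRUSTED


KNOWN_DEVICES = frozenset({"marinas-phone"})
TRUSTED_ISSUERS = frozenset({"example-voice-tool"})
```

The ordering carries the mindset shift from the answer: the function never tries to prove that audio is fake, it only looks for positive proof of a human or of permissioned AI, and falls through to distrust.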
>> You're a founder in AI. How do
you sleep at night when everything is
moving so fast? Uh what are your main
fears? What keeps you up at night?
>> You know, I think there are two
parts to it. I think the first part that
I need to mention is that
it's such an incredible opportunity
with the shift. It's the biggest
shift, or maybe a bigger shift than the
internet, and we are at 11 Labs so
happy and lucky to be part of that shift
and be leading on the voice frontier. So
I I think the and I think that the team
and all of us are feeling that that we
have unique opportunity that never
happens in your life that you can create
a technology define how it will be used
and hopefully create value across across
whether it's voice agents and how voice
interface will look in the future
whether it's making content global
whether it's making content available in
audio. Um but of course with all of that
as we think about being at the frontier
it like also makes us carry some of the
responsibility for how we define that.
So um so a lot of our parts will will
stem from that. I think the first one is
we still think there's innovations on
the research level that you can bring
into the space at least one or two big
ones in audio and we've been able to do
it so far in text to speech speech to
text recently in music but we still want
to continue leading and continue being
better than some of the biggest labs in
the world whether it's uh some of the
new new AI companies or all in we think
we have that opportunity and uh and that
is motivating but of course definitely
causes less sleep at night. The
team is super hardworking too,
which makes for shorter nights. And
then from the risks perspective we spoke
about some of those we uh we do feel
like it's our responsibility to make
sure that we avoid some of those risks.
So we are trying to invest a lot of time
in developing safeguards around that.
Then of course the third one with a lot
of the technology how the economy uh or
how the jobs in that economy will change
and we would like to do it in a way
which brings a lot of the people in that
economy together with the change, rather
than a change that just affects
and disrupts them. How can some of
the people that want to be part of it be
part of that disruption too?
The voice ecosystem that we built is
part of that reason. But
of course I think we need
to keep hiring amazing people and
keep pushing ahead as well while so
much is happening. I still think it's
very early. I may be biased and
self-serving here, but it's still
very early.
>> You mentioned jobs that are being
replaced with voice technologies. What
do you think are the jobs that are at
most risk? I guess like customer support
and what should these people be doing
now to not get replaced in a couple of
years?
>> I think the trope, and I think
it's very true, is that the people
that will be replaced will be replaced
by people that use AI. So I think this
is the key message: you should
effectively go into trying a lot of
those tools and products so you
stay at the frontier. And then the
people that are in any of those jobs
that use AI I think can actually benefit
a lot too. Even in customer
support, of course a lot of that will
shift, but for example what we
are seeing is that the simple manual
tasks of, I don't know, appointment taking or
doing and processing a simpler refund
And all of that is uh is like very
manual, very recipe based in most cases.
But then as you go to the more complex
parts, you need a human expert to help
close that gap. Um and that part of the
process is actually even more in need.
Whether it would be debugging a harder
problem that that you have in the
product, whether it's understanding your
what happens after the appointment,
there's a specific thing you receive and
you want to decide whether you need the
X or Y uh help which of course needs to
go for some of the regulation too. But
for all of those you kind of the the
pattern is that the expertise is even
more valued. And of course over time I
think the AI will start shifting and
taking more of that. So there will be
like some percentage that goes across.
Um but uh but that'll be my my my main
piece of like if you understand how AI
works, you can become more of the expert
and better knowledgeable yourself. Um
and and and help and that's also true in
a creative space too. I think in the uh
so so much you can do so you can iterate
so much more frequently. You can produce
to the wider audience.
>> You have to go faster and faster. That's
what I'm feeling with this. You can
definitely do faster iterations.
>> You have to run to stay where you are. I
don't know if you get this feeling, but
for me it's like the world is speeding
up every single day.
>> I do think it's speeding up, but at the
same time, I think it's not zero sum:
speeding up in
this category doesn't take away from
another category. I think the entire
economy is just growing as well with
a lot of that adoption. So there
will be more creative opportunity
than there ever was before. And yes, to be
part of that creative opportunity you
probably need to move faster with a lot
of the innovation than you might have
needed to before, but I think
a wide set of people can
and will benefit. But of course
a lot of the
repetitive, manual,
basic-intelligence-based
work will be replaced with
AI workflows. The
best way to avoid this is by
learning a lot of the AI tooling, so you
yourself are better. And maybe just
to finish off and summarize the
customer support piece, thinking about it
slightly differently and outside of
customer support: frequently, if
you have domain expertise, whichever
domain that is, that's
where you can deliver even more
value. So combining your domain
expertise with AI gives much
higher value and
output. And if you don't have domain
expertise, then you probably want to gain
that domain expertise, which
would be
>> Yeah, I've seen a lot of graphs in, like,
future of jobs reports, and there's
this section, like your expertise plus AI,
and it goes like this in terms of
demand.
What would be the tools that you would
recommend everyone to start using now?
Name top three AI tools.
>> Top three AI tools. Okay. Outside of 11
Labs, which you do need to try and use,
I would say I really like Black
Forest Labs for their
image work. I mean, Midjourney
has been cranking out for so
many years, but Black Forest Labs I
really like as kind of the new
iteration. I think they have good
realism, and I think they will go through
a set of additional iterations that
are great from the classic
ones. I mean, Anthropic's Claude Code,
I think it's incredible. I think
it helps you be
another-level engineer or, even if you're
not an engineer, try to be a little bit
more of an engineer. And then the last
one: I really like Lovable.
But similarly, I mean, v0 and Replit are
great. Yeah.
>> But given we are in
Europe, I feel Lovable deserves the
>> They're from Sweden, right?
>> They are from Sweden. Yeah. But all
of them, I mean, it's just so
incredible to see our go-to-market
teams try, whether it's Lovable, v0, or
Replit. I think now Figma also
launched theirs. I haven't tried it yet,
but it's fun to
see how people who haven't
traditionally been on the engineering
front are closer, and they understand the
product pain points, they understand the
use case, all better. So there's both
this path of prototyping and showing
the clients, which is amazing, but then
also by extension they are effectively
getting closer to what is behind the
scenes on the product side too.
>> Yeah. And when you mentioned
Lovable, do you build something for
yourself or for 11 Labs?
>> Both. So on the go-to-market side, we
frequently will do a demonstration to
a customer. Let's say we were
doing the use case that you mentioned:
we could build a prototype on a mockup
website of how the checkout would
look, how the agent would interact with
you. That's that type of use
case all the time, whether it's on the
pre or conferences or with the client
calls. But also on the personal side, I
recently tried with my two nieces,
they are five and seven years old, so
I have the best job of being fun, or
trying to be. And
we were speaking about how they
could potentially create a story
generator for themselves where you would
type in the character names and the
story would be created.
>> You're an entrepreneur. You started this
company, spotted this opportunity. Do
you see any other areas aside from voice
where people should be doubling down?
Because one of the founders I had on
this podcast, actually a
co-founder of Hugging Face, told me
that in the next 5 years you have
to be an entrepreneur or you're done. So
a lot of people are learning how to
become an entrepreneur. Can you name
some opportunities that you see that can
make people a decent amount of money so
they can make a living, like 10k a month?
Something that's immediate, something
where you see a gap in the market.
>> It will be voice specific, but I think it's
will be voice specific but I think it's
so so early that I think it's it's a
huge one is uh there's definitely a lot
of the infrastructure being built for
the voice agents we we we build it but
other companies are are are too um and I
think there is a big gap between voice
agents and then actually deploying them
in a lot of those businesses and you
don't have to have the engineering
expertise to deploy those voice agents
the platform now frequently will support
a relatively self-s served manner of
taking it but you can easily
take that voice agent and deploy that in
a specific domains and most of the
businesses in the world still don't know
don't not don't know know about it if
it's you know not um venture scale uh
business and you just want to make good
money I would try to take those voice
agents and go to um let's say local
doctor's office and help them
appointment schedule for for for the
dentist so they can take appointments
more easily and they can then focus more
on the work instead of nurse doing that
in between or missing appointments
that's actually one of the most common I
don't know the the percentage but so
frequently those those appointments
don't get booked because there's no one
on the phone and can take them um you
can go to local mechanics and help them
take appointments and I think there's
all of these require slight variation of
the domain piece that you need to know
and all of those businesses are in
thousands to tens of thousands of
dollars per month if you get it to the
few um the infrastructure is there you
just need to bring it to to to those
domains
>> Yeah, it's like B2B, automating
businesses with AI.
>> Yeah. And small businesses all
>> You don't have to be a coder.
>> You don't have to be the coder. You just
need to spend the time, call them and ask,
or go to them. And I think
there's this category which might
not be taken by some of the
biggest companies, which will focus on
bigger enterprise elements. So
this is
small and medium businesses rather
than the enterprise segment. And at the
same time, most of those companies just
don't know this is possible. So the
next year or two is just an incredible
opportunity to do it. And of course, you
know, it starts in English-speaking
markets, but I think the same is true for
so many of the countries and languages,
given so much of that
work isn't always localized. I think in
our case we're doing a pretty good job
there. You can bring it to a
local market and do exactly the same
work.
>> Absolutely love it. Thank you.
>> Thank you.
>> So if you were starting a
company today and you're a brand new
entrepreneur, what would be your advice
for anyone who's starting out?
>> The first advice would be that you
deeply understand your user and
the problem that you're trying to fix.
I think that would be the first
piece: do I know the problem,
and do I know people have that problem?
>> Because you started 11 Labs because
you didn't like the
transcribing, the translation of
>> This is a super crazy piece:
in Poland, if you watch a movie, all
the characters, whether it's a male or
female character, are narrated with one
single voice.
>> With no intonation, right?
>> No intonation. It's flat. Exactly.
>> I think it was the same in post-Soviet
times and
>> Exactly. And it still continues
today. You know, it was kind of
obvious when we started looking into the
audio space and then realized that this
is still a problem: something we grew up
with, something that, if you ask any
Polish person, or most Polish people, they
will tell you how bad of an experience
that is. As you can likely imagine, it's
pretty bad. And it will
change. If you think about the
future, you will have all the
different original voices
represented, so if the movie is streamed
you just hear exactly the same language.
Of course, we expanded from dubbing
to just voice-overs and speech,
because so much of the content isn't
available in audio in the first place,
and now a lot of voice and stuff. But
it was a very clear problem. And I think,
as I think about starting a company, or
if I were to start a company again, I
would try to obsess about the problem.
And then the second one is: do people
actually have that problem? Is it
actually a burning problem? And in
dubbing we had a good example, where
we thought dubbing was the biggest
problem, but before we actually solved
dubbing we realized from a lot of
conversations with users that there are
so many other problems that they would
like to fix first. The most common one
is actually one you mentioned, where
people just wanted to repair lines after
recording, or just be able to
deliver a voice-over without speaking.
That was the most common thing
after we tried to reach out to people.
Before we had it ready, it was like,
"Hey, we are almost finished with our
dubbing product. Would you like to dub
your movies?" And we
would get some small percentage of
replies, and then inside of those
replies it would be, "Yes, this would be
interesting, but actually, if you could
help me with just my voice,
that would be much, much better." So then
we were like, okay, there's this
incredible opportunity that's a smaller
component of the technology we want to
build, which we should do
first instead. And we did, and
then we validated that again, and people
were like, yes, that's something
we would love. And then, given we
started from creators on social
media
after we heard this, we then
realized that there are actually other
people, not on social media, who also
want voice-overs. The biggest group
for us was book authors initially.
Everybody just couldn't record
audiobooks. Exactly.
>> Because that's like a few days in the
studio.
>> A few days in the studio. Very expensive.
So many people get tired with the voice,
so it's never as expected initially,
and it takes more than that. And then
that turned out to be like the second of
the first biggest ones. So
>> But you actually built the dubbing
product first and you realized nobody
wanted to pay for it?
>> Yeah. So we did the prototype. We
did. Yeah. It was a little bit of a,
you know, stitch-up.
It did have a little bit of
our own research, but it
wasn't months of work. We
created a prototype, and while we
were building the prototype we were
reaching out to customers, like, "We
are working on this. Do you want it?" We
had a good waiting list. Then we tried to
show them how it looks,
and they were like, "Oh, this quality isn't
as... But if you could actually help me
with this and this instead, it would be
better." Which is the same technology,
because people noticed that if you dub,
you can hear the voice of the person in
the other language; it still sounds the
same. And it turned out that the
problem was even earlier. It's like, "Oh,
just my voice."
>> I love that. I love how you started with
the surface, then you went deeper and
built the whole technology that solved
so many problems that were on the
surface as well.
>> Yeah, that's how. So I think the
entrepreneurs building today, if they
understand the problem, then... And
of course, I'm in a very lucky position
where I have known my co-founder now for
15 years and know him inside
out, and he is the genius behind a
lot of the work we do. But I
think that would be my second piece:
you want to really pick your
co-founders and the early team as
carefully as you can, as these
will be the people you will spend most
of the nights and years ahead with.
Success depends on that; the culture
depends on that. And
similarly, we're very happy to
have some of the best early joiners:
one of the people on the growth side
we trusted inside out, and two of our
engineers turned out to be some of the
most hardworking and smart
engineers we have, which set
the culture bar very high.
>> Nice. Nice. Okay, I'm going to uh wrap
up with this question. As a person who's
been advocating learning languages, will
people still learn languages in 3 years?
If they can have their AI authorized
voice speaking any language, join any
Zoom call, the only thing that's left is
maybe a one-on-one conversation, but
then maybe we have a device that
translates everything.
>> That's an interesting one. I think they
will, but the primary
purpose will not always be understanding
others. It will frequently be
just developing yourself, more of an
enjoyable thing you want to do for your
own sake.
>> Like horse riding, right? From a
necessity to a hobby, right?
>> To more of a hobby. And of course
there are parts where, by learning a
language, you learn the culture, and
your perspective
opens. I think that will still be true.
>> Or if you're moving to another country.
>> Are you moving to the country?
>> I mean, like if you want to move to the
US, you would still learn some English,
right?
>> Hopefully you will not need to do it, and
you will still be able to understand a
culture at a level that you never could
before. So
>> Hitchhiker's: it will be like a Babel
fish variation, like a headphone maybe,
a device, maybe Neuralink. But even in
those cases there will be some
processing time involved, because you
need to finish speaking for the device
to pick it up and then translate it. So
speaking the language natively will be
better. But yes, I do think most of
that need will disappear for you to be
able to interact and understand,
which I think will be a beautiful thing,
and then hopefully you can learn
it for other purposes.
>> Interesting how the whole industry
might disappear or might transform
completely. But it's happening not
just to language learning; it's happening
to everything.
>> 100%. But I think it will stay.
I don't know, it will definitely
morph, but some of that
will definitely stay.
>> Thank you so much M. It was very
inspiring and very practical. I love
that
>> and thank you so much for being an early
user and all the feedback as well.
>> Thank you. And I'm hoping we're going to
integrate the sales part. I'm excited
about that. Talk to my team right now.
Let's go. Thanks.
>> Thank you. Thank you.