
$6.6B AI CEO: How to Make Your First $10,000 with AI

By Silicon Valley Girl

Summary

## Key takeaways

- **Voice is the future interface for AI**: Voice will be a key interface for interacting with technology. It transfers more information than text by conveying emotionality, inflection, and imperfections. [01:47]
- **AI voice agents boost business efficiency**: AI voice agents can handle customer support, guide users through products, and accelerate sales pipelines by providing instant information and even converting leads for self-serve business tiers. [02:37], [03:56]
- **Monetize your voice on a marketplace**: Creators can earn passive income by recording about 30 minutes of audio, cloning their voice, and sharing it on a voice marketplace where others can use it. About $5 million has been paid out to the community. [12:17], [13:36]
- **AI safeguards needed for voice authentication**: As voice cloning advances, a three-layer safeguard model is proposed to combat deepfakes and impersonation: device authentication, watermarked AI content, and distrusting unverified content by default. [21:36], [23:03]
- **Adapt to AI: Use it to enhance expertise**: People are more likely to be replaced by people who use AI than by AI itself. Individuals can adapt by learning AI tools and combining them with their domain expertise for higher value and output. [27:54], [30:54]
- **Focus on problems to build a business**: When starting a company, obsess over the user's problem and validate that it is a burning issue. ElevenLabs pivoted from dubbing to voice cloning after discovering users prioritized the latter. [36:53], [38:38]

Topics Covered

  • Will Voice Become Our Main AI Interface?
  • How Does Voice AI Drive Business Growth and Sales?
  • Earn Passive Income by Sharing Your Voice.
  • How to Trust Voices in an AI-Driven World?
  • Combine Expertise with AI to Thrive in the New Economy.

Full Transcript

We paid about $5 million to the entire

community.

>> Meet Mati, CEO and co-founder of

ElevenLabs, a company that has grown into a

$6.6 billion leader in the voice AI

space, shaping how we talk, work, and

even earn money. They've created an

entire voice marketplace. Now, anyone

can clone their voice and earn passive

income. Can you name some opportunities

that you see that can make people this

amount of money so they can make a

living like 10k a month? Something

that's immediate

>> business and you just want to make good

money. I would try to take those voice

agents and go to let's say local

doctor's office and

>> ElevenLabs built the world's most realistic

voice tech. The question is can they

control what happens next?

>> Most of those companies just don't know

this is possible. You don't have to be

the coder. You just need to

>> if my voice is authorized to use my

credit card to buy anything and then

somebody just uses the resemblance of

it. I

>> think it's it's it's going to happen.

But uh

>> Hey guys, welcome to Silicon Valley

Girl. Our guest today is someone

whose product I've been using for a

while now, so I'm going to ask a few

technical questions as well. Please

welcome Mati from ElevenLabs. Thank you so

much.

>> Thank you so much, Marina. Great to see

you again and thanks for thanks for

having me.

>> Yeah, thank you. So I feel like you're

one of the pioneers of this AI industry

because when I ask people like what apps

they're using or when I'm talking about

apps that I'm using, I always mention

ElevenLabs because it's been a lifesaver. I

wanted to start with a question about

the role of voice in AI. It

feels to me that in 2023, you know, we

started adopting ChatGPT. It was all

text and then these voice capabilities

became more and more powerful. It

understands what I'm saying now. It

understands my accent. If I mispronounce

something it still gets me. Do you feel

like we're moving into the era where

voice is our main tool to interact with

AI?

>> I mean, 100%. I do think that voice

will be one of the key interfaces to

the technology around us and um and that

shift is happening like you said it's

like few years back you wouldn't even

dream of this being possible and now I

think it's it's becoming a reality where

it it allows you to transfer so much

information more than the text you can

you can get the emotionality the

inflection pattern the imperfections

reflected in the voice which of course

makes it easier for the um if it's an

input for the for the technology to

understand a lot more about the the

setup that or what you are trying to

achieve and then if you hear it back as

well I think it's a lot better and more

pleasurable um experience as well

>> how do you see voice transforming

businesses do you have any cases where

people are using voice to generate leads

or convert leads

>> there's definitely a few different areas

whether it's on the more classic uh

customer support use cases where you

instead of having an old IVR system or no

system, you can now deploy a voice agent

that will take the calls instead and and

will both delight the customers on the

other side because it understands you.

It's quick, it's good um but then also

just performs better. And then outside

of customer support, we are seeing that

across the entire life cycle of

the user journey, in some places where

it adds an experience that

wasn't possible before. A simple case

is inside of the product or even

outside of the product um and you might

have seen back in the day there was

those widgets for chat. Now you could

have a voice agent that helps you

navigate through the product experience.

So it becomes your like a partner

programmer product person that helps you

navigate through that that life cycle.

And as you also mentioned, of course one

of the big pieces is inbound and

outbound. We actually use it

ourselves at ElevenLabs too, where

of course we do have a standard

flow. We have people that will answer

the reply and take a phone

call too. But if you want to go quicker,

you can speak straight directly with our

agent to understand our product

offering, understand our pricing,

understand what you what you can do with

the product, which helps you accelerate

through the pipeline, and

sometimes self-disqualify if you are

not the right fit for our product

offering, and sometimes helps you

accelerate: "Okay, this is exactly the

set of use cases I can do. This is how I

can deploy." And then it routes it to other

people.

>> So it doesn't actually convert

>> it uh in some cases it does. In some

cases it's um as a quick step back we

have a few different tiers. We have like

a business tier and an enterprise tier.

So it does convert immediately sometimes

to the business tier program.

>> It's a preset

>> Because it's preset, it's self-serve.

Um, on the enterprise side we still

run KYC checks. So it doesn't do that

immediately. Uh but uh but on the

business one it it it does and and then

we've seen some of those voice um agents

also um from from a lot of the

technology and platform we built help in

completely different non-commercial

aspects too.

>> Quick follow-up question for like about

the the sales process. Have you measured

the conversion uh percentage into sales

with the AI voice salesperson?

>> We did,

but, and I don't remember

the number off the top of my head, given

the alternative before would have

been just waiting, it was just a net

new amount of leads. And we received so

much inbound of of using a lot of the

products which we are lucky to to have

that it helped us just convert so many

more leads that we would have otherwise

taken weeks months or or maybe never

gotten into.

>> How can I set this up for my company?

Let's take a quick break here. You know,

as we're talking about how AI is

transforming sales and support, there is

one thing that hasn't changed for any

business. No matter what tools you use,

you still need a home for your product.

A website where your customers can

actually find you. And here's the

challenge that I've experienced myself

many, many times. Try registering a

good .com domain today. Almost everything

is taken. You end up with these long,

awkward names that don't really match

your brand. That's why I was so excited

to discover .online domains. It's

actually the world's second largest new

domain extension trusted by more than

3.5 million businesses worldwide. And

the word online itself is incredibly

powerful. It's searched over 500 million

times every month, which means it helps

you rank higher and become more

discoverable in search. What I really

like is how it works for literally any

type of business. Freelancers, creators,

service providers, big or small

companies. I've seen everyone from

global stars like Maluma, the Colombian

megastar with an audience of over 100 million

across different social media, with

Maluma.online, to the classic game

Minesweeper, which we all spent hours playing and which

now lives on minesweeper.online.

So if you're trying to build your

business on a .com domain, for example

voiceagent.com, you know how often the

good names are gone. With .online, it's much

easier to secure the domain that

actually fits your business. Whether

it's an AI startup, a side project, or

your personal brand, now is the perfect

moment to claim your domain. And the

good news, for a limited time, you can

get it for just 99 cents for the first

year with my exclusive link and coupon

code. Just go to www.get.online

or use the code from the description and

secure your name today. Let's get back

to the interview with Mati. How can I

set this up for my company?

>> The easiest one would be to register on

our platform. That part of our

offering, and we have two key offerings,

is our agent platform offering. You jump

into the platform and we help you

abstract two elements. The first one is

all the research or experience

complexity. So we help you connect the

speech, the LLM elements, the text-to-speech

elements, so the agent speaks in a

smooth and quick way. So it's a very

low-latency, reliable part on that

side. And then there's a second part

where you will need to spend a little

bit more time on bringing your business

logic in place. So example could be

what's the knowledge base of how your

business operates or what are the

questions you want to be asked. What are

the materials you want to surface? So

you would bring that into the platform.

Then we have a set of workflows that you

can set up. Imagine it like: if

this happens, then that happens, or if this

happens, I want this function to trigger.

This could be if someone is calling

me and wants to schedule an

appointment. We have a predefined

workflow for you to be able to do this,

so it can look into your calendar

appointments.
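
The "if this happens, I want this function to trigger" idea described above can be sketched roughly as an event-to-handler table. This is only an illustration under stated assumptions: `Workflow`, `Calendar`, `schedule_appointment`, and the event names are hypothetical and are not the ElevenLabs platform API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Calendar:
    """Toy calendar (hypothetical): hours of the day that are already booked."""
    booked: List[int] = field(default_factory=list)

    def book_first_free(self, candidates: List[int]) -> Optional[int]:
        # Book and return the first requested hour that is still free.
        for hour in candidates:
            if hour not in self.booked:
                self.booked.append(hour)
                return hour
        return None


class Workflow:
    """Maps an event name ("if this happens") to a function ("trigger this")."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], str]] = {}

    def on(self, event: str, handler: Callable[[dict], str]) -> None:
        self._handlers[event] = handler

    def handle(self, event: str, payload: dict) -> str:
        handler = self._handlers.get(event)
        if handler is None:
            # No rule configured for this event: route to a person instead.
            return "handoff_to_human"
        return handler(payload)


calendar = Calendar(booked=[9, 10])


def schedule_appointment(payload: dict) -> str:
    # Assumed payload shape: {"preferred_hours": [int, ...]}
    hour = calendar.book_first_free(payload["preferred_hours"])
    return f"booked:{hour}" if hour is not None else "no_slot_available"


workflow = Workflow()
workflow.on("caller_wants_appointment", schedule_appointment)
```

Calling `workflow.handle("caller_wants_appointment", {"preferred_hours": [9, 11]})` books the first free hour, while an event with no configured rule falls back to a human handoff, mirroring the "standard flow" with people that is mentioned earlier in the interview.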

>> Selling a course basically like what

language does, we're we sell courses.

So, basically one simpler

>> to sell courses to people. Can I do it

in different languages using my voice?

>> You can.

>> Wow.

>> So you could. It's selling the courses, and the people

would call in, buy the course, and off

they go, and maybe they onboard with

the agent later on to help.

>> How do they how do they buy over the

phone? Do you send them a link ask for

their email or they just Yeah,

>> it depends. Uh but the simplest would be

what you suggest which is we do have an

omnichannel solution where you

effectively get a link as part of that

and you can leave additional details or

you have a follow-up on the email of

like a checkout subscription for the

course. So both of those would be

possible. Or you could, depending on how

that website is set up, you could

effectively embed the agent on your

website. So it helps you redirect to the

subscription page. It guides you through

it and they check out themselves live

with the agent that helps them

fill in the form.

>> Wow.

>> But like you said,

one of the great things on the function

side is that you can

switch languages. You can hand over, like

>> That's fascinating for my business.

>> So, I mean, you've been pioneering a lot

of that language learning work and I

think this would be amazing because both

it would switch the language and to

switch it with your own voice if that

was your own voice so it continues

speaking in that same manner and then of

course the last piece is all the

integrations so we support integrations

>> where you headed congratulations

>> thank you it's one of the big so maybe

that's a good cue for me as well but

because when we started the company we

of course started from pioneering the

research on the speech side so text to

speech voices, and then we expanded to

speech to text the orchestration models

now music but as we think about the

research it's always how we can push the

audio frontier forward

>> I love how you found this new

opportunity and now it's a bigger chunk of

your business as far as I understand how

much would it cost for a business like

mine small business to have AI answer

the calls and sell

>> I think the and and of course depends on

the volume but I think what hopefully

will happen is that both you will see

more people coming through and if we set

it up in the right way Maybe this will

mean even opening up the channel which

over time hopefully means even more

calls but I think to start it would be

on the order of hundreds of dollars per

month.

>> Mhm. It's also IP calling right uh is

integrated in that.

>> Yes. So we integrate with Twilio or

telephone systems.

>> Okay. So whatever

works.

>> So you can bring Yeah. You can bring any

phone number that you already have and

it works. I don't know, currently

do you already accept any of the calls

coming through the telephone too, or is it

all on the website?

>> We mostly try to

navigate them to WhatsApp because a lot

of people who are calling they don't

speak English so they don't feel

comfortable. But if we advertise that

it's, you know, Marina's voice AI,

nobody's judging your accent because I

feel like when people even talk to me,

if they're a non-native speaker, the

first thing they do, they're

like, "I'm sorry, my English is not as

good as yours." I'm like, "It doesn't

matter." But I feel like even like using

English to make a phone call is such a

huge barrier for non-native speakers.

And I feel like if you understand that

you're talking to AI, it just makes it

so much easier.

>> That's true. It doesn't judge. You can

make little mistakes. And, you know,

there's a completely

other aspect: you've of course

been helping people learn languages for

a long time, but maybe there's even an

aspect where they could practice

speaking their language with you,

which would be, you know, a

slightly different deployment, of course,

but completely possible, where you can

give them tips to improve and

effectively create a Marina's Duolingo

that people have a dynamic experience with,

which is another incredible area

that's growing in the edtech space.

>> Yeah, let's let's talk about that part.

So we talked about deploying ElevenLabs to

work as a sales agent. Let's talk about,

like, I have this number here where you

paid $2 million in royalties to people

who share their voices with

ElevenLabs. Can you talk about that? How can

people start making money by sharing their

voice with ElevenLabs?

>> So uh it's it's one of the efforts we

launched in the early days where we we

effectively created a voice marketplace

voice ecosystem where every person can

create their own voice go through

authentication flow. You need to record

roughly 30 minutes or more of you

speaking. Then you have a perfect

replica of your own voice that speaks in

in the language you recorded plus all

the languages we support. So you have

usually 30 or so different

variations. Now, with the new model we

are releasing, it will be 70. So you

have the voice that's now available

for your own use, and then, if you decide,

you can share it to our marketplace. And

if you share it to our

marketplace, for a specific period of time and

under specific conditions of what you are

sharing it for, then other people can use

it across the ElevenLabs ecosystem, and when

your voice is being used, you get paid

back as a result. This way we have now

almost 10,000 voices that people shared

and created. What is incredible is it

spans so many different languages,

accents, different styles. So

now, if you log in to

the platform, you just have this

incredible plethora of voices, and we

pay the voice owners back. So

it was, I think, $2 million at the

beginning of the year that we paid back,

and now, I think last time I checked,

which was a few months ago, we had paid back $5

million to the entire community.

>> How much does an average voice

creator make?

>> It depends, of course.

You know, it's

probably in total approaching close to

$10 million, and we have close to

10,000 voices. So that would be,

you know, if you take the

average. But I think,

especially given a lot of the voices

are kind of new, it takes a little

bit of time before they get attention.

You also to actually make it successful

ideally you try to engage some of the

community around that they can see the

voice whether it's the discord the

Reddit some of the other forums it

definitely helps break through that

initial and if not over time we also try

to surface new voices and and and get

them out in the audiences so it really

depends I think it'll be a lot of people

in like a few hundred per month category

and that's probably what you could

expect if if you if you do a little bit

of that effort and and what you could

what you could earn. However, you

know, I think it's true

that if your voice

sounds very similar to other voices,

it's very much
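
The average that the answer above starts to compute can be worked out from the figures quoted in the interview (about $5 million to approaching $10 million paid out in total, and close to 10,000 voices). A minimal sketch, assuming those round numbers; the function name is hypothetical, and the result is a cumulative mean per voice, not a monthly income. A simple mean also hides the skew the speaker describes, where a few unique voices earn far more than the rest.

```python
def mean_payout_per_voice(total_paid_usd: float, voice_count: int) -> float:
    """Cumulative payout divided evenly across all shared voices."""
    if voice_count <= 0:
        raise ValueError("voice_count must be positive")
    return total_paid_usd / voice_count


# Figures as quoted in the interview, rounded; treated here as rough bounds.
low_estimate = mean_payout_per_voice(5_000_000, 10_000)    # using the $5M figure
high_estimate = mean_payout_per_voice(10_000_000, 10_000)  # using the ~$10M figure
```

Under these assumptions the mean lands between $500 and $1,000 per voice in total, which is consistent with the "few hundred per month" estimate applying only to voices that get real usage.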

>> Yeah, it's interesting how many voices

like in general do you have

>> and how many can you distinguish

>> But if you have a unique voice,

exactly, then it

can be incredible.

One of our first voices

that got shared was a Spanish

voice that had a very deep way of

speaking the deep proided and uh that

voice became one of the most popular not

in Spanish but in English-speaking

countries and became like our top 10

voice um where where where it was just

such a unique and different

>> interesting let's talk about the nuances

of cloning your voice because for

example so what happens sometimes in my

team we clone my voice using all the

different mics that I have. But

sometimes we insert it and it's still

slightly different from the video

because the way we use it is that you

know we recorded something here. I

recorded some brand deal or whatever and

then I start traveling and they're like

could you re-record this phrase? So we

just take a piece from the video uh redo

it with a phrase that the brand asked

for. But then we insert in the video and

it's slightly different like the it

sounds in a different way. Are there any

ways to fix it? Yes, of course. So, we

re like ask it to uh remake it again,

but it's still like not exactly what we

recorded.

>> No, it's it's it's um it's of course a

tricky problem where when you create a

voice, you most likely take the voice

throughout the entire video and then you

create that voice, and then it

is effectively the average of how

you spoke across that video. But in a

given scene, you will have maybe changed

the intonation pattern a little bit, or

the emotional pattern, slightly off that

average. The ideal way would be

for us to do more of the

conditioning on what you do pre

and post in the video, so we take that

more as an input and we try to morph

it in a slightly better way. Uh and

then there's a second thing: sometimes,

even though I know you'll try to clean

up the voice and then add the

background sounds and background effects,

they might, just through the

process, be mixed in and then it doesn't

smooth entirely. So from our side what

we hope to do over time is that, as

you insert those videos, we can

precondition it on the 3 seconds before and

after, and it will sound better. So

that's something we

>> have that feature. So upload the video.

>> So we are working on that. Not yet. It's

not applied, but it's it's going to be

the big piece. We definitely need to

bring it there. I think in the in the

short term,

>> what you mentioned is is what we see as

the most common pattern, which is

redoing and and and regenerating. But

the other thing you could try is,

instead of taking a longer audio

sample across the video, just take

even a few seconds, which I know sounds

like maybe it will give a worse result,

but if you just take a few seconds from

that fragment and create that lower

quality version, it actually could

sound pretty good.

>> Okay thank you. So where where do you

see all of this going with people

recreating their voices? Will everybody

have a clone in two or three years?

Because, you know, when I

heard about ElevenLabs two or three years

ago, I couldn't think about a

salesperson using my voice. Now we have

it. What do you think is going to happen

in two years what is this new use case

that this all is going to unlock

>> interesting question of course we are

seeing like kind of entirely new ways of

of of of interacting with voices so I do

think yes you will have your digital AI

voice and I think even step further you

will have your own digital voice agent

that does things for you, but you want

to make sure it's authenticated, people

know you operate. So, you know, like we

spoke about the example of how people

can call in, you can configure a voice

agent, but I think the other side will

be also true. I will have my voice agent

>> because they use voice authentication,

right? It's going to

>> I think that's not the best mechanism

for for the future

>> anymore. Not not anymore. Um but uh but

like, say you want to book a restaurant

or follow up about an appointment in

healthcare, and you

want to make sure that they know your

most recent details or that it's

confirmed. I think you will want an

authenticated version of a voice agent.

I'm saying authenticated because,

like you say, most of the verification

otherwise will fail, and you want

to know that it's a permissioned voice.

Um so you will need to start embedding

watermarks and and and metadata around

that. Um but I think the the to kind of

go back to your question of like where

it all evolves. I think there will be

like an interesting pattern where and I

think it will happen on both sides as a

user but also as a business. You will be

able to serve so many different voices

to your customers or you as a customer

can decide what voice speaks to you. So

to speak to specific examples, we are

working with a company in

Korea and Japan. It's a multinational

company there which has very different

age groups calling in: a set of older

patients and then a much younger set

of people. And they want to serve,

depending on the data on the number that is

calling in, a different voice to each

group, both in terms of

how it sounds and the style in

which it speaks. Of course it's,

you know, a generalization,

but roughly they wanted that if

an older person is calling in, the voice

speaks much slower, much calmer, with less

emotionality; if it's a younger person, much

quicker, with a higher amplitude of

emotions. And I think this same pattern

will start happening across everything

where if you are calling in a specific

region you might have an accent of that

region if you are calling a restaurant

that's maybe representing a specific

cuisine you get a voice of that cuisine

speaking with you um and And maybe there

are like variations of all those

different types um which which which

which can work and then separately as a

person calling in to any of those

services you could preselect that

too. So if you are calling a bank and

you enjoy speaking always with the voice

of this specific style then you can

select it and that voice will be the

voice of your preference. We've seen

this happen at a company,

also in Asia, where they created

effectively a travel agent, or like a

Google Maps competitor product, where

you can select a voice that narrates

your directions, and one of the voices

they selected went viral, and

everybody wants to use it now in

the travel directions

because it just made for such a better

experience. So if I extrapolate in the

future, I think there will be a lot more

both personalization but also selection

that you can choose into. I think 100%

true. You will have your own

authenticated voice that you can use for

your voice agent for your content

>> that has all the information.

>> Has all the information that you can

>> that's very interesting. I like that

part like having my voice call and be

authorized to use my data. How do you

talk about impersonation with voice?

like if there's if my voice is

authorized to use my credit card to buy

anything and then somebody just uses the

resemblance of it uh will there be any

metadata that could be detected by other

systems and how would it what would it

look like?

>> Yeah, so I think,

first of all, it's

going to happen. I think the

assumption we should be going with is

that, you know, you will

have good actors and good technology trying

to avoid it, but then there will also be

more permissive technology

and bad actors trying to abuse it,

as with any technology shift. And already

now there is a lot of open- source

technology other commercial technology

which doesn't have the same safeguards

that could clone your voice and create a

mimic that sounds like you.

So I think any system that we think

about devising in the future kind of

needs to assume that you can create a

clone of a voice and

make it a perfect replica. Now, of course,

as I think about ElevenLabs, we

can and we do add safeguards as you

create a voice, so you cannot do that, or

if you do, we detect it and moderate, and

can flag it internally if we are not

sure, whether it's being able

to trace everything back to the

account or moderating what text was used,

whether it was trying to do a scam. Um

but to the core of your question, as you

think about the future, the ideal system,

and it would require cooperation from

a number of parties, would have three

different layers. The first

layer is, instead of trying to check for

AI, you actually check for human. That's

easy for me to say. Of course, there's

the question of how you check for humanness. But

a simpler first step could

be on the devices that you use.

So on my telephone or on my laptop, I

am encoding that this is my phone, my

laptop. When I'm calling from it, it's

being decoded on the other side. They

know that this is the device I use, so most

likely this is me. That's the first

layer. Second layer is actually what we

spoke about earlier where and that's

that's possible. You watermark

authenticated AI. So if I'm using

specific tooling, the tools

that can add this watermark are known,

and I watermark that within the content.

It's not super straightforward,

especially in audio, because if you

add a watermark in content it can affect

the quality of the content itself, but

it's roughly good, and

that's the second layer. So you check

for authenticated AI. And then the third

layer is that by default you assume

it's AI. So if it didn't pass the first

or second layer and you see content that

hasn't been authenticated or proven to

be human, it's AI by default and

you don't trust it. And then you can add

more mechanisms on top of that third

layer where you like try to explicitly

check or add additional signal like ah

this is real. But that would be a

mindset shift where today if you look

for content you're like oh maybe this is

AI. It should be opposite where it's

like oh no this is definitely AI. Is it

maybe human or is it maybe AI that was

created with creators permission? And

then you have those cases in between

that will be interesting as as you of

course create the content. You mentioned

that sometimes if you need to re-record

you might create an AI voice, of

course with your

permission. But then, do you

do that across the clip? Maybe

1% or 5% of the content is AI

voice. Maybe in the future it will be 30

or 50%.

And at what stage would you say this is

like your AI delivery or or human

delivery?
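
The three-layer model described in this answer (check for a known human device first, then for a watermark from known tooling, and otherwise distrust by default) can be sketched as a short decision function. This is a rough illustration under stated assumptions: the `IncomingAudio` fields, the `Verdict` names, and the idea of passing in sets of registered devices and known watermark issuers are all hypothetical, not a description of any deployed system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set


class Verdict(Enum):
    LIKELY_HUMAN = "likely_human"          # layer 1 passed
    AUTHENTICATED_AI = "authenticated_ai"  # layer 2 passed
    UNTRUSTED_AI = "untrusted_ai"          # layer 3: the default


@dataclass
class IncomingAudio:
    # Hypothetical metadata assumed to arrive with a call or clip.
    device_id: Optional[str] = None         # set if the device attested itself
    watermark_issuer: Optional[str] = None  # set if a watermark was decoded


def classify(
    audio: IncomingAudio,
    registered_devices: Set[str],
    known_watermark_issuers: Set[str],
) -> Verdict:
    # Layer 1: check for human, via a device the person is known to use.
    if audio.device_id is not None and audio.device_id in registered_devices:
        return Verdict.LIKELY_HUMAN
    # Layer 2: check for authenticated AI, via a watermark from known tooling.
    if (
        audio.watermark_issuer is not None
        and audio.watermark_issuer in known_watermark_issuers
    ):
        return Verdict.AUTHENTICATED_AI
    # Layer 3: anything that passed neither check is assumed to be AI
    # and is not trusted.
    return Verdict.UNTRUSTED_AI
```

The ordering reflects the mindset shift in the answer: the function never tries to prove that content is AI, it only looks for positive evidence of a human or of permissioned AI, and falls through to distrust.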

>> You're a founder in AI. How do

you sleep at night when everything is

moving so fast? Uh what are your main

fears? What keeps you up at night?

>> You know, I think there are two

parts to it. I think the first part that

I need to mention is that it's

such an incredible opportunity

with the shift. It's the biggest

shift, or maybe a bigger shift than the

internet, and we at ElevenLabs are so

happy and lucky to be part of that shift

and be leading on the voice frontier. So

I I think the and I think that the team

and all of us are feeling that that we

have unique opportunity that never

happens in your life that you can create

a technology define how it will be used

and hopefully create value across across

whether it's voice agents and how voice

interface will look in the future

whether it's making content global

whether it's making content available in

audio. Um but of course with all of that

as we think about being at the frontier

it like also makes us carry some of the

responsibility for how we define that.

So um so a lot of our parts will will

stem from that. I think the first one is

we still think there's innovations on

the research level that you can bring

into the space at least one or two big

ones in audio and we've been able to do

it so far in text to speech speech to

text recently in music but we still want

to continue leading and continue being

better than some of the biggest labs in

the world whether it's uh some of the

new new AI companies or all in we think

we have that opportunity and uh and that

is motivating but of course definitely

causes less sleep at night. Uh um the

team is is is super hardworking too

which which makes for shorter nights. Um

then from the risks perspective we spoke

about some of those we uh we do feel

like it's our responsibility to make

sure that we avoid some of those risks.

So we are trying to invest a lot of time

in developing safeguards around that.

Then of course the third one is, with a lot

of the technology, how the economy, or

how the jobs in that economy, will change.

And we would like to do it in a way

which brings a lot of the people in that

economy together with the change, rather

than it being a change that just affects

and disrupts them. How can some of

the people that want to be part of it be

part of that disruption too? The

voice ecosystem that we built is

part of that reason. But

of course I think we need

to keep hiring amazing people and

keep pushing ahead as well, while so

much is happening. I still think it's

very early. I may be biased and

self-serving here but but it's it's still

very early.

>> You mentioned jobs that are being

replaced with voice technologies. What

do you think are the jobs that are at

most risk? I guess like customer support

and what should these people be doing

now to not get replaced in a couple of

years?

>> I think the the trope and and I think

it's very true is that or the people

that will be replaced will be replaced

by people that use AI. So I think this

is the key message that like you should

effectively go into trying a lot of

those tools um uh and products so you

stay at the at the frontier and then the

people that are in any of those jobs

that use AI I think can actually benefit

a lot a lot too and um even in customer

support of course a lot of that will

will will shift but for example what we

are seeing is that the simple manual

tasks of, I don't know, appointment taking or

doing and processing a simpler refund

And all of that is uh is like very

manual, very recipe based in most cases.

But then as you go to the more complex

parts, you need a human expert to help

close that gap. Um and that part of the

process is actually even more in need.

Whether it would be debugging a harder

problem that that you have in the

product, whether it's understanding your

what happens after the appointment,

there's a specific thing you receive and

you want to decide whether you need the

X or Y uh help which of course needs to

go for some of the regulation too. But

for all of those you kind of the the

pattern is that the expertise is even

more valued. And of course over time I

think the AI will start shifting and

taking more of that. So there will be

like some percentage that goes across.

Um but uh but that'll be my my my main

piece of like if you understand how AI

works, you can become more of the expert

and better knowledgeable yourself. Um

and and and help and that's also true in

a creative space too. I think in the uh

so so much you can do so you can iterate

so much more frequently. You can produce

to the wider audience.

>> You have to go faster and faster. That's

what I'm feeling with this. You can

definitely do faster iterations.

>> You have to run to stay where you are. I

don't know if you get this feeling, but

for me it's like the world is speeding

up every single day.

>> I do think it's speeding up, but at the

same time, I think it's not zero sum

where it's not uh by by speeding up in

this category doesn't take away from

another category. I think the entire

economy is just growing as well with

with a lot of that adoption. So there

will be more creative opportunity um

than it ever was before and yes to be

part of that creative opportunity you

probably need to move faster with a lot

of the innovation than you might have

needed to before but you I I think still

like a wide set of of of people can can

and will benefit but of course you know

it's going to a lot of the the

repetitive manual non-talented

intelligence non like basic intelligence

based work will be will be replaced with

well AI workflows. Um and the best the

best way to to avoid this is is is by

learning a lot of the AI tooling. So you

yourself are better and and maybe just

to finish off and maybe to summarize the

customer support piece thinking about it

slightly differently and outside of

customer support is that frequently if

you have a domain expertise whichever

domain that is then you that's that's

where you can um deliver even more

value. So combin combining your domain

expertise with AI is um is is is much

higher uh um value and and and and

output. And if you don't have domain

expertise then you probably want to gain

that domain expertise uh which which

which which would be

>> yeah I've seen a lot of graphs for like

future of jobs reports and uh there's

this section like your expertise plus AI

and it goes like this in terms of

demand.

What would be the tools that you would

recommend everyone to start using now?

Name top three AI tools.

>> Top three AI tools. Okay. Outside of 11

Labs, which you do need to try and use.

Uh I would say I really like Black

Forest Labs for for their for their

image uh image work. I mean

Midjourney has been cranking out for for so

many years, but Black Forest Labs I

really like as kind of the uh new

iteration. And I think they have a good

realism and I think they will go through

a set of additional iterations that that

are that are great from the classic

ones. Um I mean Anthropic's Claude Code I

think it's incredible. Uh where where

where I think it helps you like be

another level engineer or even if you're

not engineer try to be a little bit more

of the engineer. And then last one I

would really I really like Lovable. Um

but similarly I mean v0, Replit are

great. Yeah.

>> Uh uh but uh but given given we are in

Europe, I I feel uh Lovable deserves the

the the

>> they're from Sweden, right?

>> They are from Sweden. Yeah. Uh but all

of them I mean it's it's just so

incredible to see like our go-to market

teams try whether it's Lovable, v0 or

Replit. Um I think now Figma also

launched their so I haven't tried it yet

but uh that's uh it's it's it's fun to

see how like people that haven't been

traditionally on the engineering front

are closer and they understand the

product pain points they understand the

use case all better. So there's both

this path of like prototyping showing

the clients which is amazing but then

also by extension they are effectively

getting closer to what is behind the

scenes on the product side too.

>> Yeah. And when when you mentioned

Lovable, do you build something for

yourself or for 11 Labs?

>> Um both. So on the go to market side, we

frequently will do a demonstration to to

a customer of like let's say we were

doing the use case that you mentioned.

We could build a prototype on a mockup

website of how the checkout would look

like, how the agent would interact with

you. That's that type of um type of use

case all the time, whether it's on the

pre or conferences or with the client

calls. Uh but also on a personal side, I

recently tried with my two nieces to um

to they are five and seven years old. So

I have the best job of being fun or

trying to be. And they um uh uh they

were we were speaking about uh how they

could potentially create a story

generator for themselves where you would

type in the character names and the

story would be created.

>> You're an entrepreneur. You started this

company, spotted this opportunity. Do

you see any other areas aside from voice

where people should be doubling down?

Because um one of the founders I had on

this podcast told me that uh actually

co-founder of Hugging Face, he told me

that in the in the next 5 years you have

to be an entrepreneur or you're done. So

a lot of people are learning how to

become an entrepreneur. Can you name

some opportunities that you see that can

make people decent amount of money so

they can make a living like 10k a month?

Something that's immediate something

that you see a gap in the market. it

will be voice specific but I think it's

so so early that I think it's it's a

huge one is uh there's definitely a lot

of the infrastructure being built for

the voice agents we we we build it but

other companies are are are too um and I

think there is a big gap between voice

agents and then actually deploying them

in a lot of those businesses and you

don't have to have the engineering

expertise to deploy those voice agents

the platform now frequently will support

a relatively self-serve manner of

taking it but you can easily

take that voice agent and deploy that in

specific domains and most of the

businesses in the world still

don't know about it. If

it's you know not um venture scale uh

business and you just want to make good

money I would try to take those voice

agents and go to um let's say local

doctor's office and help them

appointment schedule for for for the

dentist so they can take appointments

more easily and they can then focus more

on the work instead of nurse doing that

in between or missing appointments

that's actually one of the most common I

don't know the the percentage but so

frequently those those appointments

don't get booked because there's no one

on the phone and can take them um you

can go to local mechanics and help them

take appointments and I think there's

all of these require slight variation of

the domain piece that you need to know

and all of those businesses are in

thousands to tens of thousands of

dollars per month if you get it to the

few um the infrastructure is there you

just need to bring it to to to those

domains

>> yeah it's like B2B, automate

businesses with AI.

>> Yeah. And small businesses all

>> you don't have to be a coder.

>> You don't have to be the coder. You just

need to spend the time call them and ask

or or or go to them. Um and I think

there's like this category which might

not be um taken off by by some of the

biggest companies that will focus on

bigger enterprise uh elements. So like

you know classic uh uh this is like

small medium businesses rather than than

than the enterprise segment. And at the

same time most of those companies just

don't know this is possible. So like

next year or two is just an incredible

opportunity to do it. And of course you

know starters in English speaking but I

think the same is true for so many of

the of the countries and languages which

which might be uh given so much of that

work isn't always localized. I think in

our case uh we're doing a pretty good job

there. You can you can bring it to a

local market and do exactly the same the

same work.

>> Absolutely love it. Thank you.

>> Thank you. So if you were starting a

company today and you're a brand new

entrepreneur, what would be your advice

for anyone who's starting out?

>> The first advice would be that you

deeply understand your your user um and

the problem that you're trying to fix.

Like I think that would be the first

piece. It's like do I know the problem

and do I know people have that problem?

because you you started 11 Labs because

you were uh you didn't like the

transcribing the translation of

>> this is a super um super crazy piece uh

that in Poland if you watch a movie all

the characters whether it's a male or

female character are narrated with one

single voice

>> with no intonation right

>> no intonation it's flat exactly exactly

>> I think it was the same as post-Soviet

times and

>> exactly because and it still continues

today you know it was it was an kind of

obvious when we started looking into the

audio space and then realized that this

is still a problem something we grew up

with something that you ask any Polish

person or most of Polish people and they

will tell you how bad of an experience

that is as you can likely imagine it's

pretty bad and it will it will it will

change and um and it will take an

obvious okay if you think about the

future you will have all different or

different um uh original voices

represented so if the movie is streamed

you just hear exactly the same language

of course expanded from the dubbing to

to just uh voice overs and speech

because so much of the content isn't

available in audio in the first place

and and now a lot of voice and stuff but

it was a very clear problem and I think

as I think about starting a company or

if I were to start a company again I

would try to obsess about the problem

and then the second one is um do people

actually have that problem is it

actually a burning problem and in

dubbing it was a good example where we

we thought the dubbing is the biggest

problem but before we actually solved

the dubbing We realized from a lot of

conversation with users that there are

so many other problems that they would

like to fix first. The most common one

is actually one you mentioned where

people just wanted to repair lines after

recording um or just being able to

deliver voice over without speaking and

that was like the most common thing

after we tried to reach out to people

like oh before we had it ready it's like

hey we are almost finished with our

dubbing product would you like to dub

your movies and they most likely we

would get some small percentage of

replies and then in inside of those

replies it would be yes this would be

interesting but actually if you could

help me with just my voice and it yeah

that would be much much better. So then

we were like okay there's this

incredible opportunity that's smaller uh

component of the technology we want to

build that we should we should do

instead uh first and and and we did and

then we validated that again and people

were yes that's that's that's something

we would love and then um given we

started from um creators on on on social

media

uh after we heard this but then we

realized that there are actually other

people not on social media that also

want voiceovers being The biggest group

for us was book authors initially.

Everybody just couldn't record audio

books. Exactly.

>> Because that's like a few days in the

studio.

>> Few days in the studio. Very expensive.

So many people get tired with the voice.

So it's never as as expected initially.

So it takes more than that. And then

that turned out to be like second of the

first biggest ones. So

>> but you actually built the dubbing

product first and you realized nobody

wanted to pay for it.

>> Yeah. So we we we did the prototype. We

did. Yeah. psych was a little bit of a

like you know like a stitch up of not um

not it did it did have a little bit of

our own research but uh but it wasn't it

wasn't um months of work uh it was like

we we created a prototype we while we

were building the prototype we're

reaching out to customers like we we

were working on this do you want it we

had a good waiting list then we tried to

show them what it what it how it looks

and they were like oh this quality isn't

as But if you could actually help me

with this and this instead, it would be

better. Which is the same technology

because people notice that if you dub,

you can hear the voice of the person in

the other language, it still sounds the

same. And and it turned out that the

problem was even earlier. It's like, oh,

just my voice.

>> I love that. I love how you started with

the surface. Then you went deeper and

built the whole technology that solved

so many problems that were in the

surface as well. Yeah, that that's how

so I think yeah I think the

entrepreneurs building today if if they

understand the problem then and and and

of course the I'm in very lucky position

where I know my co-founder now for 15

years and know it know know him inside

out and he is the the genius behind a

lot of the the work we we we do but I

think that would be my second piece

where like you want to really pick your

uh co-founders and the early team as

carefully as you as you can as these

will be the people you will spend most

of the nights and and years ahead

success depends all of that the culture

depends on that. So um and then

similarly we're very very um happy to

have some of the best early joiners like

one of the people on the growth side

we trusted inside out and two of our

engineers turned out to be some of the

most hardworking and smart

engineers we we have which set up the

culture bar very high.

>> Nice. Nice. Okay, I'm going to uh wrap

up with this question. As a person who's

been advocating learning languages, will

people still learn languages in 3 years?

If they can have their AI authorized

voice speaking any language, join any

Zoom call, the only thing that's left is

maybe a one-on-one conversation, but

then maybe we have a device that

translates everything.

>> The uh interesting one I think they

will, but not always the primary

purpose will be for for understanding

others. It will be frequently for uh um

just developing yourself as a more of an

enjoyable thing you want to do for your

own sake

>> like horse riding right from a

necessity to a hobby right

>> to to more of a hobby and of course

there are like parts that by learning

language you learn the culture you learn

and your your kind of your perspective

opens I think that still will be true

>> or if you're moving to another country

>> are you moving to the country

>> I mean like if you want to move to the

US you would still learn some English

right

>> hopefully will not need to do it and you

will still be able to understand a

culture in a level that you never could

before. So

>> Hitchhiker's, it will be like a Babel

fish variation like headphone maybe

device maybe Neuralink but even in

those cases there will be some

processing time involved because you

need to finish speaking for the device

to pick it up and then translate it. So

language natively speaking will be

better. Uh but yes, I do think most of

that need will disappear for you to be

able to interact and and and understand

which I think will be a beautiful thing

and then hopefully you can you can learn

it for other purposes.

>> Interesting how the whole industry is

like might disappear or might transform

completely but it's it's happening not

to just language learning, it's happening to

everything

>> 100%. But I think it will stay. Uh it's

uh I don't know if definitely it will

morph but uh but some some some of that

will definitely stay.

>> Thank you so much M. It was very

inspiring and very practical. I love

that

>> and thank you so much for being an early

user and all the feedback as well.

>> Thank you. And I'm hoping we're going to

integrate the sales part. I'm excited

about that. Talk to my team right now.

Let's go. Thanks.

>> Thank you. Thank you.
