Introducing Sora 2

By OpenAI

Summary

## Key takeaways

- **Sora 2: A Leap in Video Generation**: Sora 2 represents a significant advancement in video generation, excelling in physical interactions, motion, physics, and body mechanics, marking a substantial increase in realism compared to previous models. [00:50]
- **Sora 2 Enhances Narrative Coherence**: Sora 2 is notably improved at generating longer, more coherent narratives across multiple shots within a single generation, overcoming a previous limitation where video generation often required a shot-by-shot approach. [03:13]
- **Cameo: Personalized Video Generation**: The new Cameo feature in Sora 2 allows users to insert themselves or others into any generated environment by observing a short clip, enabling personalized video creation with deep understanding of the subject. [03:50]
- **Sora App: A New Communication Medium**: The Sora app is designed as a new video-based communication medium, featuring an AI-generated feed where content is posted by humans, offering a unique experience distinct from traditional social media. [04:37], [05:08]
- **Controlling Your Likeness with Cameo**: The Cameo feature offers robust control over one's digital likeness, requiring explicit permission and a validation process to prevent impersonation, with users having full rights to manage their generated content. [09:21]
- **Remix Feature Fuels Creativity**: The remix feature within the Sora app allows immediate participation in trends and storylines by enabling users to create their own variations of existing content, fostering rapid creative iteration. [11:10]

Topics Covered

  • How Sora 2 Redefines AI Video Realism.
  • Cameo Makes You Part of Any AI-Generated Scene.
  • Remixing AI Content: A New Form of Participation.
  • Is an AI-Generated Social Feed a New Medium?
  • How Sora Prioritizes User Safety and Control.

Full Transcript

One year ago, Sora 1 redefined what was

possible with moving images. Today,

we're announcing the Sora app, powered

by the all-new Sora 2.

It's the most powerful imagination

engine ever built.

And it's packed with new features. I'll

pass it to Bill for more details.

Now, every video comes with sound.

Sora 2 is also the state-of-the-art for

motion, physics, IQ, and body mechanics,

marking a giant leap forward in realism.

And we're introducing Cameo, giving you

the power to step into any world or

scene and letting your friends cast you

in theirs.

On the path to AGI, the gains aren't

just about productivity. It's about

creating new possibilities.

>> It's also about creativity and joy.

1 2 3 4

That's why we're launching Sora 2 inside

the Sora app, allowing everyone to push

the limits of their imagination and

create in ways we never thought

possible.

>> Welcome back to reality. I'm Bill. I'm

the head of Sora.

>> I'm Rohan. I lead the Sora product team.

>> I'm Thomas. I lead Sora engineering.

>> Back in February 2024, we introduced

Sora 1. We really view that internally

as being the GPT-1 moment for video

generation. It was the first moment

video really felt like it was starting

to work and simple behaviors like object

permanence started to emerge from

scaling up pre-training. Since then, the

Sora research team has been hard at work

delivering the next step function change

in model capability. And we're super

pumped to show you guys Sora 2 today.

Sora 2 is our flagship video and audio

generation system. And you just saw a

taste of what it can do. The first thing

you'll notice when you get your hands on

this model is how much smarter it is at

physical interactions than any prior

video generation system. In the past,

really complex dynamics like an Olympic

gymnastics routine or maybe doing a

backflip on a wakeboard were really

tough. Sora 2 is much more robust at

handling these types of complex

collisions and modeling dynamics in a

way that feels extremely natural. The

team's also done a lot of work to

improve the steerability of Sora 2

relative to prior models. Oftentimes

you have to use video generation systems

in kind of a shot-by-shot manner. It's

really tough to get out a longer

narrative that contains multiple shots

all in the same generation. Sora 2 is a

lot better at this and can tell longer,

more coherent stories all in one go. Of

course, the big feature here is audio

generation. This is the first Sora model

that simultaneously generates both video

along with audio. And it's a very

general purpose system. You can generate

dialogue in a variety of languages

spanning multiple speakers. You can

generate sound effects and even

soundscapes. We're really excited for

you guys to get your hands on this

model, but there's one feature above all

else that we're really pumped about.

It's a new feature called Cameo and it's

unique to Sora 2. The way it works is by

observing a short clip of say me, Rohan,

or Thomas. You can then take that

individual and insert them into any Sora

generated environment. You just saw a

few examples with me and Sam in the

previous video, but this is a very

general purpose capability that emerges

from our world simulation models. The

way it works is that ultimately by just

observing any clip of not even a human,

but even a pet or an object, the model

understands it really deeply and then

can inject it into any prompt as if it

were just another text token. We're

really excited about getting you guys

access to this model. But what we really

want to show is what we're doing on the

product side to really capture all of

the magic of this model. You know, in

the early days as we were developing

these features, Sora researchers really

felt like this was a new kind of

communication. What originally started

out as like text messages and then moved

to emojis or voice notes really felt

like it was progressing into a new

video-based medium with this Cameo

feature. And it became really clear to

us over time that we needed to develop a

new product surface to really capture

all these amazing capabilities of the

model and to really get this into as

many people's hands as possible. Rohan

and Thomas have been doing a lot of

awesome work here and I'll hand it over

to them to tell you a little bit more.

>> Great. So, I know everyone's eager to

see the app. Before we go in there, just

a little bit of play setting. So, what

you're going to see is a very familiar

interface if you've used social media

before. Uh there's a notion of identity.

Uh you have a profile. You can follow

other people that you're connected with.

Um, but all the content inside of it is

going to be AI generated. It's not

posted by bots. It's posted by humans,

but it's all AI generated. And it has

this very, very interesting feel that's

uh quite different than basically

anything else I've used. It really does

feel like a new medium. Um, when you see

the feed, you're going to see all the

fun we've been having on the Sora team.

Uh, some memes have been emerging uh

as we play with the product. Uh, there's

always the constant need for GPUs to

keep up with the increasing demand. Uh

there's one about ketchup. Me drinking

ketchup for some reason which

>> I think it's based on a true story

>> I've yet to understand, but it's there.

Uh and of course there's some fun ones

about perfume and other things that

stretch the model in different ways. But

Rohan, why don't you just give it a

>> Yeah, let's jump into the app. All

right, I'm going to click the Sora app

here and we're going to be dropped into

a feed.

>> Got it. Sora is pre-revenue. If you show

revenue, people will ask how much and it

will never be enough. The company that

was the 100x or the thousand Xer is

suddenly the 2x dog. But if you have no

revenue, you can say you're pre-revenue.

>> The app is indeed pre-revenue.

So, there's a couple cool things to note

here. So, this is an example of the

cameo feature I was talking about

before, but actually this is two cameos

in one. So, this is me talking with Sam

in the same scene. And you'll notice

lots of little details here that make

these videos feel really realistic.

These little shot changes back and

forth, the natural gestures and facial

expressions on both me and Sam's face.

The natural lip sync that really

captures the dialogue accurately. All of

this is brand new in Sora 2.

>> All right, let's keep going.

>> Okay, watch this. I'm going to turn the

lights off. Whoa, wait a second. Why am

I a cartoon? That wasn't supposed to

happen. The lights are still on. This is

kind of cool, though.

>> I really love this one. I think the

dynamic range of Sora 2 is incredible. A

lot of prior models out there seem to

sort of collapse into a single

aesthetic, and Sora has such a wide

diverse range, uh, which is amazing. I

can't wait for the creativity of the

internet to have access to this.

Let's keep going. Back on the news

again, the man from last month can't

stop eating McDonald's ketchup. Straight

from the

>> Look, it's not about the ketchup. It's

about the experience. Health experts are

concerned. Back on the news again, the

man from last here. He feels alive like

a painting come to life. Every breed

carries a story. Stay close and we might

hear him. Let's see.

>> So, this cameo feature is really

general. Like I was saying, you can use

it on humans, but you can also use it on

pets. This is actually my real dog,

Rocket, uh, rendered in an anime style.

As Rohan was saying, this is a really

general model in terms of the stylistic

range that it can produce. Uh, and it

can cover anything from realism to anime

and everything in between.

>> Awesome. I'm feeling inspired. I think

we should run a generation. So, at the

bottom of the screen, you'll see this

plus button. I'm going to click there,

and we'll be dropped into our simple

composer. Here, you can describe any

idea you have in any style, uh, any

scene, transcripts, all that kind of

stuff, and you can get a video. Um,

you'll see this tray above cameos and

you'll see me here, Rohan all the way on

the left and some friends I have on the

network here who've given me permission

to cameo them. Um, let's run a

generation with a cameo. Maybe someone

most people know. Let's do Sam. Uh, Sam,

what do you think?

>> He's got to be celebrating how well this

live stream

>> celebrating how amazing this live stream

is going.

>> He's screaming about it, pumping his

fists

>> and screaming.

>> All right, we'll fire that off.

>> Much like Sora 1, these generations can

take a couple minutes. So, while that's

happening in the background, I'm going

to walk you through our Cameo feature in

a little bit more detail. You're

probably wondering, how do you set this

up? What do the permissions look like?

How do we keep this safe? Um, so let's

jump in here. It's my profile. I'm going

to click edit Cameo.

In this screen, you'll see a number of

Cameo settings. Before we jump into the

settings, I want to talk about how you

actually upload your cameo. So, I'm going

to click retake here. Um, so in this

flow, you'll be asked to record a

dynamic audio prompt. Um, so, you know,

we'll present you with some random audio

challenge. Then there'll be a liveness

check. You'll be asked to move your head

in certain directions and then that'll

get sent to our systems where we do tons

of validation basically to ensure that

no one's impersonating you and um, this

is indeed you on the network. Once

you've done that and your cameo is

approved, uh you can set who can use

this cameo. You can decide only I can

use my cameo, people I approve, mutuals,

everyone. You are in full control of

your likeness on this network. There is

no way for someone to generate you

without you having given explicit

permission and gone through this cameo

flow which is a very important principle

for us. Um a couple other notes, you can

guide the model on how you want to be

portrayed. The model is amazing but it's

not perfect. Sometimes it might

hallucinate things, might give me skinny

jeans or a weird accent or something

like that. So, I can go into Cameo

preferences and sort of tune this as I

run generations, which you know, I

suggest everyone who sets this up does.

Um, and we'll be adding things like

advanced flows to give you more control

over this. Um, but in the meantime, we

have a couple of ways of doing this

already.
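
The cameo permission levels just described (only me, people I approve, mutuals, everyone) can be pictured as a small access check. The sketch below is purely illustrative and is not Sora's implementation: the names `CameoAudience`, `CameoSettings`, and `can_use_cameo` are hypothetical, and treating the audio and liveness validation as a single `verified` flag is an assumption made for brevity.

```python
from dataclasses import dataclass, field
from enum import Enum


class CameoAudience(Enum):
    """The four sharing levels named in the talk (identifiers are hypothetical)."""
    ONLY_ME = "only me"
    APPROVED = "people I approve"
    MUTUALS = "mutuals"
    EVERYONE = "everyone"


@dataclass
class CameoSettings:
    owner: str
    audience: CameoAudience = CameoAudience.ONLY_ME
    approved: set[str] = field(default_factory=set)
    mutuals: set[str] = field(default_factory=set)
    # Assumption: set to True only after the audio challenge and liveness check pass.
    verified: bool = False


def can_use_cameo(settings: CameoSettings, requester: str) -> bool:
    """Return True if `requester` may generate video with the owner's likeness."""
    if not settings.verified:
        return False  # no approved cameo exists, so nobody can use it
    if requester == settings.owner:
        return True
    if settings.audience is CameoAudience.EVERYONE:
        return True
    if settings.audience is CameoAudience.MUTUALS:
        return requester in settings.mutuals
    if settings.audience is CameoAudience.APPROVED:
        return requester in settings.approved
    return False  # ONLY_ME: everyone except the owner is refused
```

Defaulting to `ONLY_ME` and refusing everything until `verified` is set mirrors the stated principle that nobody can generate you without explicit permission.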

>> You can also use this for a lot of fun,

which is uh we've been having fun on the

team where we give ourselves kind of

funny hats or weird things that you do.

Rohan always has a gold chain which

you'll see later but you use those

instructions to sort of guide the model

in just different fun ways. Um another

thing that is really important to us is

the idea of ownership over your own

identity. So everything that uh is

created with your Cameo when you've

authorized somebody to use it with

permission, you have full rights over in

the sense that you can delete it. Uh

you're treated like an owner of that

video.

>> Yeah, exactly. Um cool. Let's go back to

the feed. See a couple more gems.

Sora 2, the new fragrance from Sora.

Fresh, clean, and unapologetic.

>> I don't use perfume, but if it were Sora

themed, I would consider it.

>> Possibility.

>> One of my favorite features of this app

and this model. And I think something

that is uniquely like made possible with

this technology is the ability to

immediately participate in a trend, in a

storyline, in some lore that you know,

some creator is working on a universe

via this remix feature. So, I see this.

I'm feeling inspired. I want to see my

own variation of this. I can just click

this remix button here.

>> New fragrance from I can click the remix

button here and say, make this an ad.

Oh, make this an ad for... any ideas?

>> Top hat.

>> A top hat for a top hat with

giant feathers.

>> Nice. Okay.

>> Um, and boom, I fire off that

generation. Uh, and Sora will go working

on my contribution here. Um, in the

meantime, let's actually see other

remixes of this perfume.

>> Sora 2, the new toothpaste from Sora.

Fresh, clean, and unapologetic for

whoever you choose to be. Sora 2, the

smile of possibility.

>> Smile of possibility.

>> I don't speak Korean IRL, but in Sora,

anything is possible.

All right, let's keep going in the

feed.

>> Guys, check my kickflip.

>> This is our coworker Minia doing a

kickflip. Uh, this is incredible physics. I

haven't seen anything like this with any

other video generation models. I've been

trying to do this myself for about 20

years. Still working on my kickflip. But

yeah, just an incredible display of

physics by the model here.

>> The dream. Championship point for Rohan.

>> Yes.

>> Um, I want to thank the haters for

fueling me.

>> You can see my gold chain in there. And

if Thomas were to make a gen, he would

unsuspectingly get a gold chain of me in

there, which is kind of the fun of this

feature a little bit.

>> The dance

ridiculous.

Ladies and gentlemen, make some noise.

>> And finally, it's a really good range.

>> Download the Sora app.

>> We'll tell you how soon.

>> Uh, cool. Let's check on our

generations, but they might still be

going, but Oh, we got Sam's.

>> All right,

>> this live stream is going so well. Let's

go. I'm fired up. It's amazing. We're

crushing it. Come on. Thank you all.

This is a great

>> Hey, thanks. Thanks. All right, I think

the other one's still generating. So,

while that's going, I'm going to hand it

back to Thomas to talk a little bit more

about our philosophy on this app.

>> So, thanks, Rohan. Uh, I want to admit

that when we were initially going

through this project, we weren't really

sure that this would be something that

we wanted to go through as a company and

and commit to. Um, we were all a little

bit skeptical of the idea of having an

AI generated feed and what that would

feel like and whether you would lose

touch with like actual human

connections. Uh, I think once we've

started using this Cameo feature, it

really does feel different. It feels

like a new medium, like a new way of

connecting with your friends in a way

that I was even surprised by. It's a

very different mode of operation when

I'm scrolling a feed and I'm thinking

through like, oh, what could I do to

riff on that just a little bit or wait,

can I put myself in that video? Um, it

just does feel very very different. And

so I'm very very happy with the way this

is turning out on the team with the

notion of like connections. Um, one

thing, you know, we've noticed over time

is a lot of social media uh in general

has moved away from the idea of friends

and kind of family connections. Um, we

believe that Sora can lean into this

because it's just so easy to create.

It's so easy to create in a way that

wasn't possible before. And with that in

the feed, we're going to be heavily

prioritizing connected content. You also

have a following feed always available

where you can see just connected content

alone. And we also have some new

features that give you some agency in

the way you control your feed. So

there's a beta feature on the top of

feed that we'll be working on uh where

you can select uh the type of content

you want to see. So if you're in like

say a relaxing mood, you can say, I

want to be relaxed.

>> Animals.

>> Animals. We always have fun with that.

Seeing only cute animals. Um and so you

can guide the model to show you

content that is really aligned with what

you want to do at that time. Um we're

also going to be heavily optimizing the

feed to encourage you to be creative, to

inspire you in a way that's not just

about scrolling the feed.

>> Awesome. I think our gen is done. Let's

look at that.

The new top hat with giant feathers,

bold, elegant, and unapologetic for

whoever you choose to be. Plume, the hat

of possibility. I would buy

>> Cool. Um, yeah, and I'll talk a little

bit more about how we're sort of

approaching safety and moderation on

this network. Obviously, like

Thomas said, we've been, you know,

pleasantly surprised internally. We were

skeptics of just a purely AI-generated

feed, but have felt this human

connection. Um, we felt like this was

the best form factor, but we wanted to

make sure we amplified the good of form

factors like this and, you know,

mitigated the bad that often comes with

sort of short form video. Um, so there's

a couple things we have in place here.

One is for U18s: we have a separate set

of policies for U18s. There's no infinite

scroll by default. There'll be a

stopping period with a cool down um,

pretty quickly in your experience. Even

for adults, we'll nudge you sort of

later on in your scrolling process if

you're, you know, if we think

you're in sort of a doom scroll loop,

we'll nudge you to creation since we

think that is like a fun thing and

usually feels good on this app. Um,

another thing that's really important is

that we want this content to be clearly

labeled AI generated when it's ever off

our platform. So, we have several

provenance techniques. First and

foremost, things are visibly um

watermarked when you export them off our

app. So if these things are floating

around other networks, you'll see the

Sora animation there. Um we also have

some techniques internally to always

trace back generations we see to Sora if

we see things floating around the

internet. And we have C2PA as well. And

then we're working on top of all

the amazing moderation that's come with

Sora 1 and image gen. We have reasoning

models under the hood that make sure

it's very difficult to create harmful

content on this network. Obviously

extremely important in the Cameo feature

that no one can create, you know,

X-rated or violent content and that is

uh the case with all the sort of

guardrails we put in place, which is amazing.

Um, obviously we're, you know, we're

starting a little conservative with our

moderation here. Uh, you might see

overblocking. We're sorry in advance.

>> Memes about that.

>> Yeah, people are memeing us internally

for overblocking. So, we're

finding this balance, you know, of user

freedom and people who might be, you

know, trying to do bad things on

network. And so we'll be working on that

over time. Um lastly, before I hand it

back to Bill, I want to talk about um a

couple of other services that we'll be

deploying. Sora one, sora.com, our

existing web app will get this new

model. There's a little bit of a

facelift you'll see, but we'll also

still have some awesome features like

storyboard launching soon that might

come in a week or so that lets you

really control like shot by shot um how

the model creates a scene. Like Bill

mentioned, there's so much

controllability and power in this model.

We really want to invest in amazing

creator tools so you can create amazing

content for our network. Um, and we'll

be launching an API in the coming weeks

as well. There's a long tail of use

cases where people can do amazing things

where we might not want to build like

fine-grained editing controls, but others

might. People might want to integrate

this into their own video editors. And

now that's possible with Sora 2. Um,

yeah. Want to talk about how we're

rolling this out?

>> Yeah. So today's the day you'll be able

to download the Sora iOS app in the App

Store starting later this afternoon. Uh

we're starting only on iOS. The team is

hard at work trying to get an Android

version up and running, but bear with

us. Uh we're launching initially in the

US and Canada and we're doing an

invite-based rollout. So like we've said, we

think it's really important that you

come into this app with your friends.

This is really best experienced in a

social way almost as a new form of

messaging. And so when you get off the

wait list and you'll get notified of

that when you download the app and get a

push notification, then you'll get four

invite codes automatically that you can

use to give to your friends to make sure

they come in with you. We're super

excited for you guys to get your hands

on these models. You know, we started

the Sora research program back in early

2023 to really build AI systems that

deeply understand the physical world. We

think that's going to be a paramount

capability in order to get to truly

generalist AGI. Along the way, we're

training a lot of models that we think

the world can have a great deal of fun

with and can bring a lot of joy. So,

we're really excited to see what you'll

ultimately create on this app. We'll see

you guys on Sora.

Ready

when you are. 3 2 1 go.

>> Dude, you all right?

>> I'm good.

Steady.

Damn it.
