Sora 2 Tutorial: How to Generate Videos Easily
By Artlist
Summary
Topics Covered
- Sora 2: Reasoning Model Beats Diffusion Models
- Unlock Sora 2 with Documentation Guide Hack
- Cinematic Prompts: Beyond Vague Requests
- Iterative Refinement: The Key to AI Video Success
- Embrace Patience: AI Generation Requires Iteration
Full Transcript
So, everybody's talking about Sora 2, but for good reason, because it's creating some insane videos that look like this. So, in this video, I'll take
you behind the scenes to break down the Artlist tools and the workflow that made these shots possible. And I'll show you a hack that will save you a ton of time and effort. And then I'll create
one from scratch, showing you every step, tweak, and regeneration along the way. And by the end, you'll know how to
approach Sora 2 like a filmmaker, where you can actually create things that look and feel cinematic. But if you tried it yourself and you got some weird or bad results, don't worry. You're not alone.
Because getting good results isn't just about having a good prompt. It's about
understanding the entire process, which I'll be showing you guys today. But
listen up, guys. We are picking one person to win an entire free year of an Artlist AI subscription. All you have to do to enter is comment what you found most valuable in this video. So, let's
break down how Sora 2 is different from other models. A lot of other models are
what we call diffusion models. And what it basically does is it takes text and image data, mixes it together, and kind of guesses at what you want. But Sora 2 is different because it's a reasoning
model. It actually thinks through what
you're asking of it because it relies on the intelligence of ChatGPT, which we can use to our advantage. So you can literally use ChatGPT for anything. You
can brainstorm ideas, create reference images, you can even have it build prompts if you know how to train it correctly, which is what I'm going to teach you guys later in the video. So,
let's break down how I made these three cinematic sequences using Artlist and ChatGPT.
>> You think an apology fixes everything?
>> No, but it's all I have left.
>> You had choices, David. You always did.
You just never chose me.
>> I did every time. I just didn't know how to stay.
>> It's not just motion. It's evolution.
Every step, precision. Every second
purpose. This is more than a sneaker.
It's the future redefined.
>> All right. Well, now that you've seen the videos, let's break down the process. For the first video, I use Art
List to generate some images for reference. Key tip, you can take these
into Nano Banana and literally make any changes you want. But now, here's the hack that is going to save you a ton of time and effort. There is a Sora 2 documentation guide on OpenAI. I'll
link it in the description, but basically what it does is it teaches ChatGPT how to build Sora 2 prompts.
Now, before you run off thinking that you're done, it's not that simple. This
is just a way better starting point.
There's still a whole creative process to get some legit results. But let me show you how to do it. You can literally just take the link, paste it into ChatGPT, and then type in something like, "These are the guides for how to prompt
Sora 2." And from that point on, ChatGPT
knows exactly how Sora 2 interprets prompts. It understands the right
structure, the format, the terminology that it uses. Super helpful. But let me show you what the documentation guide actually does. So instead of saying
something vague like "make it cinematic," which I'm sure we've all done, it teaches ChatGPT how to actually describe camera setup, lighting and palette, action and beats, tone, and
mood. And that's the difference between
a random result and something that looks like an actual cinematic scene. So let
me give you guys an example. Instead of
saying a beautiful street at night, it becomes wet asphalt, zebra crosswalk, neon signs reflecting in puddles. But
anyways, after I uploaded the Sora 2 documentation to ChatGPT, I brought in my reference images, and then typed in my prompt. I put in, "Create a 12-second
Sora 2 prompt that will result in a super cinematic scene of a man flying over a field, similar to these photos. It
should start with a man walking in the field and then start hovering and finally fly. The camera should track him
as he flies over a field." Now, normally
you want to be a lot more detailed with this, but since we have reference images, it has a lot more information to pull from for the prompt. So, you're
going to hit enter, and this is what comes up. I just copied and pasted it so
you can see the entire prompt. But if
you can see from the beginning how extensive it actually goes from the little amount of information I gave it. It shows the duration, the aspect ratio, the style: ultra cinematic, photoreal, dusk palette. Then you have the
character reference from the images that we gave it. But what I really love that it does is it time codes it and breaks it into different scenes to fit that 12-second period. But if you look at each
scene, it's so detailed. You literally
have the environment, the man's actions, you have camera movement, the lens type.
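The structure described here (a header with duration, aspect ratio, and style, followed by time-coded scenes that fill the full 12 seconds) can be sketched in a few lines of Python. This is only an illustration of the idea, not Sora 2's or ChatGPT's actual output format: the `Shot` class, the `format_prompt` helper, the "16:9" aspect ratio, the text layout, and the camera descriptions are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    start: float  # seconds from the start of the clip
    end: float    # seconds from the start of the clip
    action: str   # what happens in the scene
    camera: str   # camera movement for the scene


def format_prompt(duration, aspect_ratio, style, shots):
    """Render a time-coded prompt (hypothetical layout, not an official format).

    Raises ValueError if the shots leave a gap, overlap, or do not
    add up to the requested duration.
    """
    cursor = 0
    for shot in shots:
        if shot.start != cursor or shot.end <= shot.start:
            raise ValueError(f"shot starting at {shot.start}s does not follow the previous one")
        cursor = shot.end
    if cursor != duration:
        raise ValueError("shots do not cover the full duration")

    lines = [
        f"Duration: {duration:g}s",
        f"Aspect ratio: {aspect_ratio}",
        f"Style: {style}",
        "",
    ]
    for shot in shots:
        lines.append(f"[{shot.start:g}-{shot.end:g}s] {shot.action} Camera: {shot.camera}")
    return "\n".join(lines)


# The flying-man idea from the transcript, split into three 4-second scenes.
# The camera notes are illustrative guesses, not the prompt from the video.
shots = [
    Shot(0, 4, "A man walks through a field at dusk.", "slow tracking shot from behind"),
    Shot(4, 8, "He begins to hover above the grass.", "low-angle push-in"),
    Shot(8, 12, "He flies over the field.", "tracking shot following him"),
]
prompt = format_prompt(12, "16:9", "ultra cinematic, photoreal, dusk palette", shots)
print(prompt)
```

The check that the scenes add up to the full duration mirrors the point made in the video: the time codes are what make the scenes fit the 12-second clip.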
But the main point being, if you tried to type this all out by yourself from your brain, it would take forever. But
even though we have all of these details, it's not going to give you a perfect result out of the gate. But if
you guys want to learn more about AI filmmaking, please consider subscribing and hit that notification bell to always stay updated. All right, so let's copy
and paste this into Artlist. And then
we're going to select Sora 2 Pro. There
are a ton of great models to choose from as you can see here, but today we're working with Sora 2 Pro. And this is what I got.
Well, that's just a good example that even with this crazy prompt, you can still have mistakes and you're going to have to make refinements and tweaks along the way. And so I went back, made some more tweaks, ran it again, still didn't get what I wanted, and I just
continued that process. But if you look here, these are all the different variations of the prompt and the tweaks that I made. And I'll go more in detail about specific refinements and tweaks when I build one from scratch, but I just wanted to show you guys the overall
workflow of how I created these three sequences. But also, key thing to
understand is that you never get the same result twice. You can use the same prompt, generate 10 times, and get 10 different results. Just something to
keep in mind while you guys are working on your stuff. So, for the other two videos, it's the exact same workflow, but we didn't have reference images for this. So, you want to be much more
detailed. You want to include details
like environment, lighting and mood, time of day, emotions and poses, camera movement, dialogue and how it's spoken, and texture and details. So, even though
ChatGPT will do a lot of the work, you still want to include as many details as you can to get close to the vision that you want. But also keep in mind when it comes to dialogue, it can do things like this.
>> Beautiful choices, David. You always
did. So, you just want to specify who's speaking. And also, another tip, if you
don't have reference images, you can reference a style that you like. So, you
can even say something like "in the style of a Nike ad," or you can be more vague and say something like "a luxury sports brand." But, let's get to the fun part.
Let's create one from scratch. I really
do like dystopian worlds, stuff like Dune, Blade Runner, riding like a futuristic motorcycle, holographic map, pulls out a staff that extends into a
spear. Uh, let's try it. So, I actually
already had this reference image from a previous piece I did. I'm just going to bring this into Nano Banana to put her on a bike. And then we're just going to use those as reference inside of ChatGPT. So, it kind of knows what to
describe. So, "I need a 12-second Sora 2
sequence of this character riding a futuristic grungy motorcycle speeding through a barren land with natural motion blur. Tracking shot with natural camera shake. She stops, gets off,
brings out a device that projects a hologram. Then she hears something,
turns around, pulls out a staff, extends into a spear. Full body framing. Then a
close-up of her eye dilating. Notes: dramatic lighting, slight haze, golden hour." Hit enter. And this is what it
gave me. Basically everything I showed
you before. Everything split up into
time codes. Let's copy the code. We're
going to bring that into the Artlist generator. Looks like Sora 2, 12 seconds.
And then generate.
Okay. Oh, you're just frozen sideways.
Okay. Dismount was weird.
You look kind of cartoonish.
Okay, not terrible. Not great either.
Let's go back into ChatGPT. I basically
said I want more speed on the motorcycle. Have the camera orbit around
her while the ground blurs. Then I also said her stop needs to look intentional and she gets off smoothly. And then for the device shot, I want it angled over
her shoulder and to make it more photorealistic. Now, normally writing prompts
by yourself, you want to be very intentional and specific about the type of language that you're using. But when
you're typing it into ChatGPT, it'll take your directions and then expand on that. So, you don't have to be so
specific on that. Take this for example.
I said, "Have the camera orbit her." You
go down into the prompt and it says, "Camera orbits her from front-left to right-rear, tracking tightly while she leans into the wind." And so, think of it like you're the director telling the cinematographer what to do and then he
handles all the nitty-gritty things of how to make that happen. if that makes sense. Okay. And this is what I got. I
like the orbit. Okay. And you're holding the spear. Weird. All right. So, I said
the shot of her stopping is not working.
Let's replace that shot with the front-on shot of her out of focus. And as she slows down and approaches the camera, she comes into focus. We need to be more specific with how she holds the spear.
Hold it with her right hand outstretched like she's pointing at something. All right, let's see what we
got. Orbit.
Okay. Did it from behind, not the front.
I'm going to trust that the next generation is going to fix that coming into focus shot. I feel like sometimes it just makes little angle mistakes. So,
I said remove the macro shot at the end.
Extend the scene with the staff. The
staff should be forearm length and extend into a 6-foot spear. Add more
mechanical elements of things snapping into place. And as she looks and sharply
turns her head, have her emotions be more surprised and a little more worried. What do we get? Oh, we've lost
the plot. What are your hands doing?
So, key tip, patience. Results can take a lot of time to fine-tune. Before we
try and make any more, let's try and regenerate and see if that actually was a consistent thing. Okay. Yeah, I think that was just a one-off weird glitch. I
think I just need more energy. So,
that's what we're going to add into the next refinement. I type in, "Always keep
camera moving to keep the energy high for all scenes." And then, "FPV drone for the bike, handheld for the rest."
Oo, I like that. High energy
dismount. Okay,
not bad.
Ooh, she does look great. Ooh, that's
sick.
Okay. All right, we're getting closer to what we want. If you get to something where you're close, instead of making adjustments, try regenerating to see if you can get something that you like.
Ooh, that looks sick, actually.
That looks super real. That looks sick.
Holy crap.
Oh, we've done it, baby. That's fire.
Let's watch that again. That's pretty
much it. That's the full process from idea to final shot. You don't have to use ChatGPT, but it is a super useful tool if you know how to use it correctly, which you now do. Well, I
hope you guys found this video helpful, and make sure you guys subscribe for more AI filmmaking tips and tricks, and we'll see you next time.