How to Create EPIC AI Films With WAN Animate & WAN 2.5 (Open Source and Uncensored)
By CyberJungle
Summary
Topics Covered
- Part 1
- Part 2
- Part 3
- Part 4
- Part 5
Full Transcript
Wan 2.2 Animate is by far the best character animation and replacement AI model on the market. Take a look at this video. What we are witnessing is remarkable. We are at a pivotal moment in history where the line between what's real and what's artificial is becoming increasingly blurred. This isn't the output of a Hollywood studio with a massive budget. This is open source and uncensored. This model is uncensored, so I can say trash in an AI video.
>> This is something you can run on your own computer. With just a few clicks, the results are so convincing that most people won't be able to distinguish them from reality. In this video, I will show you Wan Animate's capabilities, its strengths, and how to optimize for the best results. Another model from the same Wan family, Wan 2.5, offers 10-second 1080p cinematic shots with smooth camera movements, built-in audio, and lip sync, similar to Veo 3.1 and Sora 2. So, in this video, in an epic prompt battle, I will compare Wan 2.5 against other state-of-the-art AI video models like Kling, Veo 3.1, Sora 2, and LTX-2 Pro to crown the ultimate AI video model with talking characters and sound effects. Welcome to the crazy world of AI videos.
I'm using Wan Animate on the official Wan website. You can find the link in the description, but you can also access it on Higgsfield, for example. Wan is right here under the model selection. You can see Wan 2.2. So to access Wan Animate, you need to come to the prompt box here and choose the Avatar option. Once you pick Avatar, you have a couple of options. The first one is Character Swap and the other one is Photo Animate. In this video I will mainly focus on these two.
Character Swap allows the user to upload an existing video, then upload a photo of a new person, and then Wan Animate will accurately swap the new person into the video. Essentially, this feature enables the user to put anyone into any video clip they desire. The Character Swap workflow involves uploading your source video, where you basically just perform. This looks a little bit crazy, but just to show you the capabilities I exaggerated a little bit from emotion to emotion. The second step is uploading the image of the person you want to swap in. While generating on Wan's website, I highly recommend you pick the Pro option because it really makes a difference. And then you can hit generate. Let me show you some of the results. So guys and girls, this is really incredible stuff.
You can see it tracks my whole body. It tracks my mouth movements, my facial expressions. It's really fantastic. And you can create amazing AI films using this method. Don't stop believing.
[laughter] No, we will probably cut this part because it was cringe. But
>> Yeah, and, uh, as I said, there's not much going on, but here we have actually my dog Lonnie. More important than Lor's butt. Um, and she loves cuddling and she loves humans so much and she's enjoying her time here. Come with me. Um, this is called Moza. It's a town in Germany. It's pretty chill here. Honestly, there's not much going on. Um, I can show you a few things. For example, here we have an apple tree. And I really found this apple tree pretty cool because if you feel like a snack, you can just pick one of these and it's a juicy, delicious apple.
The result is pretty impressive. Body tracking, mouth tracking, facial expressions. It looks pretty impressive. It's cool stuff, you know.
>> Different AI video models have different superpowers.
None of them wins everywhere. I mix and match. Wan for motion realism, Veo 3.1 for making my characters talk, Kling for consistent frames. I need one place that speaks with all of them. That's Flora, the sponsor of today's video. Right now, I need high-quality product photos for my e-commerce site and ads for my CyberJungle merch collection. I want to drop my logos and designs onto hoodies and t-shirts and generate studio-quality photo shoots or vertical product ads for social platforms. No photographer, film crew, or models required. Flora's node-based UI makes it simple, but the cheat code is their pre-built workflows. I can just clone one of the projects, swap my hoodie and t-shirt assets, tweak the copy, and hit go. Then the system outputs studio-grade stills and vertical and horizontal ads that are social-platform ready.
My typical run is: I drop my logo, pick the garments, and generate clean stills. Once I have the stills, I can spin the model 360 and auto-cut a vertical ad with captions. Export, post, done. For 360 product hero shots, I use Kling's first frame and last frame inside Flora to keep the style and identity locked. For voice and performance, Flora plugs into Veo 3.1 and Wan 2.5, so UGC-style spots can literally say what I type with natural feeling.
>> Want strangers to save your reels like recipes? Drop UGC in the comments. I'll send you a link.
I can keep the same model across scenes, swap outfits, or change poses and environments using Flora's Nano Banana integration. It keeps character consistency, so the campaign feels like one brand, not a collage. If you or your team want to give it a try, there's a 25% off link for the CyberJungle community. No code, it applies automatically at checkout. Just click on the link in the description or the pinned comment down below.
There are a few limitations you need to be aware of. First of all, if you have long hair like me, it may struggle with characters who have no hair or a bald head, like in this example with The Rock, Dwayne Johnson, or similarly in this example with Thanos. The original character has no hair and in my performance I have long hair, so it really struggles at the edges. You can see it here. The second limitation is when the character in your source image is holding an object. During character swapping, the model can struggle to render the hands properly, because in the original video performance, you can see right here, I actually don't have anything in my hands. Then here you can see the model is struggling quite a lot trying to render the hands of the character. It's essentially trying to add a baseball bat to the hands. At some point it just doubles. So, it's always better to upload an image where your character's hands are completely free.
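The two Character Swap limitations above can be turned into a quick pre-upload checklist. The following is a minimal sketch, not part of any Wan or Higgsfield API: the `SwapInputs` class, its field names, and the warning texts are all hypothetical and only encode what the transcript describes.

```python
from dataclasses import dataclass


@dataclass
class SwapInputs:
    """Hypothetical description of a Character Swap job (not a Wan API type)."""
    performer_has_long_hair: bool   # the person in the source video
    target_is_bald: bool            # the character in the reference photo
    target_holds_object: bool       # reference photo shows an object in hand
    performer_holds_object: bool    # source video shows an object in hand


def swap_warnings(inputs: SwapInputs) -> list[str]:
    """Return one warning per limitation mentioned in the transcript."""
    warnings = []
    # Limitation 1: long-haired performer swapped with a bald character.
    if inputs.performer_has_long_hair and inputs.target_is_bald:
        warnings.append("hair mismatch: expect artifacts around the head edges")
    # Limitation 2: the photo shows an object that the performer is not holding.
    if inputs.target_holds_object and not inputs.performer_holds_object:
        warnings.append("object mismatch: hands may render badly or the object may double")
    return warnings


if __name__ == "__main__":
    job = SwapInputs(
        performer_has_long_hair=True,
        target_is_bald=True,
        target_holds_object=True,
        performer_holds_object=False,
    )
    for warning in swap_warnings(job):
        print("WARNING:", warning)
```

The flags have to be filled in by eye; nothing here inspects the actual image or video.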
The second option under Avatar is Photo Animate. Photo Animate brings still images to life. It maps your voice and motion into a video clip featuring anyone, using a single photo. It works by letting the user pair a still image with a video of someone performing an action. Here in the source video, you can see that I'm just performing, holding a ball and running slowly. Once you upload this, it will pair your performance with the source image of Rocky running and then create a performance like this. The character in the image will be performing your actions from the real-life video. This process allows any photo to move in a really natural way. The workflow is very simple. You can see a few examples on the screen. It's quite fun to use this feature, and you can really create some nice AI film scenes and some action which is extremely difficult to create using AI video tools today. Here we have AI actors acting and performing in ways that are normally very difficult to achieve with normal AI video models.
There are, of course, a few limitations here again. In this video you can see that the light coming from the lightsaber looks really thin. This is because the object I'm holding was also pretty thin, while in the character reference image you can see that she's holding a pretty thick lightsaber. Then in the end result, the model is struggling to match the object's thickness with the object I'm holding. So you really want to ensure that, size-wise, the objects match each other. The second limitation is that, while using Photo Animate, you need to ensure that your legs, your feet, and your hands are always on screen. If it can't track your feet or hands, the output will struggle.
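The on-screen requirement above can be checked before uploading a performance. This is a minimal sketch under a loud assumption: you already have per-frame visibility flags from some pose-estimation step of your own (that step is not shown and is not something the transcript uses). The part names and the input layout are hypothetical.

```python
# Body parts the transcript says must stay on screen for Photo Animate.
REQUIRED_PARTS = ("left_hand", "right_hand", "left_foot", "right_foot")


def frames_with_hidden_limbs(visibility):
    """Find frames where a required hand or foot is not visible.

    visibility: list of dicts, one per frame, mapping part name -> bool
                (hypothetical format; a missing key counts as hidden).
    Returns a list of (frame_index, [hidden part names]).
    """
    problems = []
    for index, frame in enumerate(visibility):
        hidden = [part for part in REQUIRED_PARTS if not frame.get(part, False)]
        if hidden:
            problems.append((index, hidden))
    return problems


if __name__ == "__main__":
    all_visible = {part: True for part in REQUIRED_PARTS}
    # Frame 1 imitates the case where one foot goes behind an obstacle.
    occluded = dict(all_visible, right_foot=False)
    for index, hidden in frames_with_hidden_limbs([all_visible, occluded]):
        print(f"frame {index}: hidden {', '.join(hidden)}")
```

Any flagged frame is a candidate for re-shooting or trimming before the clip goes to the model.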
You can see in this source video that at some point my right foot is behind the metal bar, and then the model simply loses track. I have this coherence problem in the final video, where the right foot looks super weird. Now let's take a look at some of the practical use cases of this feature.
The first thing that comes to mind is, of course, VFX. I have this performance video, and I'm going to export the first frame of this video and make some changes to it. In this example I use Nano Banana to add a fire to the tip of my finger. By doing this, I keep the original scene the same and just change one element of the scene. Then I'm heading over to Wan Animate. I'm uploading the image I created using the first frame of my performance video. Then I'm uploading this performance video, where I'm just simply moving my finger, and the end result will look like this.
So it will track my performance and just add the fire. Wan Animate opens up whole new possibilities for AI acting. Here in this video I'm acting and performing, holding this object, and now I'm heading over to Seedream 4, which is similar to Nano Banana. It's just like an uncensored version of it. It is a pretty impressive model. And I wrote the prompt: "Turn this scene to a man in World War II aiming with his rifle in the battle zone. Keep the man's face consistent." So in the end the scene will look like this. After that I'm heading over to Higgsfield, and I'm going to use the Wan Animate model on Higgsfield. To do that you need to choose Higgsfield Animate, and it will bring you to the Wan Animate option. Here I uploaded the image I created using Seedream and I added my input video performance as the source video. You need to ensure that you pick the correct one: Wan Animate, not Character Swap this time. And then I'm going to hit generate. And this will create this performance.
>> Now, where are the enemies? Find them and finish them. Where are they?
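The VFX workflow described earlier starts by exporting the first frame of the performance video so it can be edited in an image model. The transcript does not say which tool was used for that export; the sketch below assumes the `ffmpeg` command-line tool is installed and uses hypothetical file names.

```python
import shutil
import subprocess


def first_frame_command(video_path: str, image_path: str) -> list[str]:
    """Build an ffmpeg command that writes only the first video frame.

    -y          overwrite the output file if it already exists
    -i          input video
    -frames:v 1 stop after one video frame
    """
    return ["ffmpeg", "-y", "-i", video_path, "-frames:v", "1", image_path]


def export_first_frame(video_path: str, image_path: str) -> None:
    """Run the command; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(first_frame_command(video_path, image_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names; replace with your own performance clip.
    print(" ".join(first_frame_command("performance.mp4", "first_frame.png")))
```

The exported still is what gets edited (for example, to add the fire) and then uploaded together with the untouched performance video.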
>> So, a few things here. You will notice some changes in the face details. This Higgsfield option is actually using standard mode. If you want to minimize facial changes, you can use Pro mode on Wan's original website. And the second cool thing you will notice is that it also animates the background in a realistic way. So the smoke and the fire look pretty awesome. Most of the time, with this kind of AI acting model or character swapping model, the background animation was missing.
So this whole thing creates new possibilities for AI filmmaking. The next thing I want to show you is the Wan 2.5 model. To reach Wan 2.5, all you need to do is choose the Video option. The Wan 2.5 model supports image-to-video and text-to-video. You can generate 1080p videos of up to 10 seconds. And it generates audio built in, similar to Veo 3.1. It can generate dialogue, sound effects, and audio-integrated videos.
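The limits just mentioned (two modes, clips of up to 10 seconds) can be written down as a small request check. This is a sketch only: `ClipRequest`, its fields, and `validate` are hypothetical names for illustration and do not correspond to any real Wan 2.5 API; only the limits themselves come from the transcript.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SECONDS = 10                                      # "up to 10 seconds"
SUPPORTED_MODES = {"image-to-video", "text-to-video"}  # modes named in the video


@dataclass
class ClipRequest:
    """Hypothetical description of one generation request."""
    mode: str
    seconds: int
    prompt: str
    image_path: Optional[str] = None  # only needed for image-to-video


def validate(request: ClipRequest) -> list[str]:
    """Return a list of problems; an empty list means the request looks fine."""
    problems = []
    if request.mode not in SUPPORTED_MODES:
        problems.append(f"unsupported mode: {request.mode}")
    if not 1 <= request.seconds <= MAX_SECONDS:
        problems.append(f"duration must be 1..{MAX_SECONDS} seconds")
    if request.mode == "image-to-video" and not request.image_path:
        problems.append("image-to-video needs a start image")
    if not request.prompt.strip():
        problems.append("prompt is empty")
    return problems


if __name__ == "__main__":
    request = ClipRequest(mode="image-to-video", seconds=12, prompt="vlog on a phoenix")
    for problem in validate(request):
        print("PROBLEM:", problem)
```

Using the same request description for every model in a comparison keeps the prompts and durations identical across runs.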
So, it's better to have a prompt battle to understand the capabilities, strengths, and weaknesses of a model. In the first test, I'm starting with an image-to-video prompt. In the battle, I have Wan 2.5, Veo 3.1, and LTX-2 Pro, which is a model that dropped recently. It's also an AI video model with talking characters. Sora 2 is not here because its image-to-video functions differently, but in the text-to-video section, I added Sora 2 as a competitor as well. Let's see the Wan 2.5 result.
>> Hello my darlings. Welcome to my channel. I'm flying on a Phoenix today. Drop a comment and like the video if you want me to jump from here.
>> Veo 3.1 result.
>> Hello my darlings. Welcome to my channel. I'm flying on a Phoenix today. Drop a comment and like the video if you want me to jump from here.
>> LTX-2 Pro result.
>> Hello my darlings. Welcome to my channel. I'm flying on a Phoenix today. Drop a comment and like the video if you want me to jump from here.
>> Looking at the results, Wan 2.5 and Veo 3.1 are very close to each other. I think they both did a great job. LTX-2 Pro seems a little behind in this challenge, but it was a very strong performance from both of the other models. Next challenge, Wan 2.5 result.
>> Darlings, I'm parachuting into a battle royale. I hope it's not too dusty down there. I don't want my hair getting dirty.
>> Veo 3.1 result.
>> Darlings, I'm parachuting into a battle royale. I hope it's not too dusty down there. I don't want my hair getting dirty.
>> LTX-2 Pro result.
>> Darlings, I'm parachuting into a battle royale. I hope it's not too dusty down there. I don't want my hair getting dirty.
>> Looking at the results, with LTX-2 Pro there is always a strange noise at the end of the video. I don't know if this is a glitch or a bug in their system, and it just randomly gave our character a British accent, which was not something we asked for. Looking at the Wan 2.5 and Veo 3.1 results, they both did a good job. One minor issue with Wan 2.5 is that at some point she drops the camera, so we don't have camera consistency throughout the scene. On the Veo 3.1 side, the camera stays consistent throughout the scene. She's always holding it and it's vlog style. In the next challenge, we have an ASMR-style video and she's whispering. Now, let's see the Wan 2.5 result.
>> Darlings, today's video is dedicated to all lions. Keep up the good work in the Jungle Kingdom. Did I mention before that I'm also a wild cat?
>> Veo 3.1 result.
>> Darlings, today's video is dedicated to all lions. Keep up the good work in the Jungle Kingdom. Did I mention before that I'm also a wild cat?
>> LTX-2 Pro result.
>> Darlings, today's video is dedicated to all lions. Keep up the good work in the Jungle Kingdom. Did I mention before that I'm also a wild cat?
>> Again, overall the LTX-2 Pro result feels a little bit more artificial, and it feels more like stock audio. Veo 3.1 and Wan 2.5 had a much more natural-sounding talking character, but Wan 2.5 didn't really have a whispering character. It didn't feel like an ASMR-style video, whereas on the Veo 3.1 side it was a clear whisper and an ASMR-style video. So when it comes to that, I think Veo 3.1 did a better job, but the difference is not so huge. In the next prompt challenge, I have quite a long prompt. Here I'm testing a long spoken dialogue. This guy is talking about his truck, and there's a long sentence with a southern accent. We will see how the models handle the long dialogue. Starting with Wan 2.5.
>> A truck. There's nothing like it. When I wake up, I kiss my truck and greet it with good morning. I call her Sheila. She's my baby. If you ever touch my truck, well, just nice knowing you.
>> Veo 3.1.
>> I call her Sheila. She's my baby. If you ever touch my truck, [music] well, let's just say
>> LTX-2 Pro result.
>> My truck. When I wake up, I kiss my truck and greet it with good morning. I call her Sheila. She's my baby. If you ever touch my truck, well, let's just say it was nice knowing you.
>> Among all the results, only LTX-2 Pro actually delivered the whole dialogue I wrote here. So, it did a fantastic job rendering the whole thing. The other models unfortunately cut some parts of the sentences and didn't render the whole thing. Veo 3.1 at some point had problems with the scale of the man. You can see the height is definitely off. He looks like a hobbit. And Wan 2.5 really struggled with lip syncing. The synchronization was off. So we can say that both Veo and Wan struggled with the long dialogue. Now I'm switching to text-to-video, so Sora 2 can also join us. Here I have a dialogue between a man and a woman. Let's see the Wan result.
>> Why do you talk to me like that?
>> Because someone prompted me to say that.
>> Veo result.
>> Why do you talk to me like that?
[laughter]
>> Because someone prompted me to say that.
>> Sora result.
>> Why do you talk to me like that?
[laughter]
>> Because someone prompted me to say that.
>> LTX-2 Pro result.
>> Why do you talk to me like that?
>> Because someone prompted me to say that.
>> Here, all models delivered the dialogue nicely. In the Wan 2.5 result, she struggled a little bit with the first sentence. Let's hear that again.
>> Why do you talk to me like that?
>> So, it's like, "Why do you talk to me," and then there's a pause, "like that." That's a bit strange. Besides that, all models gave us what we asked for. Some of them actually added music, because we also asked for tense music. There was no music with Wan 2.5. There was tense music with all the other models. In the next prompt, I have a YouTube street-interview-style prompt. Let's hear the Wan result.
>> You got to do is answer one question, right? And you get 50 bucks.
>> Ready? Yeah. Bring it.
>> Name a country that its name rhymes with
>> Germany.
>> That That is a bad answer. You lose.
>> Quite the chaos here. Veo 3.1 result.
>> All you got to do is answer one question, right, and you get 50 bucks.
Ready?
>> Yeah. Bring it.
>> Name a country that it name rhymes with Germany.
>> Romania.
>> That is a bad answer. You lose.
>> Chaos again here. The voices and the order of things are completely mixed up. Sora 2 Pro result.
>> All you got to do is answer one question right and you get 50 bucks. Ready?
>> Yeah. Bring it.
>> Name a country that its name rhymes with Germany.
>> Romania.
>> That is a bad answer. You lose.
>> This was the perfect order of spoken sentences, and it felt really, really natural. It felt super realistic, almost in a creepy way, that Sora is doing this so well. LTX-2 Pro result.
>> All you got to do is answer one question right and you get 50 bucks. Ready?
>> Yeah. Bring it.
>> Name a country that its name rhymes with Germany.
>> Romania.
>> That is a bad answer. You lose.
>> Again, there was massive chaos. By far, Sora 2 Pro did the best job with this challenge. In this section, I will test some dance prompts just to see the physics quality and anatomical rendering quality of the models and how they compare against each other. Let's see the Wan result.
Anatomically, it's fine, but I just feel like music is missing here. Without music, it feels a little bit strange. Let's see the Sora result.
[music]
Very dynamic. We also have two women and one man. Here we have three women. So from a prompt-understanding perspective, Sora did a better job. Let's see the Veo result.
>> Work that runway. Now dip
>> and freeze.
>> Veo also gave us two women and one man. And among everything I've seen so far, anatomically it also feels very natural. Now let's see the Kling result.
Okay, the Kling result feels a little bit more static in comparison to the other models. Now, let's see LTX-2 Pro.
At the end of the shot, we had a cloning problem. The man in the front just became two men; he cloned himself. Consistency problems. MiniMax Hailuo result.
Nicely dynamic. And I also like that it gave us a dynamic camera. Pretty awesome. Again, among everything I see, the most dynamic result is definitely coming from Sora 2 Pro. I also really appreciate MiniMax Hailuo's dynamism. Wan 2.5 felt a little bit more static. Veo 3.1's dynamism is also very strong, but Sora is my winner here. Here's the next prompt. We have a cabaret-style performance from the 1920s. Lots of reds and some decorative details about the stage design. Let's see the Wan result.
in the midnight
>> Sora result.
>> Veo result.
Wow.
>> This was a pretty strong performance from Veo. Kling result.
Okay, this feels a little bit more like ballet than cabaret to me. MiniMax Hailuo result.
The backdrop dancers are a little bit more static. LTX-2 Pro.
[screaming]
It's a good model. It's also a very decent performance from LTX-2 Pro. I felt like Wan, Sora, Veo, and LTX-2 Pro really understood the assignment and gave us what we asked for. Kling and MiniMax were a little bit off. So, not the best results from them, but the rest of the models did a great job. In the next challenge, we have some K-pop and good old hip-hop performance. Let's see the Wan result.
Veo 3.1 result.
[music]
The Sora result feels really high speed. It's high action. All the other AI video models feel like slow motion in comparison to Sora. Let's see the Kling result. Again, Kling 2.5 feels a little bit static in comparison to the other models. It feels much like slow motion. MiniMax Hailuo result. Closest to Sora in terms of dynamism. Very impressive result. I just don't understand why we have this piece of text here. But besides that, it's very good. LTX-2 Pro.
[screaming]
>> The music was off with LTX-2 Pro. And this is generally a pattern that I'm observing with LTX. Sometimes the audio, dialogue, and music are off, but the dynamism of the scene, the camera dynamism, and how natural it feels are really there. A pretty strong performance from LTX. So when I look at all the results, I think Sora, MiniMax, and LTX gave us the best dance performances. They looked very dynamic, whereas the other models felt a little bit slower. In terms of audio and musical quality, Veo 3.1 did the best job.
So I have a monster confrontation challenge. An over-the-shoulder shot of a diver charging towards the monster. Then, in a continuous tracking shot, he needs to stab the monster upwards. A pretty good result from Wan; it gave us everything we asked for. Veo 3.1 also gave us what we asked for. Kling 2.5. It's not really clear to me that Kling gave us the stabbing part. MiniMax Hailuo. Okay, we had some problems with MiniMax. You can see these bubbles look a little bit artificial, and at some point we lost the blade coherence and it became like a thick blade. That's not so ideal. And the stabbing is also a little bit chaotic. It's not as clear as Wan and Veo. LTX-2 Pro. Wow. The quality of the bubbles is fantastic. It looks realistic. And okay, the stabbing part was a little bit chaotic, but somehow it gave us what we asked for. I think among all the results I like the Wan 2.5 and Veo 3.1 results the most. In the next challenge, we have a surfing scene, and at some point we want a giant whale to jump into the scene in the background. Let's see the Wan result.
>> Wait, what?
>> The whole shot felt like a vlog and she was like, "Whoa, what? Wha!" The audio didn't feel so natural, but it's still nice that Wan is able to produce the sound. Let's see the Veo 3.1 result.
Again, a typical thing from Veo: it just makes everything slow motion and cinematic and majestic. This model is just perfect for slow-motion stuff, and it gave us what we asked for. So, no problem there. Kling 2.5 result.
Yeah, this is very impressive stuff. It's a giant whale. The scale is really majestic, and this whole surfing action and the waves look really fantastic and natural. Great result from Kling. MiniMax result. Okay, this feels a little off, and overall it felt less epic. LTX-2 Pro. The audio was horrible, but the shot itself is really cool. Again, it's a slow-motion shot. I think, looking at all the results, Kling did the best job with this prompt. Everything was on point, and it gave us a fantastic, epic, cool-looking whale at the best possible scale that we can imagine.
Hopefully this video was truly helpful for you. Don't forget to give a thumbs up and subscribe for more in-depth tutorials. If you want to learn more about the future of AI storytelling, click