What's REALLY Behind the Nano Banana Pro's Crazy Success?
By OpenArt
Summary
## Key takeaways
- **Nano Banana Pro's Reasoning Power**: The strength of this model is its reasoning capabilities, powered by Gemini 3's LLM with very deep world knowledge understanding, enabling accurate interpretations from simple prompts like minimalistic logos made from realistic food. [01:24], [03:00]
- **Vague Prompts Yield Legible Storyboards**: With a simple prompt 'create a storyboard for this scene' using an anthropomorphic banana family image, every letter of text is legible without specifying panels or shots, unlike other models requiring detailed instructions. [03:14], [03:25]
- **Accurate Infographics from Vague Prompts**: A super simple prompt for a photorealistic solar system infographic with planet distances produced fact-checked accurate numbers in kilometers and legible text; similarly, a Filipino chicken adobo recipe infographic got the steps pretty damn accurate. [07:15], [07:58]
- **Sketches Transform with Simple Prompts**: Using just 'make this drawing photo realistic' on kids' sketches like a bunny or family holding hands turns them into photorealistic scenes, preserving composition while enhancing textures and environments. [11:45], [12:35]
- **Unmatched Character Consistency**: This model is probably the best for keeping characters consistent, maintaining face, jewelry, studded belt, and shirt pattern across six panels of various activities from a single reference image. [15:11], [15:50]
- **Animate Characters Easily**: From a static character image, click image to video with a basic prompt like 'woman doing a swimsuit fashion shoot, camera slowly arcs around her' to bring the character to life in motion. [22:11], [22:50]
Topics Covered
- Gemini powers superior reasoning
- Vague prompts yield legible storyboards
- Infographics from minimal prompts
- Best-in-class character consistency
Full Transcript
What the heck is going on?
>> What's with the bananas?
Breaking news. The [music] Nano Banana Pro epidemic is spreading like wildfire.
>> Nano Banana Pro is live on OpenArt and as you [music] can see, everyone is going bananas.
[music] Banana [music] Banana Pro.
>> Hello, good people of OpenArt. Nano Banana Pro is live on OpenArt, and we're going to take a look at what all the hype is about. We're going to jump right into it. We're going to go over to Image here, on to Create Image, and under the model selection, obviously, you just want to make sure that you choose Nano Banana Pro. Right here, you can choose between three output resolutions: 1K, 2K, and 4K. And I would tell you off the bat, if you're new to using this model, I'd encourage you to watch the original Nano Banana video I did just a couple of months ago, because everything in that video still applies.
Now, it's no surprise that this model can do various styles, from realism to anime to 3D. Most models nowadays should be able to do all these various styles. But the strength of this model is its reasoning capabilities, and that's what we're going to focus mostly on today.
For example, I have a prompt here to create minimalistic logos. Each is a fun fruit or vegetable word, and I state, "Make the letters from realistic food to express the meaning of this word and compose it on a plain white background." And if we look at the results here, you see our first one is "citrus," made out of citrus fruits. The same thing with berry, melon, herb, pea, beet, corn, and apple. And it does it very well.
I did something similar, but just with fun food as opposed to focusing on fruit and vegetables. And you see a similar thing here. We have "crisp" with some apples and chips. With "melt," we have what looks like peanut butter and chocolate. A great combination, in my opinion, Reese's Pieces lovers. We've got "fizz" here mixed with some berries. And I don't know if you can see it, but that does look like carbonated water that forms the word "fizz." And the rest of the words here, you can see, are very accurately depicted with the proper type of food in the shape of the word. That's really cool.
Next, I did a similar prompt using eight minimalistic logos, and this time in expressive words. And I also wanted to keep it simple: flat vector, black on a single white background. Based on this generation, you see that the model understands exactly what we prompted for. And that's because this model is powered by Gemini 3's LLM. It has very deep world knowledge understanding.
Now, I wanted to step it up a level. So, I have this image of an anthropomorphic banana family. And I simply put in this prompt: "Create a storyboard for this scene." And the first thing that sticks out to me is that every letter of text is legible. Now, other models can do text well, but not with such a vague prompt. Typically, you have to put in your prompt "panel one," quotes, "establishing shot," so on and so forth. But here, I didn't even put any of that.
Panel one, we see "establishing shot, Paris Opera House at dusk." Panel two, "character intro, the banana couple." And we see panels three and four here with clear text that is spelled correctly. It's not always going to get the spelling right, but for the most part, it does a very good job.
Here's another variation, done more like a pencil sketch. But I was really impressed by how the model composed each panel, even down to the close-up of the banana baby.
And then the last panel, they're looking at the sunset, and of course the supporting text.
Then it got me thinking about comic strips. So I took this image for this one character, and the second character, which is supposed to be the villain. These two images were done in Seedream. I had done these for a recent video I did on social media about start and end frames. So initially my prompt was, "Create a six-frame storyboard with two characters in image two." Oh, that should actually say image one and image two, but it knew what I was talking about. And then I specified that the ending scene is the third image where they clash, which was this image here. So I used three reference images. And as a result of that prompt, I got a nice six-panel comic strip here. The smaller details in the face get a bit lost, but that's going to happen a lot of times with smaller details, and you will have to correct that. But generally, overall, the results are great. I actually ended up replacing this weapon by uploading another reference image and just focusing on this part.
And then I took that one comic panel as a reference image and added this prompt: "Add captions and speech bubbles to this comic strip that make sense based on the image." I state here who's the villain and the hero, and that's it. No specific direction in terms of the dialogue. And we see that the model's capable of developing this short little story timeline based on only the three images that I used. And I really like this exchange because it's very typical of this style of comic: "Such bravado, little hero. Let's see if you can match my power." "Your reign of terror ends tonight." And then we have the action sequence where she attacks, she counterattacks, and then some final words.
And as mentioned, you see I replaced the weapon here.
Infographics have been such a focus for many of the recent models, like Seedream and Nano Banana 1. And now, with the Pro version of Nano Banana, we see an improvement with infographics. In this example, I have distances from each planet to the sun in kilometers. And I did fact-check whether these numbers are accurate. So Mercury, we see 57.9 million kilometers away from the sun, and this is accurate based on, I think, the middle of the planet, or the outside maybe, but in actual fact it's more like 46 to 47 million km from the nearest point. So technically it's correct. And once again, in terms of the clarity of the words, everything is spelled correctly and all the numbers are legible. We do see here at 778.5 million where it's a little bit hard to make out, but here it's very clear. And once again, my prompt was super simple.
"Create a photorealistic infographic of the solar system complete with names of the planets and the distance between them." So, I didn't state whether it be kilometers or miles, but it chose kilometers.
I don't have a lot of use cases for infographics. As amazing as that is, it just isn't my thing. But what is, are food recipes. And as we take a look at the prompt here, you see I put, "Create an infographic that shows the steps on how to make a Filipino dish, chicken adobo." Being Filipino, this is my favorite food. And yes, I know how to cook it. I was curious to see, with a simple vague prompt, if it got the steps correct. And I tell you, it's pretty damn accurate. You have your key ingredients, obviously chicken. Now, this could be pork, too. We've got garlic, soy sauce, vinegar, peppercorns. Bay leaves are kind of optional, and you can also use onions later on in step three. The directions on the bottom are very accurate, from marinating it to boiling it, and then, optionally, if you want, putting a sear on the chicken, and then serving it over rice.
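Editor's aside on the Mercury figures quoted from the solar system infographic: 57.9 million km matches Mercury's average distance from the sun (the semi-major axis of its orbit), while a figure around 46 million km is its closest approach. The short Python sketch below reproduces both from the standard ellipse relations. The two constants are rounded reference values added here for illustration; they are not numbers read out in the video.

```python
# Rounded reference values for Mercury's orbit (assumptions added for
# illustration; they are not taken from the video).
SEMI_MAJOR_AXIS_MKM = 57.909   # average sun distance, in millions of km
ECCENTRICITY = 0.2056          # how stretched the orbit is (0 = circle)


def perihelion(a: float, e: float) -> float:
    """Closest distance to the sun for an elliptical orbit."""
    return a * (1.0 - e)


def aphelion(a: float, e: float) -> float:
    """Farthest distance from the sun for an elliptical orbit."""
    return a * (1.0 + e)


closest = perihelion(SEMI_MAJOR_AXIS_MKM, ECCENTRICITY)
farthest = aphelion(SEMI_MAJOR_AXIS_MKM, ECCENTRICITY)

print(f"average:  {SEMI_MAJOR_AXIS_MKM:.1f} million km")
print(f"closest:  {closest:.1f} million km")
print(f"farthest: {farthest:.1f} million km")
```

Under these assumed values, the infographic's 57.9 million km is the average, and the orbit swings between roughly 46 and 70 million km, which is why both numbers can be called correct.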
[music] Moving on to poster ads. This is where I see Nano Banana Pro being very useful for marketing materials. And once again, you're going to see a running theme where it really doesn't take a very complex prompt anymore to get the results that you want. In my prompt here, I have, "Create a poster ad for a beer called Beerbelly." Fortunately, mine is hidden from view. "The setting is a London street at dusk with neon lights. And the tagline should be 'Belch like no one is around.' The can design should be masculine and feel like you are at the bar."
This generation, I thought it was very creative, where they took a man's arm flexing his muscle and made his fist a glass of beer with foam at the top. If that's not masculine, I don't know what is. Here's another variation where they really emphasize the beer belly portion. I really like this one. And we also see the trademark double-decker buses and the black cabs that are known to be in London. And in this example, what stood out to me was the texture of the can. It's got these little dents on it. The fingers are very accurate, even down to the nails. That's really impressive. And you could even see the fingerprints if you really look at it. I like that they put a dog here wearing a hat, and what could be a local bar in London. I don't know if this is true. I didn't bother to research it, but it does sound like a local bar.
So, naturally, it got me thinking about merch. Here at OpenArt, we've been thinking about doing merch. Hey, if that's something you're interested in, let us know in the comments below. As you can see, I always like to wear my hat. And I simply just put in a logo of OpenArt here. And my prompt is, "Create a merchandise line for OpenArt using the logo in the reference images. Place each one according to the merch, complete with baseball cap, hoodie, sports pants, shoes, sunglasses, and an energy drink."
Now, I don't think we're going to come out with a drink. I just wanted six items. And not to be biased or anything, but I would wear this stuff. Obviously, I love my hats. I love my hoodies. These shoes are okay. I think the logo here is a little big, but some really cool sunglasses. And this energy drink looks pretty cool. This time I prompted for it to be more white-dominant. And yeah, I mean, I'm really happy with these results. The energy drink looks really cool. I like the sunglasses. Not a big fan of the shoes, but I do like this part where the logo is. And yeah, I mean, the hat looks the same. I'd definitely wear that.
One of the things I love doing when I'm introducing generative AI to parents or even kids is transforming their drawings or sketches using generative AI to make them photorealistic. Maybe they like anime or 2D comics, whatever the case may be. Here's a reference image that I just pulled from doing, like, a Google search.
And the prompt was, "Make this drawing photo realistic." That's it. It makes the environment, like the grass and the weeds and the sky, look realistic, but it makes the bunny more like a plush toy. So again, it's a very simple prompt, but it does look like I put a rabbit plushy on the grass and took a photo of it. I changed the style a little bit, just adding "3D CGI Pixar style," and we get this very playful bunny character in a Pixar style.
That previous drawing was actually pretty good for a kid's drawing, but this is more realistic for a very young kid. And depending on their age, this is pretty good. I mean, if the kid is like 16 years old drawing this way, then he's probably not the best artist in the world. But I use this as a reference image. And once again, I use the same prompt: "Make this drawing photo realistic." And other than the sun, everything looks photorealistic, even down to the composition of the girl holding her parents' hands. The texture of the trees and the grass, as well as the house, looks very photorealistic.
Here's a different variation that also turned out very well. I then tried anime, and this is the result. The house looks pretty small, but it is kind of basing it on the drawing composition, right?
Searching for these images was pretty hilarious, and I came across this, and I remember seeing this on social media. If you first glance at it, it doesn't look very positive. And then you see here, "We're snorkeling." So I wanted to convert this into an actual image. And for the prompt: "Make this drawing photo realistic. The people are snorkeling in the deep ocean. Remove the text." So I added just a bit more detail. And in terms of composition, it got it pretty spot-on, and it made it more clear that they were snorkeling. This time, I just added "wearing scuba gear." He's even got a camera here. He's taking pictures. You can see the fish in the background. And for the most part, it still honors the original composition, just slightly adjusted. But it's amazing what you can do with a simple prompt in this Nano Banana Pro model.
Next, I took this furry light texture and combined it with this drawing of what I would presume is a race car with sort of a flame graphic. And in the prompt, I put, "Transform the simple sketch into a realistic car. Follow creative direction of the sketch and use the colors and texture from the uploaded image." And we get the sports car decked out with that furry texture. Here's a different version where you can see the flame graphic. It even has some headlights and some fire at the back, but it does it very well.
Another thing that was really cool for me is that it recognized foreground and background. And what I mean by that is, if you look at the image here, the fist is not in focus, but the face is. And with a simple prompt, we can switch where the focus is. Now the fist is more in focus, and we see how blurry the face is. And once again, a very simple prompt: "Focus on the man's hand. Blur his face."
One very important thing for me when it comes to generating images and developing characters is consistent characters. And I will tell you, this model is probably the best model in terms of keeping your character consistent. On the screen is a character that I'm personally developing, and I use this one image as a reference, and obviously I wanted to test the consistency of her face and her clothing. My prompt reads, "Place this woman in a six-panel image of her doing a fashion shoot in a studio with colorful lighting, keeping her outfit consistent." Looking at the paneled images, we see everything from her jewelry and the cross here to her studded belt is all consistent. The pattern of her shirt, too. As I mentioned earlier on in the video, you see her face details, as well as here, get a bit lost, which is still kind of an issue when it comes to full-body perspectives.
Now, mind you, the size of this panel is very small. So, if you were to do a 2K or 4K image of a full-body shot, the results are going to be a lot better, which I'll show you in a bit. And in this example, I prompted slightly differently, basically being very general, saying "a six-panel image of her doing different things," once again keeping her outfit consistent. We see her DJing, skateboarding, singing, taking a selfie in probably an unsafe neighborhood. She's painting, and then she's eating some tacos. Seems like a pretty fun woman, don't you think? But once again, all the little details of the outfit are consistent to a T.
And then to close this little series of images, this time within the prompt, I put "doing different things throughout the day from morning to night." We see her waking up in the morning in her pajamas. She goes for a workout wearing, you know, gym clothing. She's working in a cafe with more of a casual outfit. We do see one scene of her with her original outfit, which is fine. And hey, she cooks, too. And then she's wearing pajamas, chilling out, watching some TV. So, I really love how the model knows to adjust the outfit according to the activity she's doing.
So far, we've covered a wide variety of things, and I've really just been showing you examples and their prompts. To land the plane, I want to show you some use cases here, especially when it comes to character consistency. This model is highly capable of consistent characters.
On the screen, you see the same character. This time, it's more of a portrait aspect ratio. I'm going to use this as a reference image. And I just drag and drop the reference image in one of the slots here. In terms of settings, I'm just going to do 16:9 just so it looks good on this YouTube video. And I'm going to do 2K variations. You can do 4K if you want.
And if you're new to prompting, I want you to keep this in mind: when you're prompting, you always want to note the subject, the action, the environment. And if you have to specify lighting, which typically the environment does, you can do that. A good example I like to use a lot is sunrise or sunset. And then you want to state the context and style. So here we identify the woman, which is the subject, sitting on a park bench listening to music. So we're stating the action there. We're giving a bit more detail on that, where she's wearing headphones and reading a book. And then we say in the background is a playground. Also, you'll notice I don't put "photorealism" or "cinematic," because it's going to adopt the style of the reference image. The first image we see, it looks very accurate to my prompt.
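Editor's sketch: the subject, action, environment, lighting, and style checklist described above can be written down as a small Python helper. This is only an illustration. The `ImagePrompt` class and its field names are hypothetical, not part of OpenArt or Nano Banana Pro, and the result is just a text string you would paste into the prompt box.

```python
from dataclasses import dataclass


@dataclass
class ImagePrompt:
    """Hypothetical container for the prompt parts named in the video."""
    subject: str
    action: str
    environment: str
    lighting: str = ""   # optional; the environment often implies it
    style: str = ""      # leave empty to inherit the reference image's style

    def render(self) -> str:
        # Join only the parts that were filled in, in a fixed order.
        parts = [f"{self.subject} {self.action}", self.environment,
                 self.lighting, self.style]
        return ", ".join(p.strip() for p in parts if p.strip()) + "."


prompt = ImagePrompt(
    subject="The woman",
    action="sitting on a park bench listening to music, "
           "wearing headphones and reading a book",
    environment="in the background is a playground",
)
print(prompt.render())
```

Leaving `style` empty mirrors the advice above: with a reference image attached, the model is expected to pick up the look from that image, so the text only needs to carry what changes.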
She's reading a book, listening to music. The hands look great. Now, some will argue that she's missing a finger, but from this angle, the pinky is likely underneath her other finger here. And once again, if we look at all the details of her outfit, from the necklace, the choker collar, the studded belt and chain, everything is very accurate. You know, I also notice right here, is that bird poop? [laughter] If that's bird poop, that's pretty accurate for being a park bench. And we also have our playground in the background with some kids playing on the swings. Very accurate.
Going back to our original reference image here, let's change it up a bit. And I often see questions in the comments about, "Oh, how did you get your character driving in the car?" or "How did you get your character doing this?" I'm actually showing you the process. You just have to change the prompt. Just like what we previously did, we use the same reference image, and I'm just changing the prompt here: "Place the woman driving in a convertible sports car in a busy street in Las Vegas at sunset." And wow, we even see the Bellagio here. She's driving a sports car, top down. A different perspective, although it looks like she better move soon. Looks like she's sitting in the middle of oncoming traffic. Oh, Caesars Palace. Pretty cool.
Or if you want her to change clothing, once again, you just have to prompt it: "A woman wearing a fancy dress that she would wear attending the Oscars."
Now, you could be very specific and say a sparkly silver dress or whatever the case may be. You could do that, too. I'm being very general because I'm curious to see, again, the knowledge and reasoning of the model. And I really like how they have the Oscar at the back. She's presumably on the red carpet. We have the media folks taking pictures.
But other than that, what if you have a dress that you want to put on your character and sort of do, like, a swap? Here's a swimsuit outfit that's based on her outfit now. And we're going to put that in one of the reference slots. And the prompt, I'm just going to put, "Place the woman wearing the swimsuit of image two. She's posing in a pool for a fashion shoot." And while that generates, what I'm going to do is bring in another outfit here that I really liked. And we're going to hit confirm. And I'm just going to put, "Swap the outfit from image two to image one." And here's the swimsuit outfit. Very accurate based on the generated swimsuit. We have a different pose here. She kind of has, like, a Rihanna vibe almost. Like a young Rihanna.
And you see with the dress swap, it worked very well. And because we only prompted for the dress swap, it kept the background of the first image. So for those of you that ask, "I have my own clothing line and I want to create my own models wearing my outfits," this is how you do it. It's super simple.
And of course, if you're doing AI influencer type of content and you want to take these images and bring them to life, you can simply click on image to video. I'm just going to leave it at Kling 2.5. It's a good all-around model to use. And your prompt doesn't have to be super complicated, because you're starting with an image. You don't have to describe her outfit and what she looks like. You just want to focus on some basic things, like "the woman is doing a swimsuit fashion shoot," giving a context; "posing in various poses," her action; and then we want to state the camera movement: "Camera slowly arcs around her." And after a few minutes, you see that we've brought our character to life. [music]
Now, we've only scratched the surface at this point with Nano Banana Pro. I would say, at the time of recording this video, the hype is definitely real. I'm sure you're going to find some shortcomings with the model, but so far, from what I've seen, it's very, very promising. [music] And this era of multimodal image and video models is truly a very exciting time.
As always, my friends, let me know what you think in the comments below. Make sure to subscribe if you haven't already, because we will be following up on more videos about Nano Banana Pro. [music] But until that next video, my friends, happy creating.