4 Reasons to Use ImageGen 1.5 Over Nano Banana Pro

By The AI Daily Brief: Artificial Intelligence News

Summary

## Key takeaways

- **Infographics Avoid Nano Style**: Nano Banana Pro infographics have a particular flavor and style that people can spot from a mile away, so ChatGPT Images offers a competent visual alternative that doesn't look like every other one online. Both can create them from transcripts, but Nano adds useless citations while ChatGPT has minor errors like spelling 'bigger' as 'B I G E R'. [12:48], [13:21]
- **Hyperprecise Complex Grids**: ChatGPT Images excelled at a 6x6 grid of 36 distinct Lovecraftian artifacts with exact specs like 1920s pulp style, inked lines, no text, and no overlaps, doing a phenomenal job on every square. Nano Banana Pro failed, producing an 8x5 grid mess with inaccurate individual squares. [14:36], [15:10]
- **Superior Instruction Following**: ChatGPT Images reliably adheres to intent down to small details, changing only what is asked while keeping lighting, composition, and appearances consistent across edits. This enables believable clothing try-ons, stylistic filters, and conceptual transformations retaining the original essence. [01:44], [02:04]
- **Fun Consumer Interface**: ChatGPT Images has a dedicated section with style presets like sketch, holiday, plushy, and idea prompts like 'me as a K-pop star' to solve the blank slate problem and encourage casual fun messing around. This targets average users having fun, unlike Nano Banana Pro's interface. [18:06], [18:52]
- **Arena Leads Nano Banana**: GPT Image 1.5 holds a commanding 29-point lead on text-to-image and a narrow three-point edge over Nano Banana Pro on image edit in Image Arena, topping both categories preliminarily. Artificial Analysis tests also ranked it number one, surpassing Nano on text-to-image and editing like changing car colors or adding ducks. [06:16], [06:52]

Topics Covered

  • Competition Delivers Consumer Choice
  • Precise Edits Match User Intent
  • Parity Achieved, Preferences Subjective
  • Hyperprecise Prompts Favor ChatGPT
  • Consumer Fun Drives OpenAI Interface

Full Transcript

Today we are checking out OpenAI's new image generation model and looking at four contexts or reasons why you might want to use it instead of Nano Banana Pro. Another day, another new model.

Look, this competition between the big labs may be stressful for the people working there, but for us consumers, it means nothing but more choice. Today we

are talking about OpenAI's latest image generation model and the new house they put it in, which they are calling ChatGPT Images. Now, overall, this is one that I

kind of expected. You might remember in the December prediction episode, even before I think Sam Altman had declared code red, or at least before we knew

about it, my best guess for a response to Gemini 3 and Nano Banana Pro was an OpenAI image model. It had just been a really long time since we got an update on that, it was clearly an area where

they were pretty far behind, and it seemed like, based on the fact that it had been so long since we got an update

and knowing the speed at which OpenAI delivers, they had to be pretty close, one would think, to being able to release a new model. Now, I didn't expect a full 5.2 release, and that's obviously the first output of Code Red,

but yesterday on Tuesday, OpenAI dropped their new ChatGPT Images. As benefits,

they point to stronger instruction following, precise editing, detail preservation, and a big speed boost as compared to before. So, let's talk a little bit more about what OpenAI points

to as the benefits here. A lot of this is about feature parity with Nano Banana Pro. Remember, the real value of

Nano Banana Pro was not just that it was an improvement in terms of raw generation capability. It was about the

controls that the user had over it.

Whereas in the past, to get exactly what you wanted out of a generation, you just had to kind of prompt it over and over and over again and pick the one that was closest, Nano Banana Pro allowed for more precise edits. That capability has now

come to ChatGPT Images as well. They

write, "The model adheres to your intent more reliably down to the small details, changing only what you ask for while keeping elements like lighting, composition, and people's appearance consistent across inputs, outputs, and subsequent edits." Interestingly, they

point to some pretty consumer-centric use cases for that, which is a theme we'll come back to throughout this episode. They continue, "This unlocks

results that match your intent. More

believable clothing and hairstyle try-ons alongside stylistic filters and conceptual transformations that retain the essence of the original image."

Another capability they point to is adding, subtracting, combining, blending, and transposing. For example,

taking a set of inputs and turning it into a single composition. Another

capability that they're really hammering is what they call creative transformations. Basically, taking one

image and turning it into a different style preset, a movie poster, or turning someone into an 80s fitness instructor, taking someone's photo and turning it

into an ornament, etc., etc. Once again, and as we'll come back to, I actually think the fact that they're highlighting this says a lot about who they are intending this product for. Other benefits they point to include better instruction

following, up to and including much more precise prompting. And they also point

to much better text rendering. Now, this

was obviously one of the biggest changes that we got with Nano Banana: in addition to just being able to have text, with Nano Banana and then Nano Banana Pro you could get a ton of high-fidelity

text, opening up new possibilities for things like infographics. One final

interesting thing from their announcement post is that while in most areas the model improved, they actually did find some regressions as well. For

example, they write, "The ability to generate some specific art styles has regressed from the previous version."

The example they give is "draw me like I'm in a dark fantasy anime," with the

new version completely 100% not being that at all. There are other limitations as well. For example, when there's a

picture with a lot of different faces in it, keeping all those faces consistent between generations can be difficult.

Overall, they claim a big improvement, but still a lot more opportunity ahead.

So, what were people's first impressions? I think my sense is that

people were kind of prepared to be somewhat underwhelmed. I'm not exactly

sure what the reason for that is. Maybe

it's a concern that, because this was part of that code red, this and basically any other model that they might release would be a rush job. But

for a lot of people, even though they were prepared to be underwhelmed, they were, I would put it, kind of whelmed.

Justine Moore from a16z writes, "In early tests, this is a big step up in maintaining consistency of characters and objects from uploaded images. In

other words, your face still looks like you. It may be a real competitor to Nano

Banana Pro." Simon Smith from Klick

Health wrote, "I wasn't expecting OpenAI's new image generator to be comparable to Nano Banana Pro, so I ran it head-to-head on prompts I tried with

NBP. Surprisingly, it did as well or

better, but it has a different personality, at least via ChatGPT. Less

whimsical, more professional." So here

are a couple of the examples he gave.

"Research when prominent people, especially the leaders of big AI labs and forecasters, think we'll get AGI.

Then illustrate this on a timeline and put the faces of the people on the timeline on the years when they think we'll have AGI. Give this a fun kind of cartoony but not too silly feel." Now a

couple things. First, I think this is a good test to see how well integrated with the rest of the model image generation is. In other words, this

requires not just image generation, but also reasoning and research. And the

second thing that this brings up is that inherently the challenge with all of this episode (and by the way, this is a good one to watch if you're just listening) is that to some extent quality is going to be subjective. Although in

this case I certainly see why he prefers the ChatGPT Images version as opposed to Nano Banana. He tried

creating a cell cutout diagram, which again is a little bit in the eye of the beholder but certainly holds its own, alongside a skeleton anatomy chart and a prompt that said, "Search up today's top headlines and then give them to me in

the style of an old newspaper." Now, the

two models in this case took the prompt in very different directions, and I actually prefer aesthetically Nano Banana Pro's. But overall, Simon says, "I

was prepared to be disappointed and I'm not. That's saying something because

Nano Banana Pro is amazing. I need more time to play around with the new image generator, but my first impressions are positive." He then came back and said,

"Slides, however, may be a weakness of GPT Image 1.5," before very quickly returning and saying, "Okay, I take it back. GPT Image 1.5 can do gorgeous

slides. You just need to prompt it. I

gave it the same template in the above example, but used GPT 5.2 Thinking instead of Instant and a broader prompt."

He did point out, however, that there are real limitations to the aspect ratios that you can get with GPT Image, which has always been an issue for ChatGPT images. Still, all of this added up

for Simon to him actually thinking that GPT Image 1.5 has beaten Nano Banana Pro on his personal scorecard. And it wasn't

just Simon. LMArena tweets, "Image

Arena shakeup. OpenAI's GPT Image 1.5 is

number one in text-to-image. ChatGPT

Image latest is number one on image edit. GPT Image 1.5 holds a commanding

29-point lead on text-to-image while maintaining a narrow three-point edge over Nano Banana Pro on image edit." Now

they do say that these scores are preliminary, and we'll see where they settle, but still I think this would surprise a lot of people. Artificial

Analysis found something similar. They

wrote that on both text-to-image and image editing, GPT Image 1.5 again surpassed Nano Banana Pro on their tests. They

gave a couple of different text-to-image generation examples and a couple of editing examples, like changing a car's color and inserting a family of ducks crossing a railroad, ultimately again ranking it at

number one. Now there are a million

examples out there if you want to go see direct head-to-heads on ChatGPT versus Nano Banana Pro. And my strong suspicion is that if you don't have a particular

horse in the race or a set of biases that you're bringing in to start, you're likely to find some where you prefer ChatGPT and some where you prefer Nano Banana Pro. For myself, outside of just

exploring a bunch of things that I thought were interesting, I ran a couple of tests for instruction following with multiple constraints. I asked for one

person standing and pointing at a screen. Two people are seated. The

screen shows abstract charts with no readable text. The room is modern and

minimalist. The color palette is black,

white, and light gray only. No windows,

no plants, no logos. In that case, both Nano Banana Pro and GPT Images were able to do it equally competently. On a test of photorealism, I asked for a photorealistic image of a hand holding a

clear glass coffee mug filled halfway with black coffee. The hand has to have all five fingers and have them all visible. The glass has to show realistic

reflections and refraction. The coffee

surface needs to be flat and level, with natural indoor lighting and a neutral background. Again, in both cases, the

models were pretty equally competent.

Getting more into style and aesthetics, I asked for a 1950s retrofuturist-style illustration with flat bold shapes, a limited color palette of teal, cream, and muted orange, clean lines, and an optimistic

mid-century modern aesthetic. Once

again, they were both competent, and ultimately the preference here is going to be in the eye of the beholder. One of

the challenges that this shows is that a single stylistic prompt can mean different things. These are both

examples of 1950s retrofuturism, but one is a little more Jetsons and the other is a little more abstract. When we created a character and then put them in a

different setting, both models had no

problem keeping them consistent from one to the next. And of course on YouTube

thumbnails, a very common use case for me, they were frankly both pretty garbo, although I know for a fact that I could improve that with different prompting. As you can probably tell

across my tests then, what I found was pretty meaningful parity. Not

necessarily a clear or huge improvement over Nano Banana Pro, but clearly a huge improvement from where OpenAI's image generation model was before this.

However, it's not hard to find people who feel the opposite. If you go check out Twitter/X, there were many people who were just kind of generically underwhelmed. AI News by Smol AI said,

"Shipping anything is hard, so we rarely call out misses, and OpenAI rarely misses, but this was clearly a miss.

OpenAI Image 1.5 claims to beat Nano Banana Pro, number one across all arenas, but completely fails vibe checks." Theo

did a test and found that character face accuracy was kind of lacking. Brand

designer Daria Cerova gave a base input image as well as a product package and asked both models to make the girl in the input image hold the bottle, and said, "While it's better than before, ChatGPT didn't get the scale and changed the

product and the light, and if I ask it to make some edits, it reworks the whole image. We'll keep testing, but for now

it's one to zero for Google." David Shapiro

provided a bunch of images of himself and asked both models to create a YouTube thumbnail, which in this case Nano Banana undeniably smashed compared to ChatGPT. Some people were even quite

flabbergasted with the Arena and Artificial Analysis results. I am Emily 2050 reshared Artificial Analysis's post and said, "What a joke. I'm not going

into the conspiracy side, but this is really not looking good for Artificial Analysis." When someone said, "How that

can't be right," Emily responds, "OpenAI gamed the benchmarks or paid them to say so." Which, holding aside the substance

of that argument, I think reflects people's skepticism. The X comments on

both Artificial Analysis's post and the LMArena post also show just tons of skepticism. So what to make of all this?

I think Peter Gostev from LMArena is directionally correct when he writes, "My anecdotal impression of GPT 1.5 versus Nano Banana Pro is that they are pretty neck and neck overall. I find GPT

a lot easier to prompt. With Nano

Banana, you often had to iterate several times before getting a good result.

While with GPT, you typically get what you ask for. But I think Nano Banana has slightly nicer taste. E.g., for

infographics/slides, Google has the advantage. I found GPT style quite heavy,"

with the important point, and the part I'm saying is directionally correct, being the "pretty neck and neck overall." Jimmy

Apples had an even simpler version of the same statement: "Big upgrade over the previous model. It's not as smart as

Banana, but it's going to be subjective on what you like on style versus style.

Personally, it really hits the image in my head I have for this prompt. Just use

what you prefer. I'll be using both." And

that is exactly what my overall conclusion is. Before this, Nano Banana

was undeniably and very clearly better than anything OpenAI had going on with image generation. Now, it is not so

clearly better, at least not in all cases. What that means practically is

that for really high-quality image generation, on Tuesday morning, you had one option and now in a lot of cases, you're going to have two. Now, one

interesting point that swyx made is that we may also be seeing the limits of how far we can go in image generation with current methods. He writes, "I think

today's Image 1.5 launch illustrates one of the reasons why people are betting so hard on explicit world models. For the

next level in realism, we're going to have to teach the models to see the world as we live it, not through occasional snapshots." He pointed to a

post on r/ChatGPT that said, "The new image gen is nuts." Someone responded, however, "Yes, but also the details are a

little off. Why is one leg bare and the other covered by pants? What kind of car has a vanity table behind the front seat? Where is the passenger seat? Maybe

it's covered by her, but a lot of the background and context still seems off.

Still, the people at least look human and not like plastic anymore." So, as we round out, let's ask: is there anything that I think ImageGen does distinctly better than Nano Banana right now? And

while my answer is no, there's no one use case where I thought that in every test that I tried ImageGen crushed Nano Banana Pro or anything like that, there are four areas right now, with a fifth

potential bonus area in the future, where I think ImageGen may be a desirable alternative to what Nano Banana can do.

First up, let's talk infographics. One

of the incredible things about Nano Banana Pro when it was released is that all of a sudden this new capability of making infographics from text came online. I'm sure that you have seen a

ton of these floating around the internet, and indeed that ubiquitousness and commonality of style is exactly why I think in some cases you might want to

use ChatGPT Images instead of Nano Banana to make your infographics, for the simple reason that they don't look like a Nano Banana infographic, which already has a particular flavor and style that people

can spot from a mile away. I dumped in a recent episode transcript to get an infographic based on it. And both models were able to do this, although they each had their own quirks. As it often does,

Nano Banana's first iteration gave a bunch of citation references, even though those are completely useless and wasted space on a visual infographic like this. Whereas ChatGPT Images just

had a few little mistakes here and there. For example, in the three biggest

barriers to agentic AI section, it only has two barriers. There were also some random spelling mistakes like bigger being spelled B I G E R. Now perhaps the

better approach than using ChatGPT Images is just to try to prompt your way out of the standard look of Nano Banana Pro, but my point here is that you at least now have a competent visual

alternative. I might add to this use

case things that need really high text fidelity. That was one of the things

that OpenAI called out in their announcement post, and I did some tests around that as well. I asked for an over-the-shoulder shot of Abraham Lincoln sitting at his desk writing the Gettysburg Address. Make the entire

address readable. Although in this case,

I found both models able to do it. So

once again, we're back in stylistic preference area. A second area where I

think ChatGPT Images genuinely might have an edge right now is around hyperprecise instructions and

complexity. I took this 6x6 grid idea

and really ratcheted up the complexity.

I said, "Make a six columns by six rows grid of Lovecraftian artifacts and entities where each cell contains exactly one distinct illustration centered within its square and not overlapping grid lines. Overall style is

1920s pulp illustration meets occult manuscript. Inked line work, muted sepia

and sea green tones, subtle paper grain, no modern elements, no text anywhere in the image." And then just to add another

layer, I actually precisely gave it everything I wanted in all 36 squares.

It did just a phenomenal job. There

wasn't a single square that didn't have a strong competent version of exactly what I asked for. Nano Banana Pro's version of this was an absolute mess.

Instead of a 6x6 grid, I got an 8x5. It

didn't follow the overall instructions as well. And tons of the individual

squares were just out of the blue and out of nowhere. Now, of course, this is just

one test, but I noticed a couple others also preferring ChatGPT Images for some of these hyperprecise or complex instructions as well. Ethan Mollick

writes, "I tried something fun that worked better with ChatGPT image generator 1.5 than Nano Banana Pro.

'Point-and-click adventure game me. You

are the parser. Make images as the output and take in commands. Make the

world super interesting. Keep track of inventory, state, etc.'" So you can see it basically creates a screenshot from a video game and then Ethan prompts it to go to the next shot in the game: "Look at

the laser. Cover the laser with map and inventory. Run through the portal."

ChatGPT did a really good job with this.

Nano Banana Pro did not. In its first attempt, the second image was completely different than the first scene and then it just completely bowed out. And in the

second attempt, it sort of did it, but with a much much harder time. Then of

course there was Peter Gostev again, who tweets, "I know people like Nano Banana, but I have some important needs that it just cannot meet." His prompt was, "Create a square image of a hand with

six fingers, a wall clock showing 8:22, a glass of red wine full to the top." Nano

Banana Pro had a normal hand, a clock at 7:58, and a wine glass that was mostly but not entirely full. Whereas the new image gen model had a completely full

wine glass, 8:22 on the clock, and seven juicy weird fingers. A third area where I think you might prefer or at least want to test ChatGPT Images as opposed

to Nano Banana Pro is for aesthetically focused and higher-taste prompts. Flower

Shop showed a couple of examples where I think that the GPT Images version is just a big step up visually from the Nano Banana version. Here's another

example with a logo, and Aziz AI found something similar. The prompt that he

tried was "create a clean-look website in Apple style for Nike in a 4:5 aspect ratio." He said the winner was GPT in

aesthetics of UI and understanding the prompt. Now I will say very clearly here

that the point that I am trying to make is not, especially in this case, that I think ChatGPT Images will always be better. It's that because these models

are both so at the high end now, when

you are trying to find something that matches your vibes and reaches the levels of the high taste that you're going for, you now have a couple of options. Images is in some cases going

to be better and in some cases going to be worse. But again that means you've

gone from one option to two options basically overnight. The fourth thing

that I want to mention in terms of an area where ChatGPT Images excels as compared to Nano Banana is the actual interface for using it. And I think this

reveals quite a bit about how they're imagining usage of this tool. Certainly

myself, and I'd be willing to bet many of you, are coming at this conversation from a standpoint of a business or power user. You want these fine-grained editing

controls. You're imagining how you can

use this for your solopreneur business.

But I think OpenAI is imagining that a lot of the usage of this is in fact just going to be people messing around and having fun. Whereas with Gemini, there's

absolutely no difference when you're creating an image other than you say "create image," in the ChatGPT web app now

there's a whole different section with slightly changed visuals and a whole lot more options. In addition to your

standard text prompt field, you also have a row of styles underneath that you can try on an image: sketch, holiday,

portrait, dramatic, plushy, baseball bobblehead, etc. Then below that, they also have a panel of ideas to just discover something new, like creating a holiday card. What would I look like as

a K-pop star? Me as the girl with the pearl earring. And you get the sense

from this that they want to solve the blank slate problem and get people messing around with this, not for a business purpose, but just for fun. I'm

sure it's not lost on them that one of their biggest moments of user growth, if not their biggest moment of user growth ever, and certainly in 2025, was when we got the Ghiblification trend where everyone turned everything into a Studio

Ghibli image. These sorts of interface

options are very clearly aimed at the average user who isn't thinking about business outcomes and ROI, but is just there to have some fun. Given how much

of ChatGPT's usage is regular everyday people, I can see why they're making that bet. So that is four areas where I

think you might want to try ChatGPT Images either instead of or at least in addition to Nano Banana Pro. The bonus

fifth future area, however, is of course when you want to make Mickey or Moana or a Disney character. Now right

now ChatGPT Images is much more locked down, in my tests at least, than is Nano Banana. I gave the prompt "Sam Altman

water skiing behind a boat driven by Andy Jassy," this obviously relating to

the news that OpenAI might be doing a deal with Amazon. From Gemini, I got this cool Ralph Steadman-looking image.

From ChatGPT, I got this: "The image

generation request did not follow our content policy." Of course, we just

learned that OpenAI and Disney had done a deal. A deal that will explicitly

bring Disney's characters into Sora. If

that extends into image generation, it could be a big deal as well. Simon Smith

again writes, "If OpenAI and Disney surprise everyone by allowing character generation with the launch of Images V2, pretty sure it will spark a ton of ChatGPT use over the holidays. Parents alone

will burn up GPUs inserting characters into holiday messages for their kids."

Now, one thing Simon references there is V2. Remember that this is version 1.5

and people are expecting a lot more in the relatively near future from an even better image generation model. OpenAI

staffers are indeed suggesting that this is just the start and that we are in for more image generation updates in the future, which as I said right at the beginning is nothing but good news for us consumers. So friends, that is my

first look at ImageGen 1.5. Hope this

was useful. Certainly if you haven't yet, get in there and start creating.

Santa continues to come early as we get more and more AI toys. For now, that is going to do it for today's AI Daily Brief. Appreciate you listening or

watching as always. And until next time, peace.
