Claude + GPT-Image-2 is a CHEAT CODE for Design (Full Breakdown)
By Jay E | RoboNuggets
Summary
Topics Covered
- AI Can Now Generate Whole Magazine Spreads
- GPT Image 2 Beats Nano Banana by 200 Points
- One Prompt Creates Seven Design Assets Automatically
- Image References Slash AI Prompting Time in Half
- Batch Iterate and Let Human Taste Guide the Output
Full Transcript
GPT Image 2 just dropped, and it's so good that it has now overtaken Google's Nano Banana Pro on the image model leaderboards. And when you combine GPT Image 2 with Claude as your AI agent, that is where you unlock the true power of this new model. In this video, I'll show you what you can build with OpenAI's new image gen tool, what it unlocks in the world of marketing, design, and content, and how to use it with Claude so that you can do the same.
And if you're new here, my name is Jay.
I spent over a decade working with brands you probably know, have been in AI since my master's in data science, and now I run our AI solutions practice in one of the largest AI communities globally. Let's get started.
[music] So, OpenAI just released GPT Image 2 today. And over at X, I'm already seeing a lot of cool use cases and examples of how people are using this incredible new model. Because I think the capability that Image 2 has unlocked in the image generation space is its innate ability to be really good at text rendering, even at small scales. So, for instance, this example by Kate. Basically, what she did is prompt GPT Image 2 to create the UI for Premiere Pro. And if you zoom into this, it's able to render not just the whole user interface, but if you read through the text here, it is incredibly good at rendering out the text as well. And this capability to generate user interfaces and even understand images of user interfaces is quite important because OpenAI for sure is doubling down on its computer use capability. So, we'll likely see more advances on that front in a bit.
Now, I've tried this model myself as well. And for these examples, take note that they're mostly one-shot. What I basically did, which I'll show later, is connect Claude to GPT Image 2, give it this brand book of ours for reference on the colors as well as the fonts and typography to use, and just ask Claude to use GPT Image 2 to generate some examples for marketing, content, and UI use cases.
So, this one with social media posts actually turned out pretty well. It was able to not only follow my guidelines on the typography and the colors, but if you zoom in here, you can see the words are all correct. Even the fonts are properly considered in there. And for this generation in particular, I don't think I made it at the highest resolution, which is why it is a bit pixelated here if you zoom in.
But take another example. This one is a magazine page that I asked Claude to prompt for us. For this image, I gave it that brand book, but apart from that, I also asked Claude to max out the resolution as well as the quality, just so we can observe how good the text rendering is. And I must say, even if you zoom in here and try to find any grammatical or lettering errors, I don't think there are any obvious ones. I think it's at that point now where, if you are designing for something like a magazine spread, you can actually use this image model to generate whole infographics and pages for you. And if you need animated social media posts, what's great is you can also use these images as starting frames for something like Kling 3.0 or Veo 3.1, which I will also show you how to do in just a bit.
A couple of other examples here. I think it did pretty well on generating this merch for Robo Labs. It was able to get the logo and the icon, and these pins for Rubrik are actually looking pretty good. If you need some marketing material, let's say for your iPhone app, then that is something you can do via GPT Image 2. This mock-up for a site component also turned out pretty well. This is just an image for now, but later I'll show how you can use it as a reference for an actual user interface that you can deploy live on your website. Charts, visuals, and infographics like these also render pretty well with GPT Image 2. And for this specific example, I did ask Claude to do some research. So, when it generated this image, it even included the source for each of these data points. So, you can imagine, if you are preparing a report or some sort of presentation, you can pretty much one-shot these sorts of charts using GPT Image 2.
So, depending on how you use it, you may prefer Nano Banana over GPT Image 2. But at least if you look at the arena.ai leaderboards, which, if you're new, is basically a website where people access these models for free and blind vote on which image they prefer, you can see that when OpenAI released this model, it overtook Google's Nano Banana by quite a wide margin: around 200 points, which is pretty big. What that means is that, at least right now, among the thousands of users voting on this website, GPT Image 2 seems to be doing much better at achieving whatever it is that people are asking for.
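To put that roughly 200-point gap in perspective, here is a minimal sketch that converts a rating difference into an expected share of blind votes. It assumes the leaderboard score behaves like a standard Elo rating (base 10, scale 400), which the video does not confirm, so treat the result as a rough intuition rather than the site's actual math.

```python
def expected_preference(rating_gap: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head votes for the higher-rated model.

    Assumption: the score follows the standard Elo logistic curve
    (base 10, scale 400). The leaderboard's real method may differ.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


if __name__ == "__main__":
    share = expected_preference(200)
    print(f"A 200-point lead implies roughly {share:.0%} of blind votes")
```

Under that assumption, a 200-point lead means the higher-rated model would be picked in about three out of four head-to-head comparisons.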
Now, for you to fully utilize the power of GPT Image 2, I suggest that you connect it to an agentic platform like Claude. Because with that, you'll be able to create high-quality images like these without necessarily having to think of and type in the prompts yourself. You can just have Claude do that for you.
Now, to use Claude, there are actually multiple ways to do it. If you go to claude.ai/downloads, probably the most approachable option for most people is to download the desktop app, which you can do here. And when you're in the desktop app, there are three sections. The first one is Chat, which is sort of like ChatGPT, so it explains things for you. The next one is Cowork. Apart from explaining things for you, Cowork can also do things for you. So, if you want to organize files or have Claude use the browser just like a human would, then Cowork would be your go-to. For today though, what we'll be using is Claude Code, which is the most powerful of the three because it can build things out for you. Now, just to mention it, in case you're a bit more advanced, the main way that I personally access Claude Code is via an IDE like VS Code. For this one, I use Antigravity. It just has Claude Code as an extension, which you can find here in the extensions tab. You can install Claude Code in there, and it will be the same thing.
So, for the remainder of this tutorial, I'll be using my Claude Code session from within Antigravity, but just know that the process to install GPT Image 2, whether you're using the VS Code or Antigravity extension or the desktop app, is practically the same.
Now, for you to connect Claude to GPT Image 2, what I've done to make this as easy as possible for everyone is make this GPT Image 2 skill. If you give this to your agent, Claude Code included, it will unlock the text-to-image capabilities of GPT Image 2, the image edit mode where you feed in an image reference, and even a prompt library, so that you instantly have access to 700-plus JSON-structured prompts across several categories. And if you're new to this space, this skill is basically just a markdown file, so it's all just text.
And this is just a detailed reference for your agent to understand how to work with this model better. You can grab this below so that it is much easier for you to integrate it with Claude.
Now, real quick, we just released the agentic AI masterclass for our members at RoboNuggets, which takes you from zero to mastery when working with agents. There's a link to the community in the pinned comment below.
We've got founders in there who landed their first client in weeks, live build sessions where we create this stuff together, and the actual templates behind what I showed in this video. The community is also the reason these lessons get made, so see that below if that's for you.
And so, from within Claude Code, once you send that GPT Image 2 skill install prompt, you should receive a confirmation message like this, which gives you a rundown of the different capabilities of this model. Now, in case you're a bit more technical and want to understand how we're connecting to this model in particular, you can see that the provider we have here is File AI. If you go to file.ai, basically what they are is an AI model aggregator. They have hundreds of models here, and you can see that GPT Image 2 is now available. Essentially, when you give that skill to Claude, it gives Claude the ability to use this interface, where it can add reference images on your behalf, select the quality, whether you want low, medium, or high, and even tweak these additional settings. So, if you want Claude to define the image resolution for you, that is an option. If you want it to generate several images at one time or even tweak the output format, those are all parameters that you can now instruct Claude to set.
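To make those settings concrete, here is a small sketch of how the options described above could be gathered into one request. The field names (`image_size`, `num_images`, `output_format`, `reference_images`) and the default values are hypothetical placeholders, not the provider's real API. Only the three quality levels come from the video.

```python
from dataclasses import asdict, dataclass, field

# The three quality levels mentioned in the video.
ALLOWED_QUALITY = ("low", "medium", "high")


@dataclass
class ImageRequest:
    """Illustrative container for the settings Claude can tweak.

    All field names and defaults are assumptions for this sketch.
    """
    prompt: str
    quality: str = "medium"
    image_size: str = "1024x1024"      # hypothetical default resolution
    num_images: int = 1                # several images in one call
    output_format: str = "png"         # hypothetical default format
    reference_images: list[str] = field(default_factory=list)

    def to_payload(self) -> dict:
        """Validate the settings and return them as a plain dict."""
        if self.quality not in ALLOWED_QUALITY:
            raise ValueError(f"quality must be one of {ALLOWED_QUALITY}")
        if self.num_images < 1:
            raise ValueError("num_images must be at least 1")
        return asdict(self)
```

In practice you never write this yourself, since the skill lets Claude pick these values from your natural language request. The sketch only shows what "parameters" means here.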
And so, now with that skill installed, you can simply tell Claude to use GPT Image 2 and make these sample images, which I'll go through in a bit. I also ask it to run me through the high-level steps that it's about to take, which is optional if you want to understand how it works under the hood. The other thing that I'm going to feed it as a reference is this brand book, which is a one-pager design system that includes our typography options, our color palettes, and so on. If you don't have something like this, I previously published a skill, which I'll also link below, where if you give Claude your website, it will create this PDF for you. Alternatively, you can just use your website link, and that should work pretty effectively. So, I'll copy the path to this so that Claude understands where to find it, and add a note to convert it to JPEG first, because GPT Image 2 does not accept PDFs as inputs. The other thing that I want integrated across these different designs is the Claude icon. So, if I go into Rubrik here, which is our AI command center that centralizes a lot of the references we use, I'll copy the path to the Claude logo and give that as part of this prompt, so that Claude understands where to look for that icon.
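The "convert it to JPEG first" note can be thought of as a quick pre-flight check on the reference files. Here is a minimal sketch that only sorts paths by extension and does not do the conversion itself. The set of accepted extensions is an assumption, since the video only says that PDFs are not accepted, and the file paths in the usage note are made up.

```python
from pathlib import Path

# Assumed list of accepted image extensions. Only "no PDFs" comes from the video.
ACCEPTED_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def split_references(paths: list[str]) -> tuple[list[Path], list[Path]]:
    """Separate reference files that can be sent as-is from those needing conversion."""
    ready: list[Path] = []
    needs_conversion: list[Path] = []
    for raw in paths:
        path = Path(raw)
        if path.suffix.lower() in ACCEPTED_SUFFIXES:
            ready.append(path)
        else:
            needs_conversion.append(path)
    return ready, needs_conversion
```

For example, `split_references(["brand/brand-book.pdf", "icons/claude-logo.png"])` would flag the brand book for conversion and pass the logo through.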
And now, with just one prompt, I'll be asking it to create this magazine spread, pricing cards, a couple of charts, a brand kit, an ad campaign, and merch. If you send that, basically what it will do is plan the prompts for each of those. And because it has that connection with File AI, it will be able to send those prompts to File AI, along with the reference images that we gave it, to generate the images using GPT Image 2. You can see here that it's now outlining the high-level steps for me, because I asked it to: it will prepare the references, convert the brand book into a JPEG, map each of those references to the images that we need, write the prompts, and fire those nine generations for us in parallel.
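If you're curious what "firing generations in parallel" could look like under the hood, here is a rough sketch using a thread pool. The `generate_image` function is a stand-in that just returns a made-up file path. It is not the real call that Claude makes, which the video does not show.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_image(prompt: str) -> str:
    """Placeholder for the real image call; returns a made-up output path."""
    slug = prompt.lower().replace(" ", "-")
    return f"outputs/{slug}.png"


def generate_all(prompts: list[str], max_workers: int = 4) -> list[str]:
    """Run one generation per prompt concurrently and return the output paths."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in the same order as the input prompts.
        return list(pool.map(generate_image, prompts))
```

Threads suit this kind of job because each generation mostly waits on a remote service, so the requests can overlap instead of running one after another.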
And once those images are finished, by default, what Claude Code will give you are the file paths to each of those images, which you can inspect one by one. But if I go to Rubrik, basically what we built here is sort of like a Higgsfield or Google Flow equivalent, where we can see those generations cleanly laid out. So, these look pretty good. I won't go through each of these, since you have probably seen them already in the intro.
Now, when working with GPT Image 2, there are a couple of techniques that you can employ to make the most of it. One technique or hack: let's say you're crafting social media posts. When it comes to these models, it's important to have a lot of variation or volume so that you can direct the model on which generations are actually good. This image is actually just one image, so it cost the same as one image. But because I asked for a 5 by 5 grid of social media posts, I was able to get GPT Image 2 to practically generate 25 images for me. I also asked it to number each of these generations from 1 to 25, so that if the final ad or image I want is, say, number eight, I can just ask Claude to upscale or regenerate that number eight image at a higher resolution. If you're curious about the prompt that I used, it's a simple one where I asked Claude to use GPT Image 2 to make a 5 by 5 grid of cinematic RoboLab social media ads, and to number each cell 1 to 25 in a small white circle at the top left so that I can isolate them later.
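The numbering trick works because each number maps to a fixed position in the grid. Here is a minimal sketch of that mapping, assuming the cells are numbered left to right, top to bottom, and are all the same size with no gaps between them. Neither assumption is guaranteed for a generated image.

```python
def cell_box(number: int, width: int, height: int,
             rows: int = 5, cols: int = 5) -> tuple[int, int, int, int]:
    """Return the (left, top, right, bottom) pixel box for a numbered grid cell.

    Assumes row-major numbering starting at 1 and equal-sized cells.
    """
    if not 1 <= number <= rows * cols:
        raise ValueError(f"number must be between 1 and {rows * cols}")
    row, col = divmod(number - 1, cols)
    cell_w, cell_h = width // cols, height // rows
    left, top = col * cell_w, row * cell_h
    return (left, top, left + cell_w, top + cell_h)
```

For a 2000 by 2000 pixel grid, cell number eight sits in the second row, third column, so `cell_box(8, 2000, 2000)` gives `(800, 400, 1200, 800)`.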
And then, of course, I fed it the file paths for the references as well: the brand book and the Claude icon. Then, let's say you really like image number eight here. You can just say, "Hey, I think number eight turned out really good. Can you upscale that and generate a separate image for it at the highest quality settings, please?" If you send that, Claude will handle all of the prompt generation for you, identify what prompt it used for image number eight, and set the image to the highest quality and resolution, depending on the spec that you want generated.
Now, when it comes to media design, what's great about these images is that they also serve as good first-frame references for video models like Kling 3.0. These ones, for example, I generated using Kling, and you can pretty much follow the same process in Claude as what we just did. If you go back to file.ai and find the model that you want to use, let's say Kling 3.0 image to video, you can upgrade your Claude to have the capability to call on Kling as well. These websites usually have this LLM section at the upper right. Copy the content there, and then, in the same session where you just installed your GPT Image 2 skill, you can say, "Hey, I want to enrich this skill with Kling as well, please. The documentation for that is given by file.ai below. Can you integrate this new model into the same skill, so that I can pass on these GPT Image 2 pictures as references for Kling?" Then paste the whole documentation from file.ai into the chat. If you send that prompt, Claude will enrich the skill with the capability to call on Kling as well.
All right, so now it's done. If you go back to Rubrik here, you can see that the image has been properly blown up, and it actually turned out pretty well. Now, let's say you want to animate this social media post via Kling. You can give Claude that image as a reference and say, "Hey, this image turned out really well. Can you now animate this via Kling? Just make sure that the text is fixed and that the aspect ratio stays the same."
And if you fire that off, Claude will figure out how to call file.ai's endpoint for Kling and animate this image for you. Obviously, we are only doing it for one, but if you want to batch create a whole set of social media posts, you can pretty much automate that just by talking to Claude in natural language. And when that is done, you have a square social media post that is also animated for you.
Now, as a last one, if you are into user interface design, a great use case for GPT Image 2 is to first generate images of the components of the website that you want to build, let's say this one, and then feed them to an agentic platform like Claude Code to build it out for you. To do that, you simply give it the reference image once again.
You can see this PNG file is basically this one, and I just asked Claude Code to design this in HTML with this sort of liquid glass style. I also asked it to guide me on how we can do this whole piece, and to open it on my localhost, basically my computer, so that I can inspect it. Now, you can see how simple this prompt is, right?
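"Opening it on localhost" just means serving the generated HTML from your own machine. Claude handles this for you, but as a rough sketch, this is one way it could be done with Python's built-in web server. The port number and the `index.html` page name are placeholder choices for this example.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_preview_server(directory: str, port: int = 8000) -> ThreadingHTTPServer:
    """Create (but do not start) a local-only server for the given folder."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    # Binding to 127.0.0.1 keeps the preview reachable from this computer only.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


def preview_url(server: ThreadingHTTPServer, page: str = "index.html") -> str:
    """Build the address to open in the browser for a page in that folder."""
    host, port = server.server_address[:2]
    return f"http://{host}:{port}/{page}"
```

To actually view the page, you would call `serve_forever()` on the returned server and open the address from `preview_url` in your browser, then stop it with Ctrl+C when you're done.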
Because if you were designing this whole thing from scratch without an image, you would need to somehow describe the whole piece verbally to Claude. But if you have an image reference, you can just feed it to Claude, and it will get you something like 50 to 60% of the way there in terms of aligning with your agent on what your final intention is. Now, to be clear, because this is an image, it doesn't necessarily translate that well into HTML UI components that you can easily use in web design. Take this one that Claude generated for us: it is pretty much there in terms of the sizing of the components and the elements. It was also able to give us that liquid glass effect to an extent, but it does need further refinement if you want it to look exactly like the image. The other tip that is quite useful is similar to the one about image volume: if you have the bandwidth, you can ask Claude Code to generate several examples and iterations for you. That way, as you browse through these examples, you as the human in the loop can apply your own taste to identify which of them is actually what you are gunning for.
There you go, that is GPT Image 2 now automated by Claude. I hope that was helpful, and if you want to understand more about how Claude Code can act as your 24/7 personal assistant, then you can watch this video next. I'll see you guys next time. Thank you.