ComfyUI Course - Learn ComfyUI From Scratch | Full 5 Hour Course (Ep01)
By pixaroma
Summary
Topics Covered
- Visual Thinkers Master Comfy UI
- Latent Space Speeds Diffusion
- Diffusion Removes Noise Step-by-Step
- Node-Based UI Reveals AI Process
- Portable Install Avoids System Conflicts
Full Transcript
Learning Comfy UI is like opening a technical book at the last page.
Everything is there, but nothing makes sense yet. This course starts from page one. Before we go any further, I want to be very clear about how this course works. This is not a shortcuts course.
It is not about copying workflows without understanding them. Each chapter
builds on the previous one. We start
simple, repeat the important ideas, and only add complexity when it actually makes sense. You do not need any coding knowledge. You do not need to be technical. If you can think visually, you can learn Comfy UI. If you want to understand how AI image generation really works locally and how to use Comfy UI without feeling lost, this
course is for you. My name is Pixaroma and on this channel I focus on creating and teaching Comfy UI workflows in a simple and practical way. I am a graphic
designer not a programmer and that is actually a good thing. Developers are
great at writing code but they often explain things in a very technical way.
This course is designed from a visual thinker's perspective. My goal is to explain Comfy UI logically and visually without needing any coding knowledge.
Even if Comfy UI looks confusing right now, that is completely normal. We will
start from the absolute basics and build up step by step. But before we talk about Comfy UI itself, we first need to understand what AI image generation
actually is. Today, AI is not just one thing. There are many different AI models that can run locally on your own computer, such as Stable Diffusion, Flux, Qwen, and many others. It is also
important to understand that Comfy UI is not limited to image generation. Comfy
UI is a general interface for running many different types of AI models locally. While it is most popular for image generation, it can also be used for audio, music, video, animation, 3D, and more. As long as a model can be connected through nodes, Comfy UI can be used as the interface to control it.
These models by themselves are like an engine. They are very powerful, but you cannot really use them directly. To work
with them, we need an interface. An
interface is what allows us to send prompts, images, and settings to the model and then receive results back.
There are many free interfaces that let us interact with these models. Some
popular ones are Forge UI, Swarm UI, Invoke, Fooocus, and of course, Comfy UI.
They often use similar models but they work in very different ways. In this
course, we are going to focus on Comfy UI. Comfy UI is different because it is node-based. Instead of hiding everything behind buttons and menus, it shows you exactly what is happening step by step.
You can see how prompts, models, samplers, and images are connected together like building a system. Think
of it like this. The AI model is the brain. The interface is how you talk to that brain. Comfy UI is like building your own control panel exactly the way you want. Do not worry if this still feels complex. Understanding comes from seeing things connect, not from memorizing nodes. In this course, I will explain what each node does, why it exists, and how everything connects together. Before we install anything, we need to talk about how you will actually run Comfy UI. There is more than one way to use Comfy UI, and the right choice depends on your system and your
expectations. Let's go to the official Comfy UI website to see the available options. The official website is Comfy.org.
If we go to the products section, you can see that there are two main options, Comfy UI Cloud and Local Comfy UI. Comfy
UI Cloud runs online on their servers and it is a paid service. This option
can be useful if your computer is too old or not powerful enough to run AI models locally. Local Comfy UI is free and runs directly on your own computer, assuming you have a reasonably capable system. This is the option we will focus on in this course. So, let's click on Local Comfy UI. Here you can see three main installation options: Download for
Windows, for macOS, and install from GitHub. All of these options install Comfy UI, but there are important differences between them. In this
course, I will focus on Windows operating system using the portable version of Comfy UI. All the workflows, tools, and installers I show are tested
on Windows using an NVIDIA graphics card. On AMD graphics cards and on macOS, performance is usually slower, and some features or custom nodes may not work exactly the same way. So, if you are using Windows with an NVIDIA card, it will be much easier to follow this course step by step as I show it.
Because there are many different AI models, hardware requirements can vary a lot. Some models are small and can run on a graphics card with 6 to 8 GB of VRAM. Other models are much larger and may require more than 24 GB of VRAM. For
this first episode, I tested the workflows on two different systems. One system uses an RTX 2060 with 6 GB of
VRAM and 64 GB of system RAM. The second
system uses an RTX 4090 with 24 GB of VRAM and 128 GB of system RAM. For the
workflows in this episode, a graphics card with 6 to 8 GB of VRAM should be enough to follow along. In later episodes, we will explore newer and larger models that may require more powerful hardware. Now, let's talk about which version of Comfy UI you should install. As I mentioned earlier, I am using the portable version of Comfy UI.
If we click on install from GitHub, we are taken to the official Comfy UI GitHub page. Here you can find detailed installation instructions, but they require more manual steps and setup.
Over the past year, I have been using a portable version of Comfy UI that includes additional tools to make the installation process much easier. This
installer installs the original Comfy UI, but it also adds helpful tools so you can get up and running much faster.
This installer is called Comfy UI Easy Install. You can find it on this GitHub page. You can find the creator on our Discord community under the username Ivo. Thanks to Ivo for this installer. This
entire course is built around this version of Comfy UI. You can still use Comfy UI Desktop or Comfy UI Cloud, but some things may look different or behave differently compared to what you see in
this course. If you want the exact same setup that I use and the easiest way to follow along, I recommend using the same version. Let me show you how to install Comfy UI. So, we are on the Easy Install GitHub page. This is the complete link.
If we scroll down, you can read more about this installer. Even if you might not understand what each of these things means yet, it will make sense later as you learn more about it. I will talk
later about the Pixaroma Discord server where you can get more help and answers to your questions. So, this installer will install Git, which is a tool that
tracks changes to files in Comfy UI. It
helps developers safely update the main app and custom nodes, fix bugs without breaking everything, and lets you update or roll back to an earlier working version if a new update causes problems.
Then it will install the Comfy UI portable version. A portable version means the program is fully self-contained in one folder, does not need a normal system install, and can be run, moved, backed up, or deleted without affecting the rest of your computer. In Comfy UI portable, this means Python, libraries, models, and settings all live inside the Comfy UI folder. So, you can copy it to another drive or PC, update it safely, and avoid breaking your system Python or Windows setup. Python embedded means Comfy UI comes with its own built-in copy of Python already included inside the Comfy UI folder instead of using the Python installed on your system. Then it
will install all the nodes that are useful and that I tested over the last year. It might not make sense for you yet if you are a beginner, but do not worry. Take it as general knowledge for now and it will make sense later. Then
it will add an add-ons folder with more advanced stuff we can use later to speed up our generation, plus some extra tools that can be useful and then more technical stuff explained for each one.
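As mentioned above, the portable build ships its own embedded copy of Python. If you are curious later, you can confirm which interpreter is actually running with a short standard-library check. This is optional and not part of the installer; it is just a sketch:

```python
import sys

# Which Python interpreter is actually running? In a portable
# ComfyUI install, this path should point inside the ComfyUI
# folder (the embedded copy), not to a system-wide Python.
print(sys.executable)

# The interpreter version, similar to what the ComfyUI console
# window prints at startup.
print(sys.version.split()[0])
```

If the first path points somewhere inside your Comfy UI folder, you know the embedded Python is the one in use.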
But all you need to know for now is where to download it from and how to run the installer. It is important not to run it as administrator. That means you just double-click to run it, not right-click and run as administrator, but you will see that in a minute. Also, avoid system folders and make sure your NVIDIA drivers are up to date, since some things work only with more recent NVIDIA drivers. Okay, let us go back to where it says Windows installation and let us download the latest release from here.
Then depending on how your browser is configured, it will either download it to the downloads folder or ask you where to download it and you can decide where to put it. As you will see over time, it
needs a lot of space if you download big models. So I suggest downloading and installing your Comfy UI on a hard disk that has a lot of free space, and preferably on a solid-state drive
because it will load the models faster.
I will go to my D drive and I will create a new folder called Comfy UI, but this does not really matter. The name
can be anything easy to remember.
Sometimes I put Comfy UI followed by the month so I know when I downloaded and installed it. So I will save this zip archive in that Comfy UI folder.
Now let us go to the place where we saved the file. Since this is a zip archive, we need to unzip it. You
right-click on it and, depending on what you prefer, you can use the Windows integrated option and select Extract All. I like to delete the folder name at the end so I do not end up with a folder inside another folder. When I click Extract, it will extract these two files. Let me delete it really quick and show you. If you have WinRAR like me, you just choose Extract Here and it does the same thing. Once we have extracted everything, you can delete the Easy Install zip file. Now we are left with two files: a BAT file that is the installer, and a zip file that contains extra resources that it will use. When you run it, you might get a security warning. That usually happens with BAT or executable files because they install files on your system. This one is safe. I installed and tested it, and I personally know Ivo, the creator of this installer. You can right-click and scan it with your antivirus and you will see it is clean. So let us double-click on the BAT file, then press Run, and it will
start installing. If you already have Git installed, it will update it. If not, you will get a window like this and you have to press Yes to continue the installation. After that, it will continue the installation of Comfy UI and everything it needs to run. You can take a break for 3 to 5 minutes, depending on your internet speed and your computer. So how do you know when it is ready? You will see a message that says installation complete, along with the time it took. On my PC, it took 247 seconds. After that, you can press any key to exit. So, let us recap really quick. From the GitHub page, you download the zip archive of the easy installer. You create a folder named Comfy UI and place the downloaded zip archive in that folder. You extract the contents of the archive. You run the Comfy UI easy install BAT file, and if it asks to install Git, you press Yes. In a few minutes the installation is complete and you get this screen. Now, after the installation, we can see that inside our Comfy UI folder a new folder appeared called Comfy UI easy install. This is portable, which means you can copy this entire folder and move it to a different drive or folder and it will still run Comfy UI. Basically, after you install any Comfy UI portable version, you should end up with a similar folder structure to this. Since Comfy UI is based on the Python programming language, you will see many Python files and BAT files inside these folders, which are used to run those Python files. The easy installer will also create some shortcuts on your desktop.
If you use other versions of Comfy UI, they might not do this, and you would need to create the shortcuts manually.
That is one of the reasons I prefer the easy installer. It makes everything easier for us. Basically, we just extracted an archive and ran a BAT file, and we now have Comfy UI. If we right-click on this shortcut and go to Properties, we can see the target of the shortcut. If we open the file location, you can see that it is connected to this BAT file that starts Comfy UI. In other versions of Comfy UI, the name might be different, like run NVIDIA GPU or something similar. Are you ready for your first Comfy UI launch? To start Comfy UI, you either use this BAT file called Start Comfy UI or, from the desktop, you use this shortcut called Comfy UI EZ installer. E and Z stand for easy, and I stands for installer. So double-click on it, and when it starts, it looks like this. The first time it will be a little slower, but after that it will start much faster. If you are curious by nature, you can find all kinds of
information about your Comfy UI and your system when it runs. For example, you can see what operating system I am using, what Python version is running, and where that Python is located, what
the path to the Comfy UI folder is, where the user directory is, how much VRAM you have, how much system RAM you have, and the PyTorch and CUDA versions
that are running. When it starts, this command window will be minimized to your taskbar, and Comfy UI will open in your default browser. The first time it opens, it will show you some templates made by Comfy UI that you can load. If you have run a workflow before, it will open that workflow by default. So the workflow you see open is the last one you used. A workflow is a set of connected nodes that tells Comfy UI what to do step by step. Let us close this for now. You can open these templates or workflows from here later. Comfy UI is made of a few main areas. You do not need to memorize them. I am naming them
so we can talk about the same things later. If we go to the top, you can see it says unsaved workflow. Basically, it is like a document that is empty at the moment, since we did not add any nodes yet. You can have multiple documents open, similar to what we have in Photoshop and other programs. We can click on this plus icon to create a new blank workflow. All these tabs on top are open workflows, and we can close, save, and edit each one. Now, this grid-like empty space is called the canvas.
Instead of drawing on it, we will arrange blocks or nodes like using Lego pieces and connect things together to create a working workflow. You can use
your mouse wheel to zoom in and out on the canvas. Then we have this top bar.
Depending on what extensions you have installed, it might look different and have more options. Things like the manager or the run button, which lets us run workflows, are usually here. On the
bottom right, we have view controls. For
example, we have a select tool that lets us select nodes and a hand tool that lets us navigate the canvas. You can fit a workflow in view, but right now the canvas is empty. We also have different
zoom controls that you can use if you do not want to use the mouse wheel or if you do not have one. For me, the mouse wheel is the fastest and the one I prefer. Then we have the show mini map option. This shows a small map that we can use to navigate when we have very large workflows. There is also hide links, but since we do not have any nodes or links yet, we will see that later. An important one is the main menu, which you open by clicking on the letter C, the Comfy UI logo. We also have more options on the left sidebar for nodes,
models, and workflows, which we will explore soon. Back to the main menu. If
we click on the C, we get this menu. New
creates a new workflow, but it is faster to use the plus sign from the top bar.
For file, it allows us to open, save, and export the workflows we create. For
edit, you can undo actions like moving nodes or changing something in the workflow, clear the workflow, and unload models. For view, we can enable and disable different panels. And we also have zoom in and zoom out controls. Just
like in Photoshop, we can do the same things in multiple ways. It is the same with Comfy UI. For theme, you can change how it looks, but at the beginning, I
suggest leaving it on default so it is easier to follow tutorials. Nodes 2 is in beta at the moment of this recording and still has some bugs, so I suggest leaving it off until it is more stable.
You can browse templates and open settings, which we can explore later.
For now, the default settings work fine.
Templates and settings can also be accessed from here. So again, there are multiple ways to access the same things.
In some newer versions, some people might use a newer manager and it might appear somewhere else instead of here.
For now, I am using the old manager which appears here. Under help, you can also find help options, but you will see later in this video how to ask questions
and get help. We also have a console, sometimes called the bottom panel, where you can see exactly what has happened since we opened Comfy UI. If we look at the taskbar and open the command window from there, it shows the same information. One view is at the bottom and the other is in the taskbar. To close Comfy UI, I recommend opening the command window from the taskbar and closing it. You will then see a reconnecting message in the browser, because it cannot find Comfy UI running anymore. After that, you can close the browser window. You can also close the browser window first and then close the command window. It is time to test our first ready-made workflow. Later, I will explain in detail what nodes are and what they do. So let us start Comfy UI, wait for it to finish loading, and get the interface. To open a workflow, you have different options. You can drag a workflow directly onto the canvas, or you can go to the menu, then File, and choose Open. All workflows for Comfy UI have the extension .json.
JSON means JavaScript object notation.
It is a simple text format used to store and share data in a way that both humans and computers can easily read. In Comfy
UI, JSON is important because workflows are saved as JSON files. These files
store all your nodes, connections, settings, and prompts so you can reload, share, or edit a workflow later. I will
include these workflows for free on Discord for those who use a different Comfy UI version. For example, I can open this first workflow and you can see that it opens with all the nodes and
links ready to be used. You can use your mouse wheel to zoom in and out to see the entire workflow. You can click outside the nodes somewhere on the
canvas and drag to move around. You can
also use this hand tool, which I personally never use. With the hand tool, you can pan around the canvas.
With the normal mouse cursor, you can select nodes and move them around. We
will talk more about that later. Now we
have the workflow open in this tab and you can see its name at the top. With
the X button, you can close the workflow and go back to a new empty one. If we go to the sidebar and click on workflows, I can open this folder called getting started which I prepared for you for
this episode. Only the easy installer comes with these workflows. So, if you are using a different version, you can get the workflows from Discord. You can
also make the sidebar wider if you want to see the full text. I added a few workflows here to test in this episode.
This one is just a help file with notes and useful information that we will use later in the video. Let us close it and open the one with number one in front
called Juggernaut Reborn. If I click on workflows again, the sidebar collapses.
Now let us move around using the mouse.
Left click and drag to see it better.
Each of these blocks is called a node.
All nodes are connected to each other using links, those small cables that go from one node to another. Usually, a
workflow is built from left to right.
When you run a workflow, it processes from left to right. If something does not work or the workflow is broken, Comfy UI tells you where the problem is.
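To make that left-to-right idea concrete, here is a toy sketch in Python. This is not real Comfy UI code, and the node names are invented for illustration; it only mimics how each node waits for the nodes it is linked to:

```python
# A hypothetical mini "workflow": each node lists the nodes it takes
# input from, mimicking the links (cables) between nodes.
workflow = {
    "load_checkpoint": [],
    "prompt": [],
    "sampler": ["load_checkpoint", "prompt"],
    "save_image": ["sampler"],
}

# Process nodes left to right: a node can only run once everything
# it depends on has already run.
done = []
while len(done) < len(workflow):
    for node, inputs in workflow.items():
        if node not in done and all(dep in done for dep in inputs):
            done.append(node)

print(done)  # ['load_checkpoint', 'prompt', 'sampler', 'save_image']
```

This is only a mental model; the real Comfy UI engine does much more. But the ordering idea is the same: a node can only run after the nodes feeding it have produced their outputs.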
Think of it like a car dashboard. If a
door is open or a light bulb is not working, you get a warning icon. It is
the same here. Errors look like this. It
says prompt execution failed, and it also tells you something like value not in the list. These are some of the simplest errors to fix. It is like the car telling you a light bulb is missing. In Comfy UI, it means a specific value, object, or file could not be found. In our case, it could not find the checkpoint name, which is the model name, the brain as we called it. Comfy UI workflows include all the nodes and settings, basically the interface, but they do not include the models themselves. Those brain or engine files are not included. Since workflows are just JSON text files, they cannot include images or large files like models. In this node called load checkpoint, the checkpoint is just a model file, the brain we talked about.
Even if I click here, I cannot select anything because it is not in the list.
That means the model is not downloaded yet or it is downloaded but placed in the wrong folder. Since I did not download any models yet, it is clear I
do not have it. That is why when I share a workflow with you, I include a note that tells you exactly what you need to download for the workflow to work. Not
everyone on the internet does this, but most good workflow creators do. The way
I organize it is like this. I tell you where the model needs to be downloaded and which node loads it. It says load checkpoint, which is the node name. Then
it tells you the model name you need to download. There is a button that says here, and then it tells you exactly which folder to place it in and which folder to create if it does not exist. That is
enough theory. Let us download the model. You already saw where it needs to be placed, but how do you find that folder? You need to find your Comfy UI folder. This depends on where you installed it, on which drive, and in which folder. You navigate until you find the Comfy UI folder. In our case, it is inside the Comfy UI easy install folder. If we go inside, we see many folders that Comfy UI needs to run. We have an output folder where generated images are saved. We have an input folder where input images are stored. We also have a models folder where all downloaded models go. Inside the models folder, you can see many subfolders for different types of models. Over time, you will learn what each one is for. That is why I included the note so you know exactly where to put the model without guessing. For this workflow, the model goes into the checkpoints folder.
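As a sketch of that folder logic, here is how the checkpoints path is composed. The base path and folder names are just examples from this episode; yours depend on where you extracted the portable folder:

```python
from pathlib import Path

# Example base path only; adjust it to wherever you extracted
# the portable ComfyUI folder on your own drive.
base = Path("D:/ComfyUI/ComfyUI-Easy-Install/ComfyUI")

# Checkpoint models (the "brain" files) go under models/checkpoints.
checkpoints = base / "models" / "checkpoints"
print(checkpoints)
```

The downloaded model file would then be placed inside that checkpoints folder.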
We could just save it directly there and it would work. But from my previous tutorials, I learned that over time, you will download many models and it becomes hard to keep track of them. That is why
I like to organize models in subfolders.
In this case, I know this model is based on stable diffusion 1.5. So, I will create an SD15 folder and place the model inside it. Now, we wait for the
model to download. Some models are a few gigabytes in size. After that, we go back to Comfy UI. You can see I placed the model exactly where the instructions
said, so Comfy UI can recognize it. If
Comfy UI was closed, reopening it would automatically detect the model. But
since Comfy UI is already open, it will not see the new model yet. We need to refresh it. To do that, press the R key. You will see that the node definitions update. Now, when I click here, I can see the model name and select it. Right now, there is only one model, but later you will have a drop-down list with many options. Now, the model is selected and it is time to run the workflow again. By
the way, you can move the run button anywhere you want on the canvas using the small dots on its side. If you
prefer it docked, you can dock it back to the top bar. Let us run it again and see if it works. Everything turns green.
Each node runs from left to right and no red nodes appear. That means the workflow ran successfully and we generated our first image. The model we use in this chapter is quite old and
small. Later, we will use smarter and more advanced models. For practice, this one is good enough because it is fast and can run on smaller computers that do
not have a lot of VRAM. Each time I press run, I get a new image because we have a random seed here. Do not worry about this yet. I will explain it later.
So now we can generate an image with Comfy UI, and all of this comes from a simple text called a prompt. Basically, we used a few nodes with specific settings and a model trained for this type of image generation. We can change the prompt, for example, photo of a cat closeup. Now, when I run the workflow, I should get a cat. The more VRAM you have, the faster it will generate. We can see the generated images here, but they are
also saved locally. If we look at the output folder, we have a shortcut to it on the desktop. Inside that folder, we can see all the images we generated so
far. Let us go back to Comfy UI and close this workflow. I do not want to save it because I liked the prompts and settings it had before. So, I choose No. Now, we are left with an empty workflow, or you can click on the plus sign to create a new blank workflow. Before we
move to the next chapter, I recommend taking a short break. Research shows
that short pauses help your brain process and retain new information. Grab
a coffee, get some water, or take a quick bathroom break, then come back and continue. This chapter is about understanding the building blocks of Comfy UI and how they connect to form a workflow. We are in Comfy UI and we have this blank canvas and workflow. To add a node, you double-click on the canvas, and it will open a search box that lets you search for a node. For example, if I type the word load, it shows me load image, load checkpoint, and all kinds of nodes that let us load something. If we click on the load image node, it will be added to the workflow. The position where it is added depends on where you double-click on the canvas. You can also move it after. You just left-click on a node, hold the left mouse button, drag it to where you want it, and then release the button. To deselect a node, you just click anywhere on the empty canvas. For me, that is the fastest way to add a node. But there are other methods. For example, I can right-click on the canvas in an empty area and get this menu. From here, I can go to add node. Then I see different categories. If I click on the image category, I can find the load image node. It is right here. And if I click on it, it gets added to the canvas. After that, you can move it and arrange it wherever you want. Another method is to use the node library in the left sidebar. Here we have all these categories. If I click on the image category, I can see the load image node. This is a good option if you do not know exactly which node you are looking for. You can also search for a node here to filter the list, then add the node or drag it onto the canvas. Out
of all these methods, my favorite is still the double click on the canvas.
Once a node is selected, you also have the option to delete it using this icon.
You can also use the delete key or the backspace key to delete a node after you select it. The load image node is how we
select it. The load image node is how we bring an existing image into Comfy UI so other nodes can work with it. Each node
has a title at the top that tells you what it does. Below that, it has controls, inputs, and outputs that connect it to the rest of the workflow.
Let us double-click on the canvas again and add another node. This time, I will search for crop. And we get this node called image crop. You can probably guess what it does. It crops the image that we loaded. You can change the image using this button and upload any image you want. If something goes into a node, it is called an input. If something comes out of a node, it is called an output. The load image node has two outputs but no inputs, because the image comes directly from your computer, not from another node. The image crop node has one image input and one image output. It receives an image, modifies it, and then sends out a new image. If we left-click on one of the outputs from the load image node, we can drag a connection or a cable to the next node and connect it. Because the output and the input have the same color and the same name, it is easy to see that they belong together. In most cases, connections work between the same colors, and different colors usually do not connect. There are a few special cases, but we will talk about those later. Now, if I try to connect the green output, it does not connect. That is because the green output is a mask, and the input on this node expects an image, which is blue. In the beginning, this color system helps you quickly understand which nodes can be connected. If two nodes cannot connect, it usually means they are not meant to be connected. Sometimes you will also find nodes that act like adapters or converters. These nodes take one type of output and convert it into a different type so it can be used by another node.
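This color and type matching can be pictured as a simple rule. Here is a toy model in Python of how typed connections behave; this is only an illustration, not ComfyUI's actual code, and the connect helper is made up:

```python
# A toy model of typed connections: a link is only allowed when the
# output type matches the input type (illustration, not ComfyUI code).
def connect(output_type, input_type):
    if output_type != input_type:
        raise TypeError(f"cannot connect {output_type} to {input_type}")
    return (output_type, input_type)

print(connect("IMAGE", "IMAGE"))  # an image output plugs into an image input
try:
    connect("MASK", "IMAGE")      # a mask output does not fit an image input
except TypeError as err:
    print(err)                    # cannot connect MASK to IMAGE
```

An adapter or converter node is simply one whose input type differs from its output type, so it can sit between two nodes that would otherwise not connect.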
Now, basically, we have a workflow, but is the workflow complete? How can we test it?
It is simple. We run it and see what message we get. In this case, it says the prompt has no output. Even if you do not understand exactly what that means yet, try to figure it out from the words. We do not have an output node. So let us close this message. If we look at the workflow, the image is loaded from the computer. Then it goes into the image crop node, which crops the image. But after that, nothing happens. There is no output. Think of this like editing a photo in Photoshop. You load an image, crop it, but if you never save or export it, the work exists, but you do not get a file. In Comfy UI, the save image node is the export step. So let us double-click on the canvas again and search for save. We have this save image node. We can see that the image output color matches, so we can connect it. Even if the label says image or images, it still works. Now let us make the connection. If we run the workflow, it will process from left to right. The image is loaded, cropped, and then saved in the output folder. We can also see it directly inside the save image node. All nodes can be resized using the corners.
You will see small arrow indicators on the corners. You can click and drag to resize a node. In this case, I want to see the image preview bigger, so I resize the node. To remove a connection, you can left-click on the output dot, drag the cable out onto the canvas, and it will disconnect. You can also left-click on the small dot in the middle of the connection and choose delete. You also have the option to add a reroute. A reroute is like an extension cable or a cable organizer. It does not change the data at all. It only helps you route connections more cleanly and keep your workflow readable. From that reroute node, you can add another link to another node if you want. You can also have multiple reroutes on the same link, so you can arrange nodes, links, and reroutes in a way that looks visually clean or helps you see faster which node connects to which node. This is very helpful when you have a lot of nodes in a workflow. To remove a reroute, you just select it and press the Delete key.
Let me arrange them and remove all links so we can see it better. So in Comfy UI, we have different types of nodes. First, we have nodes that only have outputs. These nodes usually load something from outside Comfy UI, like a file or some text. In this case, the load image node loads an image from your computer, so it does not need any inputs, only outputs. Then we have nodes that have both inputs and outputs. These nodes are usually placed in the middle of a workflow. They receive something from one node, process it, and then pass the result to the next node. Finally, we have nodes that usually sit at the end of a workflow. These nodes only have inputs, and their job is to show or save the result, for example, by previewing an image or saving it to disk. There are also nodes that do not have any inputs or outputs at all. For example, if I search for a note node and add it to the canvas, you can see that it is only informational. These nodes are used to write notes and make workflows easier to understand and remember. They do not affect the workflow at all and are just for organization and clarity. On the top left side of a node, next to the title, you have a small gray dot. If you click it, the node collapses, similar to minimizing it. I often do this for nodes where I already know the settings and do not need to change them. Collapsing nodes helps save space and makes the workflow easier to read. If you right-click on a node, you get a menu called the node context menu. This menu shows options related to that specific node.
Each node has a slightly different menu depending on what that node does. In this case, we have options like opening, saving, and copying the image, different properties, resize, and colors. We can also collapse the node from here instead of using the gray dot. And there are many more options. Try a few of them. If you do not like what you did, you can undo it with Ctrl + Z. You can also change the title of a node. If you double-click on the node title, you can rename it to anything you want. This does not change how the node works at all. It is only for your own organization and to make the workflow easier to understand. You can also right-click on a node and choose title to rename it. This is another way to change the node name. We already know that we can move a node around once it is selected.
But when a node is selected, you will also see a small floating bar at the top. From here, you can delete the node using this icon. You can also click on this dot to change the node color. This lets you choose from different colors, which is useful for organizing your workflow or grouping nodes by function. Changing the color does not affect how the node works. By default, there is no color, the gray one. This small eye icon is the node info. If you click it, a properties panel opens on the right with more information about the node. Here you can see what the node is supposed to do and what the values mean. From this icon, you can also close the properties panel. All nodes, especially the default Comfy UI nodes, should have some kind of info, unless it is a custom node and the creator did not add any documentation. You can drag this side panel and resize it the way you like. Here you can see all the information about the image crop node and what each setting does. Let us close it for now. If you hover over an icon, you can see more information about what it does. For example, because this node works with images, it lets you open it in the mask editor. If we click on it, you can see that it opens in the mask editor. This will be useful later when we do inpainting and image editing, but that is for another episode. For now, it is enough to know that it is here and you will learn more about it over time. These numbers are just settings that you can change in Comfy UI. They are called parameters.
Parameters control how a node behaves and how it processes its input. Let me reconnect the nodes. So, we have a working workflow again. Now, if I run it, you can see what these parameters actually do. We are cropping a 512 x 512 pixel area from the original image, starting from the X and Y coordinates set to zero. That means the crop starts from the top left corner of the image. So basically we are taking this small corner from the original image. The original image was 1,024 x 1,024 pixels, and now the result is 512 x 512 pixels. Even if it looks bigger here in the node preview, it is not actually larger. That is just the preview size. The real image resolution is smaller. So let us change some parameters, or settings, or however it is easier for you to remember them. Values is also fine. And run it again. Now we have a different crop. Let me speed up the video while I try different values so you can see how the result changes.
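What those four parameters mean can be sketched in a few lines of Python. This is a conceptual illustration with a made-up crop helper, not ComfyUI's implementation: x and y pick the top-left corner, and width and height pick the size of the region.

```python
# Conceptual sketch of the image crop node's parameters
# (hypothetical helper, not ComfyUI code).
def crop(pixels, x, y, width, height):
    # pixels is a list of rows; take `height` rows starting at y,
    # and `width` values from each row starting at x.
    return [row[x:x + width] for row in pixels[y:y + height]]

# A fake 1024x1024 "image" where each pixel stores its own coordinates.
image = [[(r, c) for c in range(1024)] for r in range(1024)]
region = crop(image, x=0, y=0, width=512, height=512)
print(len(region), len(region[0]))  # 512 512
```

Changing x and y moves the corner where the crop starts, which is exactly what you see in the video when different values are tried.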
Let us remove this middle node, the image crop node. Once it is removed, something interesting happens. Comfy UI tries to keep the workflow connected and automatically reconnects the nodes directly. That happens because the output of the first node and the input of the last node use the same type. If the first and last nodes had different input and output types, the connection would disappear when the middle node is removed. Now let us double-click on the canvas. You should remember by now that every time you see this search bar, it means I double-clicked on the canvas. Let us search for invert and select invert image. This node has an image input and an image output, but it does not have any parameters. That is because this node is designed to do one specific thing: invert the image. Even without parameters, the node still performs a function. Let us connect this node into the workflow. Watch what happens when I connect it to the input. You can see that the previous connection is removed automatically. That is because an input can only have one connection at a time.
An output, on the other hand, can connect to multiple nodes. You can think of it like electricity. An output is like a power strip. It can send power to many devices. An input is like a wall socket. It can only accept one plug at a time. If we run the workflow now, we can see that the result is an inverted image. So until now, with these small workflows, we did not use any AI. We only used simple nodes, like simple code, to modify images.
We will see more in the next chapter when we build a bigger workflow that uses Stable Diffusion to generate an image from text. But these small steps help you understand how things work. At least I hope they do. You can always ask any questions you have on Discord, as we will have a special section for this episode on the Discord forum. So we learned that save image is usually the last node in a workflow, since it does not have any outputs, and because the output is an image that goes to disk, not to another node. But that does not mean we cannot continue the workflow. It only means we cannot continue from that node. We can still continue from the previous node, which has the same image, just not saved to disk yet. Let us clone this node and use it again. One simple way is to press the Alt key and drag a copy of this node where you want it. Let us delete it and try again. Now we will use Ctrl + C to copy. And when I use Ctrl + V, it will paste that node where the mouse cursor is. Let us delete it again. And now let me show you another shortcut. Press the Control key and make a marquee selection over the nodes you want to select. Now we selected two nodes. With Ctrl + C, we copy all selected nodes. And with Ctrl + V, we paste them. If you click and drag from any of the selected nodes, you can move them together. If you press Delete while both are selected, it will remove both. If we use Ctrl + Shift + V, it will paste the nodes together with the links they had in the workflow. Now we have this extra link here. So basically, from one image, we got two invert image nodes, and both do the same thing: invert the image. If the invert image node had more parameters, we could change the settings in one and get different results. Let us delete those again.
Practice this a few times. Press Control and select the nodes. Press Ctrl + C to copy and Ctrl + V to paste. Move them into position. Now look at what I am doing. I am continuing the workflow from the last invert image node, and then I save the result. A workflow can have many branches, like a tree. The root starts with the image. Then it inverts the image, and from there, on another branch, it inverts it again. Can you guess what happens when I press run? The image is inverted again and looks like the normal image. The original image is inverted. Then that inverted image is inverted again, and we get the original result. We can continue the workflow even more.
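The double inversion above can be shown at the pixel level. This is a conceptual sketch assuming 8-bit channel values, not ComfyUI's code: inverting replaces each value v with 255 - v, so applying it twice gives back the original image.

```python
# Pixel-level sketch of an invert node (assumes 8-bit values, 0-255).
def invert(pixels):
    return [[255 - v for v in row] for row in pixels]

original = [[0, 128, 255], [10, 200, 30]]
print(invert(original))                      # [[255, 127, 0], [245, 55, 225]]
print(invert(invert(original)) == original)  # True: two inversions cancel out
```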
From the image that was inverted twice, we connect it to an image crop node. Now, instead of double-clicking and searching for a node, we can drag a connection and release it. When we do that, a context menu appears with suggested nodes. From here, I can easily pick the save image node, and it is added already connected. Let us delete it and try again. Drag the connection and release it. Then select search. When I select the save image node, it is added already connected. Let us place the nodes properly and run the workflow. You can see how many operations are now in this workflow. With a single image, we can invert it, invert it again, crop it, and save it.
This is similar to a small program or an action in Photoshop, but with more control and much more flexibility. There are nodes for images, audio, 3D, and many other things. This is where you start to see the power of Comfy UI. Now let us select everything. Hold Control and drag to select all nodes. You can press Delete to remove everything, or let us cancel that and do it another way. Go to the menu, then edit, and choose clear workflow. It will ask if you want to clear it. Click okay. And now we have a blank workflow again. Do you like math?
I know you do not like it, but I just want to show something quick to see the different things it can do and help you understand Comfy UI better. Double-click on the canvas and search for math. You will see a few math nodes. If we look on the right, you can see different names like Comfy UI core, KJ nodes, easy use. These are the names of custom nodes or extensions. By default, Comfy UI comes only with the nodes you see under Comfy UI core. With the easy installer, you also get a few extra custom nodes already installed. We will talk more about custom nodes later when we get to the manager. If you use the easy installer like I showed at the beginning of this video and install the same version, you should have the same nodes. So again, Comfy UI core nodes are made by the Comfy UI team. Let us get back to the math nodes. We will start with something simple called math int. Int comes from the word integer, which means whole numbers. This node works only with whole numbers like 1, 2, 10 and so on, not decimals. All custom nodes have an extra label on the top right that shows which custom node pack they belong to. This makes them easy to spot compared to built-in nodes. These math nodes are used for simple calculations, similar to a calculator. I personally do not use math nodes very often, but they can be very useful for automation. For example, you might load an image, read its width or height, and then use math nodes to calculate a new value based on that size. This allows you to automatically adjust things like resolution without manually changing numbers every time. In this case, we have letter A, letter B, and an operation. The default operation is add.
Let us set A to 5 and B to 3. For the operation, we will leave it on add for now. Let us add another node that I use often, called preview as text. You can see it comes with Comfy UI. This is one of those special nodes I mentioned earlier that can be connected to almost anything. Even if other nodes cannot connect directly, this node will convert the value to text and display it. If I run the workflow, you can see the result. Even though they look the same, one is actually a number and the other is a text display of that number. This makes more sense if you have coding experience, but we will not get into technical details here. What is important to remember is that we can use this node to see a result as text. It also has options for how the preview is displayed. Let us move this node down and make a copy of the math int node.
Now remember, we have a result in this node, but it is not visible unless we use a node to preview or save it. Here is something interesting. We do not see any inputs on this node, but when we drag a link to it, two input dots appear. This means we can actually connect values directly to these fields. You will see this behavior with many nodes that have number fields or text fields. We can copy a value from an output and feed it into these fields to use it in the workflow. Notice how the field for letter A is grayed out. That means it is no longer using a manual value. Instead, it is taking the value from the previous node, which is 8. Now let us change the operation to multiply. We now have 8 multiplied by 3. Let us add another preview as text node to see the result. When we run it, the result is 24, as expected. Let us remove that preview node and arrange the layout. This small workflow does something simple. It adds two numbers, and then the result is multiplied by three. That three could also come from another node, and so on, until you build more complex workflows.
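The whole math workflow boils down to two operations chained together. Here it is as plain arithmetic, using the values from the video; this is just the math, not ComfyUI code:

```python
# The small math workflow: an "add" node feeding a "multiply" node.
a, b = 5, 3
added = a + b        # the first math int node, operation set to add
result = added * 3   # the second math int node, operation set to multiply
print(result)        # 24
```

The point of building this with nodes instead of typing the numbers is that any of these values could come from another node, for example an image's width or height.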
If I change the value to four and run it again, we get the correct result for that formula. I hope this was not too much math. Now let us select the middle node that does the multiplication and right-click on it. We have a function called bypass. When we enable bypass, the node is temporarily ignored, as if it is not part of the workflow. By the way, you can also access bypass quickly from this icon when the node is selected.
Now, if I run the workflow, you can see it ignored that node and only did the addition. If I enable it again and run the workflow, it takes that node into account again. You can see that the node changes color and becomes purple and semi-transparent. This visual change tells us that the node is deactivated. Bypassing a node is useful when you want to test a workflow without removing the node completely. Hold the Control key and select all three nodes. Now, if we right-click on an empty area of the canvas, we have the option to add a group. If we choose add group, it will create an empty group. But since we already selected the nodes, it is better to choose add group for selected nodes. This creates a group that contains all those nodes. You can think of a group like a folder that holds multiple nodes together. One very important thing to remember is that if you want to move the group with all the nodes inside, you need to drag it using the group's top bar. If you select and move an individual node, it will move outside of the group. Groups can also be resized. You can see a small triangle in the corner that lets you change the size of the group. If you right-click on a group, you also have the option to bypass all the nodes inside it. This is very useful when you have multiple workflows on the same canvas. For example, you can deactivate one workflow and enable another so only one workflow runs at a time. This becomes important as workflows get more complex and models get larger, because running multiple workflows at once can require more resources than your system can handle. If you double-click on the group title, you can change the group name. Enough with math. Let us work a little with text as well. When we use AI, we give it prompts. And sometimes it helps to combine text from different sources to get a better prompt.
Now I am searching for concat. And you can see that there are multiple nodes with similar names. That is because concatenate is a general concept, and it exists for different data types. This one here, concatenate, works with strings, which means text. It simply takes multiple pieces of text and joins them together into a single string. That is why I added this cat made from multiple pieces joined together, to make it easier to remember. Even if you search for cat, you can easily find the concatenate node. Let us add it to the canvas. For example, for string A, I add the word home, and for string B, the word car. When I connect them, the output becomes a single piece of text, a string. Let us drag a connection from that string and search for a node that can preview it. We can use the same preview as text node again. Now, because the first workflow is bypassed, it will only run this workflow with concatenate. You can see how it joined those two words, first home, then car. We can also use a delimiter. For example, I can add a space here and run it again. Now the result has a space between the words. Or I can add a comma and a space, and now the result looks like proper text with separation.
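If you have never seen concatenation before, here is the same idea in plain Python, with and without a delimiter; the node does essentially this with its string fields:

```python
# String concatenation, with and without a delimiter.
a, b = "home", "car"
print(a + b)               # homecar   (no delimiter)
print(" ".join([a, b]))    # home car  (space as delimiter)
print(", ".join([a, b]))   # home, car (comma plus space as delimiter)
```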
Let us move this node down and hold Alt and drag a copy of the concatenate node. You can move nodes around to make room so the connections are easier to see. Here we have these text fields. And like you saw with the math nodes, we can connect outputs directly into these fields if they are the same type. When I drag a link from the string output, you will see input dots appear, showing that I can connect there. I can connect to the first field, the second field, or even the delimiter. Let us connect it to the first field. So now the home and car result becomes the first input. And for the second field, let us add the word flower. I will hold Alt again and drag another duplicate since that is faster. Then I connect it. Can you guess the result? We now get home, car, and flower. So basically this is how people create workflows. You connect nodes like Lego pieces. Some nodes can be connected together because they share the same input and output types, and you get a result. Over time, you can build more complex workflows that can save you a lot of time.
Let's add another node. Double-click on the canvas and search for primitive. This node is called primitive because it represents the most basic types of values. Things like numbers, text, and true or false values are considered primitive values. The primitive node is used to manually create a value inside Comfy UI instead of getting it from another node. You can use it to type in a number, write some text, or define a simple value that can then be connected to other nodes. Think of it like writing a value by hand and injecting it into the workflow. You can see here it says connect to the widget input, so we can drag a link from there. Now you can see we have a lot of inputs where we can connect this value. If we look at this text, notice what happens when the connection is complete. It changes to the type of value that was connected, a string. Now we can manually insert any text value there. When we run the workflow, the result will include that value. This is useful because sometimes you want to use the same value in multiple nodes. Instead of typing it manually each time, you add a primitive value once and connect it to all the inputs that need it. Let us right-click on this group. Usually nodes have a bypass option to disable or enable them. But for groups, this option is called set group nodes to always. Now the nodes inside the group are enabled, and we can run that workflow if we want.
another primitive node to see how it adapts. Last time when we connected a
adapts. Last time when we connected a primitive node to a text field, it automatically converted the value into a string because that input expected text.
Now if we connect a primitive node to a math int node, it adapts differently.
This time it is converted into an integer value. You can see that now we
integer value. You can see that now we can only enter numbers. It does not allow text because this node expects an integer. Right now the value is set to
integer. Right now the value is set to five. Let me resize the node so we can
five. Let me resize the node so we can see it better. You can clearly see that this one is an int and the other one is a string. If we change this number and
a string. If we change this number and run the workflow, Comfy UI will rerun all the workflows on the canvas using the new values. In this simple example,
it runs almost instantly. But later when we use larger models, you will see that some workflows can take minutes to generate instead of seconds. So if you look at all the nodes in these
workflows, you can clearly see that we use this easyuse custom node and all the rest do not have that label. That means
they come with comfy UI by default. Let
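The way the primitive node adapts can be pictured roughly like this. This is a toy illustration, not how ComfyUI is implemented, and the adapt helper is made up: the same raw value behaves as text or as a whole number depending on what the receiving input expects.

```python
# Toy model of a primitive value adapting to the input it feeds
# (made-up helper, not ComfyUI code).
def adapt(value, expected_type):
    return expected_type(value)

print(repr(adapt("5", str)))   # '5' -> stays text, like a string field
print(repr(adapt("5", int)))   # 5   -> a whole number, like a math int input
```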
Let us go to the menu, then edit, and choose clear workflow. Now I want to do a quick recap, just to make sure you assimilated some of the basics. We double-click on the canvas to bring up this search option, so we can search for nodes. You type a word to search, like load, and then you can select a node. For example, the load image node. This node is used to load an image from your computer. If we click choose file to load, we can navigate our computer and load an image. By the way, I asked EVO to include some images for this first episode in the input folder, so you can have the same images I am using. The path is the Comfy UI easy install folder, then Comfy UI, then input. Let us say I select this helmet, but it can be any image. We can choose open, or we can double-click on the image and it will open. Now that the image is loaded, let us add another node, the image crop node, and connect it from left to right, from output to input. To remove a connection, you just drag from the output and release it somewhere on the canvas. It is like unplugging a cable and leaving it on the floor. Let us redo the connection. You can also click on the small dot in the middle of the connection and choose delete, and it does the same thing as unplugging.
Let us connect it again. We can hold Control and select multiple nodes by dragging with the left mouse button pressed. This lets us select multiple nodes at once. You can also hold Control and click on the nodes one by one. If you select a node by mistake, just click on it again to deselect it. Once nodes are selected, we can move them together. If you plan to move them a lot, you can also add them to a group. If you click on the canvas and drag, you are moving the canvas itself. This is useful when you have long workflows and want to see different parts of the workflow. Let us move these nodes to the left. Now, double-click on the canvas and add a node called preview. This time, it is not preview as text. We could add that too, but it would show numbers. Here we add preview image. This node is similar to save image, but it does not save the image in the output folder. It is useful when you just want to preview parts of a workflow and do not need to save the image. If you like the image, you can still save it. You can right-click on the image and choose save image, then save it anywhere you want on your hard disk.
Let us cancel that and remove the preview image node. This time let us add a save image node so we can save the result and then run the workflow again.
Now the result is saved. Let us go to the desktop. The easy installer comes
the desktop. The easy installer comes with a shortcut to the output folder. We
double click on it and now we are in the output folder. You can see the path at
output folder. You can see the path at the top. Here you can see all the images
the top. Here you can see all the images generated with Comfy UI. These images
are saved in this folder. You can delete them, move them to different folders, or organize them however you want. I
usually pick the images I need, move them into the project folder, and then delete the rest because over time you will end up with thousands of images.
Here we can see the helmet image we just generated, but not the previous one from the preview image node. If we go back one folder to the main Comfy UI folder,
we can see a temp folder. Comfy UI uses this folder to store preview images temporarily. The contents of this temp folder are deleted when you start Comfy UI again. So, you can still recover preview images even if you did not save them right away, as long as you did not close Comfy UI yet. We can collapse a
node using the top left gray dot and click it again to expand it. Once a node is selected, it has multiple options at the top. One very useful option is the info icon, which gives you more information about the node. You can
close the properties panel from here. We
can change the color of the node. And
this symbol here is for subgraphs.
Subgraphs are a bit more complex. So
maybe in a later chapter or another episode, we will talk more about them.
This arrow lets you bypass the node. And
if you click on these dots, it shows even more options for the node. For
example, you can change the shape, change the color, or pin the node so it is fixed and cannot be moved. If we move these nodes apart to see the links, we
can also hide the links from here. Be
careful with that because it can look like no nodes are connected. I never use this option because I like to see how nodes are connected. It helps me
understand the workflow better. If you
do not like how the links look, there are ways to change their shape. I like
the default look, but some people prefer other options. If we go to the bottom left or open the menu and go to settings, we can change this. Let us
click on settings. Here we have many settings we can change. Let us search for link since we want to change how links are displayed. You can see the current one is called spline under link
render mode. If we change it from spline to straight and close the settings, you can see the links are now straight, but they still adapt when you move the
nodes. Let us go to settings again and change it to linear. Now the links are always straight lines.
Let us change it back to the default which is spline. Now let us remove all the nodes.
I personally prefer the spline view because it reminds me of sci-fi scenes with lots of cables hanging around.
Let us double click on the canvas and add a load image node again. Now let us add another node and search for upscale image. We will select the node called upscale image by and move it closer so we do not waste space on the canvas.
Then we connect it to the workflow and add a save image node at the end. So we
have a complete workflow. If we run this now we get the same image as before.
That happens because some nodes have default values that do not change anything. They only start doing something once you change their parameters. In this case, the upscale value is set to one. Scaling by one is like multiplying a number by one. You
get the same result. Now, let us increase the scale by value to two. When
we run it again, the image is upscaled by two times. So, we get double the resolution. This is similar to resizing an image in Photoshop. In this case, it does not use AI to add new detail. It
simply enlarges the image. These upscale
methods are different ways of resizing an image. Nearest exact copies pixels exactly, so it is very fast and keeps hard edges, but it can look blocky. Bilinear smooths pixels together, giving a softer result that can look slightly blurry. Area is mainly meant for downscaling and is not ideal for upscaling images. Bicubic uses more surrounding pixels to produce smoother and better looking results. Lanczos preserves detail and sharpness the best, but it is slower than the others.
Let us say you do a lot of changes to a node like titles, values, and colors, and you forget how the default values were. You can add the same node again and redo all the values and connections, or you can right-click on the node and choose fix node, recreate. If you select that, the node goes back to its default
state. If you right-click again, you also have the option to clone the node and move that clone wherever you want. You
can also do this faster by holding alt and dragging the node. You can remove a node from this menu, but pressing the delete key is faster. We also have this
pin option. If you use it, a pin appears at the top of the node. And now if you click and drag, the node does not move.
It is pinned to the canvas. To move it again, you need to right-click and select unpin. Sometimes when you get workflows from the internet, some people stack many nodes on top of each other and pin them. It can look like there are only two nodes, even if there are 10 nodes behind one. I do not recommend doing this. If you do not want people to use your workflow, just do not share it.
We also saw that we need to change values on some nodes for them to work.
If we bypass a node, the workflow still runs and ignores that node. The links
are still there and the connection passes through the node. This is
important because there is another mode where the connection does not pass through. Let us enable the node again using bypass. Now right-click on it and go to mode. You will see the option always which means the node is active.
There is also an option called never.
This mutes the node and makes it behave as if it does not exist. You can see that the node is now gray, not purple like bypass. When we run the workflow, we get an error. That is because the node is not passing anything through. So
the next node does not receive the image it expects. It is like cutting the cable where that node was. Let us remove that node and delete the link as well. If we
run the workflow again, the result is the same. The image is missing because there is no connection. Let us go to the menu then edit and click undo. You can
also use Ctrl + Z multiple times until you get back to the state you want. I
will undo until everything is active again. Let us zoom out with the mouse wheel and add a group. Name your group in a way that explains what the nodes do. Do not name it something generic like my group. You can move the group around and you will notice that it works like a magnet. If a node is inside the
group area, it stays inside. Let us make the group larger and move it around. So
you can see that nodes are sticking to it. Now adjust the group size and move the nodes so it looks cleaner. Workflows
can get quite large, so I like to optimize the space to make them easier to read and navigate. Once the nodes are positioned, hold control and select all the nodes. You will see that only the nodes are selected, not the group. Now
right-click and choose fit group to nodes. The group will resize to fit the nodes tightly. Now we can move the group and it does not take much space. Groups
can have more options especially when using custom nodes like rgthree. If we go to settings in Comfy UI, we have general settings, but we also have settings for custom nodes. For example, for rgthree, we have extra settings here. I can click this button to open them. rgthree is installed when you use the easy installer.
If you install Comfy UI manually, you need to install rgthree from the manager. We
will talk more about custom nodes later.
You can also access the same rgthree settings directly from here, which is faster. If we scroll down, we have settings for groups. For example, there is an option called show fast group toggles in group headers. Let us enable
it. You can choose when to show it always or on hover. I will leave it on hover and save the settings.
Now when I hover over the group, you can see extra buttons in the top right. From
here, we can bypass all the nodes in the group easily. We also have an option to mute the group. This is similar to setting nodes to never. When a group is muted, the nodes inside it do not run at all and the workflow behaves as if they
do not exist. Bypass still lets the workflow run through the nodes. Mute
does not. Let us make the group active again. We can also run the workflow using the play button on the group. We
can change values, for example, use smaller values to get a smaller image, maybe half the size. There are many more things you can do with groups and switch style nodes, but we will cover those in
later episodes. I told you at the beginning to leave the Nodes 2 option turned off. At the moment of this recording, it is still in beta and has some bugs. Maybe over time they will fix everything and it will become stable. If
I activate it, you can see that it changes how the nodes look. For most
nodes, things still work in a similar way, but this change exists. So, they
can add more functionality to nodes.
With the current system they use, there are limitations in what nodes can do.
And the new node version should give them more possibilities to build better nodes. Instead of the gray dot, you get an arrow that points down and then to the right when the node is collapsed.
The inputs are placed on the edge of the node and some nodes have more options.
For example, in load image, you can see previews of images from the input folder. And you can also browse for another image on your disk. However, for older workflows, this can slightly change node sizes and mess up the layout. Some nodes might not work yet until the node creators update them.
Because of that, until everything is fixed and stable, I recommend leaving nodes 2 turned off. Just a quick reminder that from mode you can mute a node by setting it to never. This is
useful when a workflow is big and has a lot of branches. You can mute a branch of that workflow and it will still run without errors as long as there are no nodes after the muted ones that expect
an input. To turn the node back on, you go to mode and choose always. We also
have shape options for nodes, but these are only decorative. They just change the corners of the node. By default, the corners are rounded. There is also the
card option which rounds only two corners. Personally, I do not think it is worth spending much time on this. To
remove a group, you can select it and press the delete key or you can right-click on it, choose edit group, and then remove. This only removes the group container, not the nodes inside it, unlike folders in other systems. Let us select these two nodes while holding control and press delete to remove them.
Then add an image crop node and connect the link to it. After that, let us add a preview image node. Since we are only testing with these settings, we get a
crop from the top left corner of the image. Let us arrange the nodes. Then hold control, select both nodes, and press Ctrl + C to copy and then Ctrl + Shift + V to paste them with the links.
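The image crop node simply cuts a rectangle out of the picture, defined by a width, a height, and x/y offsets. A rough sketch of the same idea in Python with Pillow, using hypothetical coordinates on a blank 512x512 canvas:

```python
from PIL import Image

def crop_region(img, x, y, width, height):
    """Mimic the image crop node: cut a width-by-height box starting at (x, y)."""
    return img.crop((x, y, x + width, y + height))

# A blank 512x512 stand-in for the loaded image.
img = Image.new("RGB", (512, 512), "white")

# Same crop size, different x/y offsets, like the copied branches:
top_left  = crop_region(img, 0,   0, 256, 256)
top_right = crop_region(img, 256, 0, 256, 256)
print(top_left.size, top_right.size)  # both come out as (256, 256)
```

Changing only x or only y slides the crop window across the image, which is exactly what we do next with the pasted branches.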
Since we have them copied, let us paste again to get a third branch. Right now,
all the settings are the same. So, all
three give the same result. We can
change the x coordinate on one to get the top right corner of the image. For another one, let us change the y-coordinate to get the bottom left corner. Now, when we run it, we split the image into three pieces. You could
add another one for the bottom right corner to get the missing part. That is
homework for you to figure out the correct coordinates. Now if we change the input image and run it again, you can see how useful this can be. In a
later episode, we will learn how to load multiple images from a folder and automate this process so we can apply it to all images in a folder. Now select
all the nodes in the workflow using the shortcut Ctrl + A and then press delete to remove everything. It is time for another short break. This has been a long chapter and I want to make sure you have time to absorb the information.
Take a few minutes, press pause, get a drink, or step away from the screen, and then come back. Now, I do not know what learning method works best for you, but
I can tell you one method that usually works very well for video tutorials.
First, watch the entire tutorial from start to finish without stopping too much. This helps you build a general understanding of what is possible and how things fit together. Then watch it a second time, pause the video and follow
along step by step inside Comfy UI.
After that, try to repeat the same steps without the tutorial playing just from memory. Once you are comfortable, start experimenting.
Try changing nodes, parameters, or settings that were not covered in the tutorial. And if something does not work or you get stuck, that is completely normal. You can always go back to the tutorial, rewatch a part or ask questions on Discord. Learning Comfy UI is not about speed. It is about understanding.
It is time to build a workflow from scratch. But first, let us go to workflows. Open the getting started folder and open workflow number one, the one we used in a previous chapter. What
I want to do now is give you an analogy so you can understand what is happening here with all these nodes connected. So
it makes more sense. I will use a note node and add some info next to each node. You do not have to do that. Just
node. You do not have to do that. Just
watch and pay attention. I will open the same workflow but with those notes added next to each node and I will explain each one in detail. You probably noticed by now that when we generate images with
AI, we usually download a file called a model. Sometimes people call it a model and sometimes they call it a checkpoint.
In practice, they usually mean the same thing. A model is the trained AI itself.
It contains everything the AI learned during training like styles, shapes, and how images are formed. The word
checkpoint comes from machine learning.
During training, the model is saved at different points in time called checkpoints. Those checkpoints are what
checkpoints. Those checkpoints are what we download and use. So when you hear model or checkpoint, you can think of them as the same thing, the trained AI
file that does the image generation. In
Comfy UI, you will often see the term load checkpoint, but what you are really doing is loading the model you want to use. We can think of the model as the photographer we want to hire. The load
checkpoint node is the step where we actually hire that photographer.
Depending on what the photographer learned during training, they will be good at different types of photos. That
is why there are so many different models available. Just like in real life, some photographers specialize in portraits, others in landscapes, macro photography, or food photography. AI
models work in a very similar way. The
better and more complex the training of a photographer, the more expensive they usually are. In our case, that cost is not money, but computer power. Larger
and more advanced models usually need more VRAM and a stronger graphics card to run properly. To keep things simple for now, we are hiring one photographer.
We will use a model called Juggernaut Reborn. And this is the photographer that will generate our images. So now
that we hired the photographer, what comes next? We need to give instructions to that photographer about what we want to get and what we want to avoid. These
instructions are called prompts. We
usually use a positive prompt to describe what we want to see in the image and a negative prompt to describe what we want to avoid. In Comfy UI, we use the same node for both. I just
colored one green for the positive prompt and one red for the negative prompt so they are easier to recognize.
The node we use is called CLIP Text Encode. This node takes our written text and translates it into a form that the model can understand. In simple terms, CLIP Text Encode acts like a translator
between human language and the AI. It
turns words into instructions that the photographer can follow during the photo shoot. Besides giving instructions on how the photo should look, we also need to decide how big the photo will be. For
that, we use the empty latent image node. This node is like choosing an empty photo paper before taking the photo. Here is where we decide the width and height of the image. We are defining the size of the photo before it even exists. At this stage, there is still no image. It is just an empty space where the photo will be created. Once the
photo shoot happens, the final image will always respect the size we set here. Now, it is time for the photo shoot. The K sampler node is the photo shoot itself. This is where the photographer follows the instructions from the prompts and uses the empty photo paper to take the photo. Each
different seed is like taking a new photo of the same scene. The idea is the same, but the result is slightly different every time. The K sampler
controls how the image is generated. It
decides how many steps the photographer takes, how much randomness is allowed, and how closely the final photo follows the instructions. You do not need to understand every parameter right now.
What matters is that the K sampler is the core of the workflow where the actual image creation happens.
Everything before the K sampler prepares the photo shoot. Everything after it finishes the photo. After the photo shoot, the image is created, but it is
not visible yet. That is because the K sampler does not produce a normal image.
It produces something called a latent, which you can think of as a hidden version of the photo. It contains the information of the image, but it is not in a format we can actually view. This
is where VAE decode comes in. The VAE
decode node is like the darkroom in photography. The photo already exists, but it still needs to be developed to become visible. So, the VAE decode takes that latent result and converts it into a real image that we can see, preview, and save. Without this node, the workflow can still generate something, but you would not be able to view the final photo because it is still in that hidden latent form. And finally, the
save image node is where the finished photo is delivered to the client. After
the VAE decode step, we usually add a node that either previews or saves the image. Preview nodes let us see the result inside Comfy UI, while the save image node writes the final image to disk. Without one of these output nodes, the workflow has no final result. In our
photo studio analogy, this is the moment where the developed photo is either shown to the client or delivered as the final file. Now, let us zoom out and look at the entire process. First, we
load a model from our disk. This is like hiring a photographer. Then, we give instructions. The positive prompt describes what we want. For example, a close-up portrait of a pet. The negative
prompt describes what we want to avoid.
For example, saying we do not want dogs.
Next, we decide how big the photo should be using the empty latent image node.
This is where we choose the size of the photo before it is taken. Now, let us run the workflow. You can see that all these instructions are passed into the K sampler where the image is actually
created. The K sampler is the photo shoot. It uses steps and different settings similar to camera settings like shutter speed or aperture to decide how the photo is taken. After that, the
image goes through VAE decode where it is converted from latent space into actual pixels. This is like developing the photo in a darkroom and finally we save the image. This is when the photographer delivers the finished photo
to the client. Every image generation workflow in Comfy UI follows this same basic idea even when it becomes more complex. Let us do some quick experiments. What happens if I change the negative prompt, the instructions where we say what we want to avoid? For
example, if I say I do not want a cat, it will probably give me another pet that is not a cat and we might get a dog instead. If we run it again, it is like taking another photo of a pet because the seed is random. Now we can change the seed to be fixed. When the seed is
fixed, each time we use the same prompt, the same settings, and the same seed, we should get the exact same image. If I
try to run it again, you can see that nothing happens. The result would be the same. So, Comfy UI does not even bother to generate it again. If we change a setting like the seed, then it lets us generate again and we get a different image. If we go back to the previous seed, we are back to the same image we had before generated with that seed. Now
that we kind of understand how it works, let us click on this plus sign and build the same workflow from scratch. Double
click on the canvas and search for load.
Usually it is either load checkpoint or load diffusion model. But in some cases there are special loaders for specific models. Now that we have the node, we select the model. Since we did not download more models yet, we only have one, Juggernaut Reborn. So we hired our photographer. Now let us give it instructions. Search for prompt. And we can find this CLIP Text Encode node.
Let us move it next to the other node. I
like to change the color to green for the positive prompt. Right click and clone the node or just hold alt and drag the node to make a copy. For this second one, let us change the color to red.
Again, this does not influence how it works. It is the same node. It is just visual. For the positive prompt, I will add close-up portrait of a pet. For the
negative prompt, I will add cat. Not all
models use negative prompts. Some older
models like this one still use it, but you will see later that some newer models are smarter and do not need a negative prompt and they work better when the negative prompt is disabled.
Can you guess how these are connected?
We have clip on both input and output.
So, we can only connect clip to clip. If
we try to drag from the model, you can see it does not work. And the same if we try from the VAE. So, let us connect the clip output from the model to both of
the text encoders. Now we have the instructions for how the image should look but we still need to define the size. Let us search again using the word empty and add the empty latent image node. There is also a newer one that we will use later for newer models but for this workflow we will use this simple one. I like to change the color of this node to purple but you can leave it as it is if you want. Now we have width and height. Because we work with computers, most models work better with values that are multiples of 64 or 8. That is why we see values like 512 instead of 500. I know that this model was trained with square images at 512x512 pixels. So I
use these values to get better results.
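Snapping an arbitrary size to the nearest friendly multiple is simple arithmetic. This tiny helper is my own sketch, not a ComfyUI function, but it shows why 500 becomes 512 when working in multiples of 64:

```python
def snap(value, multiple=64):
    """Round a dimension to the nearest multiple, e.g. 500 -> 512 for 64."""
    return max(multiple, round(value / multiple) * multiple)

print(snap(500))      # 512: 500 / 64 rounds to 8, and 8 * 64 = 512
print(snap(700, 64))  # 704: 700 / 64 rounds to 11, and 11 * 64 = 704
```

The max() guard just keeps a very small input from collapsing to zero; the rest is plain rounding.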
Some newer models are trained with larger images and can generate bigger images. But that comes at a cost. Just like printing a big photo costs more than a small one, a bigger image takes more time to generate and sometimes your PC cannot handle it. More about that later. Now, let us add the most important part where the magic happens, the K sampler. As you can see, this node has four inputs where it takes all the
instructions and one output. First, we
connect the model since it has the same color and name. Then, we connect the conditions. The instructions are yellow.
Even if the names are different, we connect the positive output to positive and the negative output to negative.
That is how it knows which one is positive and which one is negative even though they come from the same type of node. The last input is the empty latent image which defines the size of the image we want. Now we have everything needed to generate the image but it is
still in latent format. We need pixels to actually see it. So let us drag a link from the output and you can see that it suggests VAE decode. We select
it and now the image is decoded like a dark room where the photo is developed.
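To give a feel for what the hidden latent actually is, size-wise: for Stable Diffusion style models the latent is roughly the image shrunk by a factor of 8 in each dimension, with 4 channels. The factor and channel count below are the commonly cited SD 1.5 values, used here only as an illustration:

```python
def latent_shape(width, height, channels=4, downscale=8):
    """Shape of the hidden latent for a given image size (SD 1.5 style)."""
    return (channels, height // downscale, width // downscale)

# A 512x512 empty latent image is really a small 4x64x64 block of numbers.
# Working on this compressed form is why diffusion in latent space is fast,
# and the VAE decode step expands it back into visible 512x512 pixels.
print(latent_shape(512, 512))  # (4, 64, 64)
```

So the K sampler never touches full-resolution pixels; it works on this compressed block, and VAE decode is the step that blows it back up into an image we can see.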
Here we also have a VAE input. In this
case the VAE model is included inside the main model which is why we can connect it directly. In some cases the VAE comes as a separate file and then we
use a load VAE node. You will see that later. Now the last step is to save the image. So we add the save image node.
Let us run the workflow and see if it works or if we forgot something. If
everything turns green, it worked without errors. There are cases where the image does not look right. Even if
there are no errors, that usually means some settings are not ideal. People who
create AI models usually provide recommended settings, especially for the K sampler. Just like in photography, macro and landscape use different camera settings. The same idea applies here. If
we look at the previous workflow, we can see recommended settings for this model.
Steps 35, CFG 7, this sampler, and this scheduler.
So let us change steps to 35, CFG to 7.
For sampler, we use DPM++ 2M and for scheduler we use Karras. Now let us run it again.
For this seed, we get some small deformations, but for the next seed, it looks fine. We will see later how to improve the results even more. Let me
show you what happens when we try to generate an image that is much bigger than what the model was trained to handle. For this example, I will double the image size. On the first try, I did not even get a pet. Sometimes you might get something that looks okay, but most
of the time you will see problems. You can get strange deformations, things that do not make sense, or visible mutations. If I increase the size even more, these problems become even more obvious. It also takes more processing power and more time to generate the image. The reason this happens is because this model was trained mainly on 512x512 pixel images. When we ask it to generate a much larger image, it
struggles to understand the full space.
You can think of it like the model trying to generate the image in parts.
One part might look okay, but then it tries to continue the image next to it, almost like stitching pieces together, and that is where things break. That is
why you sometimes see double heads, repeated objects, or strange structures in large images. Bigger images are not always better if the model was not trained for that size. But if a model is
trained to handle larger images, you can get more details and better results. Let
us say I add ugly to the negative prompt. So we push the result toward more beautiful images. For the positive prompt, let us be more specific. We want
a dog and we want it to be beautiful.
Now when we run it, we get a more beautiful dog. Because this model is really old, like I told you, it is good for practice. But today, we have much bigger and more accurate models. They produce better results with fewer deformations, but they are larger and need more VRAM to run properly. Our
desktop computers are very similar to Comfy UI because they are both built around the idea of connecting specialized components together where each one does a specific job. The CPU
acts like the central processor just like the sampler or the model does the main work in Comfy UI. The monitor is like preview and output nodes that show results. The keyboard and mouse are inputs just like prompts and parameters.
Printers and speakers are output devices like save image or audio nodes. Routers
handle communication similar to data links between nodes. The reason we design systems this way is because breaking complex tasks into smaller
connected parts makes them easier to understand, easier to control, easier to upgrade, and more flexible. That is
exactly why Comfy UI uses nodes instead of hiding everything behind a single button. Now that we know how to create a workflow, we also need to learn how to save it. If you look at the top, you can see it says unsaved workflow. That means
none of these settings or nodes are stored yet. If you want to reuse the
stored yet. If you want to reuse the same workflow later without recreating everything from scratch, you need to save it. If I click on this arrow next
save it. If I click on this arrow next to the workflow name, you can see there are several save options. Personally, I
prefer using the main menu. So, I go to file and here we have save, save as, and export. When you click save and the
export. When you click save and the workflow has never been saved before, Comfy UI will ask you to give it a name and choose where to save it. If the
workflow was already saved and you just made changes, clicking save will overwrite the existing file with the same name. Save as lets you save the
same name. Save as lets you save the same workflow under a different name.
This is the option I use the most, especially when I want to create variations of a workflow. Export is very useful because it is not limited to the Comfy UI workflow folder. It allows you
to save the workflow anywhere on your computer, even outside the Comfy UI folder. The API option is mainly used
folder. The API option is mainly used when working with online or cloud-based workflows. So, we will not use it here.
workflows. So, we will not use it here.
So, let us click export. Now, it asks for a name. Choose a name that makes sense to you. Click confirm. Then,
choose where to save it. For example, I can save it on my desktop. You can see that the file is saved with the JSON extension. This JSON file contains all the nodes, connections, and settings of your workflow. This file is your workflow, and you can open it anytime, share it with others, or modify it later. JSON files are simple text files. You can open them with any text editor like Notepad. JSON stands for JavaScript Object Notation, and it is just a structured way of writing text so both humans and computers can read it. In Comfy UI, the JSON file stores things like node types, connections, parameters, and settings, all written as text. That is why workflow files are small in size and easy to share. They do not include images or models, only instructions. If we go to workflows, you
instructions. If we go to workflows, you can see I have that folder with workflows saved there. You can do that, too.
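By the way, since a workflow file is just JSON, you can sketch what one looks like in a few lines of Python. The keys and node names below are purely illustrative, not the real ComfyUI schema, which differs between versions:

```python
import json

# A tiny, hand-made example in the spirit of a ComfyUI workflow file.
# Every key and node type here is an illustrative assumption.
workflow = {
    "nodes": [
        {"id": 1, "type": "CheckpointLoaderSimple", "widgets_values": ["model.safetensors"]},
        {"id": 2, "type": "CLIPTextEncode", "widgets_values": ["a beautiful dog"]},
        {"id": 3, "type": "KSampler", "widgets_values": [42, "fixed", 35]},
    ],
    "links": [[1, 1, 0, 2, 0], [2, 2, 0, 3, 1]],  # connections between node slots
}

# Saving and loading is plain text, which is why workflows are tiny and shareable.
text = json.dumps(workflow, indent=2)
loaded = json.loads(text)
print([node["type"] for node in loaded["nodes"]])
```

This is why a workflow file stays a few kilobytes: it records instructions, never the models or images themselves.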
Go to the menu, go to File, choose Save As, give it a name, and confirm. Now if we go to workflows, we can see the workflow is saved there. Right now it is not organized into any folder; it is just in the main list. But you can add a folder name in front of the workflow name when you save it. For example, folder name, then a forward slash, then the workflow name. Let us see where it is saved. Go to your Comfy UI folder, then inside the Comfy UI folder go to user, then default, then workflows. Here you can see your saved workflow and also the folder I created for this course that comes with the easy installer. You can create your own folder manually. For example, I can create a folder called my workflows, then drag that workflow into it. Now, if we go back to Comfy UI, nothing changes immediately because Comfy UI usually reads this when it starts. But we can refresh using this refresh button. Now our folder appears there and we can see the workflow inside it. I suggest organizing your workflows like this because over time you will have a lot of workflows and it becomes hard to keep track of everything. By the way, you can also use the search bar to search for a workflow by name. We also have a bookmark icon. If we click it, the workflow is added to the bookmarks at the top, so the ones you use the most stay there. If you click the bookmark again, it is removed from the favorites list. Let us collapse this and I will show you one more thing. If we go to the desktop and open the shortcut for the output folder, or if we go directly to the output folder, you can see all the
images generated so far with Comfy UI.
The last one is this dog. You probably
did not think about this yet, but if you open an image generated with Comfy UI in Notepad, you can actually see some code at the beginning. Just like with workflows, this happens because Comfy UI
attaches the workflow to the image when it saves it. After that, there is the image data which we cannot really read.
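That embedded text is ordinary PNG metadata, and you do not need ComfyUI to peek at it. As a rough illustration of how text metadata lives inside a PNG, here is a standard-library sketch that builds a fake PNG carrying a workflow and reads it back (real files may store it in slightly different chunk types, so treat this as illustrative):

```python
import struct
import zlib

def png_chunk(ctype: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: 4-byte length, 4-byte type, data, CRC over type+data."""
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", zlib.crc32(ctype + data))

def read_text_chunks(png: bytes) -> dict:
    """Walk the chunk list and collect tEXt entries (keyword, NUL byte, text)."""
    assert png[:8] == b"\x89PNG\r\n\x1a\n"  # fixed PNG signature
    out, pos = {}, 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = data.partition(b"\x00")
            out[key.decode()] = value.decode()
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return out

# Fake a minimal PNG that carries a workflow the way ComfyUI does: as a text chunk.
demo = (b"\x89PNG\r\n\x1a\n"
        + png_chunk(b"tEXt", b"workflow\x00{\"nodes\": []}")
        + png_chunk(b"IEND", b""))
print(read_text_chunks(demo))
```

The image pixels live in other chunks, which is why the workflow text sits readably at the start while the rest looks like gibberish in Notepad.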
This means that every image has the full workflow embedded in it, including all the settings and prompts. Let me drag this image onto the Comfy UI canvas so you can see what happens. Now you can
see that it loads as a workflow with the file name. If we generate again, we get exactly the same image because it uses the same seed and settings. Let us go back to the output folder and drag a different image, for example, this robot. Now it loads that workflow, and if we run it, we get the exact same robot. This is very useful. Another thing you might notice is that all images start with the word ComfyUI followed by a number. This happens because in the save image node, the prefix is set to that value. We can change it. For example, I can set it to pixa, and now when I run the workflow, the image file name will start with that word followed by a number. As you can see here, if you hover over the prefix field, you can get more information about how to format it. You can include things like the date and other values in the file name. Now, let us change it again. I will add a folder name, for example, my images, then a forward slash, then the image prefix. When we run the workflow now, the images will be saved inside that folder. Let us go to the output folder. You can see we now have a folder called my images, and inside it we have the images that start with the prefix we set, followed by a number. Now, I will go back to the workflows folder that we created earlier and delete it.
Back in Comfy UI, if we refresh the workflows list, you can see that the folder is gone. We are left only with the getting started folder we used for this episode. When you create your own folders and organize your workflows, I suggest naming them in a way that makes sense. You can name them by base model, like SDXL workflows or Flux workflows, or by function, like text to image workflows, inpainting workflows, or video workflows. Choose whatever makes the most sense to you, but organizing your workflows early will save you a lot of time later.
In this chapter, I want to show you how Comfy UI is organized on your disk. This is important because sooner or later, you will need to know where to place models, images, workflows, and custom nodes. Do not worry if this looks overwhelming at first. You do not need to understand everything right now. I will focus only on the folders you actually need as a user. This is the main Comfy UI easy install folder. Think of this as the main workspace that contains everything Comfy UI needs to run. The most important things here are the Comfy UI folder, the Python embedded folder, the add-ons folder, and the batch files used to start or update Comfy UI. In normal usage, you will mostly work inside the Comfy UI folder. If you have a different version of Comfy UI, you will not have the add-ons folder and some of the BAT files will be named differently, but pretty much everything else should be similar. When we open the Comfy UI folder, we see many files and folders. Most of these are internal files used by Comfy UI itself. As a beginner, you do not need to touch most of these. The important folders for us are models, input, output, custom nodes, and the user folder. The models folder is where all AI models live. This includes checkpoints, LoRA files, VAEs, ControlNets, upscalers, and more.
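The type-to-subfolder organization can be sketched as a small lookup table. The folder names below are assumptions based on a common ComfyUI models layout; check your own models folder for the exact spelling:

```python
import os

# Assumed mapping from model type to subfolder inside ComfyUI/models.
# These names are illustrative; verify them against your own installation.
MODEL_FOLDERS = {
    "checkpoint": "checkpoints",
    "lora": "loras",
    "vae": "vae",
    "controlnet": "controlnet",
    "upscaler": "upscale_models",
}

def destination(models_root: str, model_type: str, filename: str) -> str:
    """Return where a downloaded model file should be placed."""
    return os.path.join(models_root, MODEL_FOLDERS[model_type], filename)

print(destination("ComfyUI/models", "lora", "my_style.safetensors"))
```

The point is simply that every model type has one expected home; a file dropped into the wrong subfolder is invisible to Comfy UI.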
Inside the models folder, everything is organized by type: for example, checkpoints or diffusion models for the main image generation models, loras for LoRA files, vae for VAE files, and controlnet for ControlNet models. When a workflow tells you to download a model, it will also tell you exactly which subfolder to place it in. If a model is not placed in the correct folder, Comfy UI will not see it. The input folder is where you place images that you want to load into Comfy UI, for example, images used for image to image, ControlNet masks, or reference images. Any image you place here will be visible inside Comfy UI when using a load image node. The output folder is where Comfy UI saves generated images by default. Every time you use a save image node, the result will appear here. This makes it very easy to find all your generated images in one place.
The custom nodes folder is where all custom nodes are installed. These are extra features added by the community. Each folder here represents a custom node package. For example, we already used the rgthree node, and we will use more later. When you install nodes using the manager, they usually end up here automatically. If a custom node is missing or broken, this is usually the first folder you should check. Inside the user folder, we have user-specific data. The most important part for us is the workflows folder. This is where Comfy UI stores workflows that you save from inside the interface. These workflow files are saved as JSON files. The add-ons folder is specific to the easy install version. It contains extra tools, optimizations, and helper scripts. You usually do not need to touch this folder unless a tutorial specifically mentions it. You do not need to memorize this right now, but this structure might change as new tools are created by Ivo. For example, this BAT file lets you link a folder with models from another Comfy UI installation. This one installs the Nunchaku node, and this one installs Sage Attention. There are also different Torch package versions for more advanced users who need a specific version for certain custom nodes. You will also find extra tools, like one for Windows 10 that enables long paths so Comfy UI can download models even if the path is very long. There is also an update folder with BAT files, but as you will see later, for the easy install version I recommend using different update BAT files. The Python embedded folder includes a self-contained Python installation. This helps avoid conflicts with other software and makes Comfy UI easier to run and update. As you use Comfy UI more, this folder structure will start to make sense naturally. In
the next chapters, I will always tell you exactly where things need to go. Let
us talk a little bit about updates and custom nodes. What you are seeing here is the Comfy UI easy install folder. This setup already includes everything needed to update Comfy UI safely. The most important rule is this: always close Comfy UI before updating. Never update while it is running in the browser. Start Comfy UI BAT: this only launches Comfy UI; it does not update anything. Update Comfy UI BAT: this updates the core Comfy UI code. Use this when you want the latest features or fixes. Update Comfy UI and Nodes BAT: this updates Comfy UI and all installed custom nodes. Update easy install BAT: this updates the easy install system itself. When should you update? Update when something is broken. Update when a node requires a newer version. Update when you want new features. Do not update right before an important project; updates can sometimes break workflows. If something breaks after an update, you can usually fix it by updating again or removing the last custom node you installed. One important reminder: Comfy UI moves fast. Stability comes from not updating every single day. If everything works, it is okay to stay on your current version. At some point, you will mess up Comfy UI. Maybe a node breaks, or some dependencies get messed up, or an update has bugs. But remember, you can always do a fresh install when that happens. Just create a new folder and reinstall using the easy installer. Let us double-click on update easy install. This updates only the easy installer and adds extra tools and add-ons. As we move forward in this series, more models will appear, new nodes will be added, and Ivo likes to create scripts that make these installations easier. When you see that the installation is complete, you can read more about the new release using this link, or press any key to exit. You
may not see any changes immediately, but if we go to the add-ons folder, you can see that we now have more BAT files than we had in the first chapter. Now, let us go back to the main folder and try to
update Comfy UI to see if everything still works or if we break some nodes.
Nodes sometimes break after an update because Comfy UI itself changes how things work internally. Many custom
nodes are made by independent developers, not by the Comfy UI team.
These custom nodes often rely on specific Comfy UI behavior, internal APIs, or extra Python libraries and dependencies. When Comfy UI updates,
those assumptions can change and the node stops working until its creator updates it to match the new version. So,
Comfy UI started after the update, but let us open the command window to see if everything worked correctly. Usually,
after startup, you can see import times for custom nodes. As you saw before, all custom nodes are inside the custom nodes folder. But look what happened here.
After the update, one of the installed custom nodes, TeaCache, failed to import. That means if you have a workflow that uses that node, it will not work. If you do not use that node, you can ignore it and try updating Comfy UI again in a few days to see if it gets fixed. I will close Comfy UI now and try something else. Sometimes there are newer versions of the custom nodes, and if the author fixed the issue, updating the nodes can fix the problem. So this BAT file updates only Comfy UI, and this one updates both Comfy UI and the custom nodes. This process can take a while because it updates all the nodes. So I
will speed it up. Comfy UI started. So
let us check the command window to see if the issue was fixed. The node was still not fixed. This means that at the time I recorded this video, the update from that day broke that node. When you
watch this video, it might already be fixed and work for you. Either because
Comfy UI fixed a bug, the node creator patched the node, or a new developer created a replacement node. There is one more thing I want to try. We can go back to an older version of Comfy UI that did
work, a version that had the right conditions for that node. The downside
is that if Comfy UI released new features or nodes for newer models, those might not work on the older version. So, it is always a compromise. You have to choose between keeping a specific custom node working or using the latest Comfy UI updates. Ivo hid this option so beginners do not accidentally mess up their Comfy UI. Let us go to the add-ons folder, then to tools, and here we have the version switcher. When we run this BAT file, Comfy UI is downgraded to a previous version. In my case, it went from version 0.7 back to version 0.6. If you run this script again, it upgrades Comfy UI back to the latest master branch. Let us press any key to close this. Now that we are on an older version, it is time to check if that node works. Let us start Comfy UI, wait for the interface to load, then open the command window and check the custom nodes. Now it is fixed and there are no errors with the nodes. In a few days I will try updating again to see if it gets fixed in the newer version. But
this is basically how you update and downgrade Comfy UI using the easy installer. Other Comfy UI versions might require you to run commands manually, but I keep pushing Ivo to create BAT scripts for these tasks. I want to spend my time generating, not typing lines of code. You will see that we have the manager here. In other versions, you might find it somewhere else in the menu. Let us open the manager and see what we have here. We also have Update and Update Comfy UI. These are similar to the BAT files, but the BAT files have something extra: they take into account some dependencies needed for certain custom nodes to work, which Comfy UI itself does not handle when updating. For example, for the Nunchaku node to work, it needs specific dependencies, like a certain version of a library. The BAT file updates Comfy UI, but then adjusts or downgrades those dependencies to the versions required by the custom nodes we use. Ivo tries to maintain these BAT files and keep them updated so they stay compatible with the versions needed to run the workflows shown in these video tutorials. Because I am using the easy installer, I did not touch these update buttons inside the manager; I only use the BAT files. If you have a different version of Comfy UI, you will need to use these update options or use a BAT file from the update folder instead. In the manager, you can also find the latest Comfy UI news, such as what was fixed, what is new, and recent changes. At the bottom, you can see the Comfy UI version and the manager version. Most of the time the manager is used for managing custom nodes. If we go to the custom nodes manager, we can see all the available custom nodes created by different developers. There are a lot of them. I personally try to keep the number of installed nodes to a minimum and install only what is essential or what I use most often. Some people install hundreds of nodes, but the more nodes you install, the harder it becomes to keep everything compatible, because each node can have its own dependencies and requirements. If I filter by installed, you will usually not see many nodes here besides the manager itself. However, I
asked Ivo to include a few essential nodes that I use most often. One example
is the rgthree custom node, which includes the image comparer node that is very useful for comparing images side by side. Each custom node has a title and a version number. You can switch versions if needed, for example, when an older workflow only works with a specific version of a node. For each node, you also have several actions available: update only that node, switch the version, temporarily disable it, or uninstall it. You can also see how many individual nodes are included in that custom node package, along with a short description. Some nodes mention possible conflicts with other nodes. If you click on that yellow warning text, you can read more details about those conflicts. These conflicts usually matter only if you use both conflicting nodes in the same workflow. You can also see the author of the node and the number of stars it has on GitHub. Stars are given by users and usually indicate how popular or trusted a project is. Some developers are well-known and consistently release high-quality nodes. That said, there have been cases in the past where certain nodes had security issues, so it is still a good idea to be careful. You can also see when the node was last updated. To switch versions, you click the version selector, choose a version from the list, click select, and
then follow the steps shown. We will not do that right now. As you remember, every custom node that gets installed ends up in the custom nodes folder. Here
you can see all the custom nodes that come with the easy install version at the time of this recording. Now, let us install one node as a test just to see
how the process works. Open the manager.
Go to custom nodes manager and search for a node called align. We will use this as a test because it does not require special dependencies. So in
theory it should not affect Comfy UI too much. Each node entry has a title. If
you click on it, it opens the GitHub page for that node. On GitHub, you can see the code, because every custom node is basically Python code and supporting files. You can also check the issues tab, where users report problems and sometimes solutions are discussed. If you scroll down, you usually find important information like required Comfy UI versions, Python versions, or other dependencies. These are the dependencies I mentioned earlier, things the developer relied on when creating the node. You also see installation instructions, either through the manager, which we are doing now, or manually using commands like git clone, which simply copies the code into the custom nodes folder. Before installing any custom node, it is a good habit to read this information. Some nodes require things your system might not have, and then they will not work. Now let us install this node. Click the install button. You will be asked to choose a version, so select the latest version. The button changes and installation begins. When it finishes, you will see a restart button. Comfy UI needs to restart for the node to become available. Click restart and confirm. Comfy UI shuts down. You will see the browser trying to reconnect while Comfy UI is restarting. After a few moments, you get a confirmation message. Click confirm. The node is now installed. If you go back to the manager, open custom nodes manager, and search for the Align node, you will see that it now shows an uninstall button. If installation had failed, you would see an import failed message instead. If
you look inside the custom nodes folder on disk, you will now see a new folder for this node. It is simply the same code you saw on GitHub copied locally.
This code is what adds new nodes to the Comfy UI interface. If you deleted this folder manually, that would also uninstall the node. However, let us
uninstall it properly using the manager.
Go back to the manager, click uninstall and confirm again. You will be asked to restart Comfy UI. Confirm. Wait for the
restart and then confirm the browser reload.
Now, go back to the custom nodes manager and search for the align node again. You
will see the install button again which means the node is no longer installed.
If you check the custom nodes folder, you will also see that the folder for this node has been removed. This is the basic workflow for installing, updating, and uninstalling custom nodes using the
manager. Sometimes when you download a
manager. Sometimes when you download a workflow from other people on the internet, you will have missing nodes because they used custom nodes that you do not have installed. When you do not
know what nodes they used, you can use the install missing custom nodes button.
This will give you a list of missing nodes and the option to install them.
That said, I personally prefer to install nodes manually so I have full control over what gets installed. That
is why I usually include a note node in my workflows explaining exactly which custom nodes are required. Now let us look at templates. If we open templates,
we can see different workflows created by the Comfy UI team. If we filter by image generation workflows and select something like a Z image turbo text to
image workflow, Comfy UI will first tell us that we have missing models. These
are the AI models required for the workflow to generate images. Usually, it
tells you exactly which folder the model needs to go into and gives you the model name along with a download link or a download button. In this example, you
can see it needs a VAE model and a few other models. Once you download those models and place them in the correct folders, the workflow should work, assuming you have enough VRAM to run it.
In this case, there are no missing nodes. So, let us close this. Now, let
us go to menu, then file, then open, and open a workflow that I know uses missing custom nodes. You will see a message saying the workflow uses custom nodes that are not installed. At first, you might not see any red nodes on the canvas. That is because this workflow uses subgraphs. Subgraphs are basically nodes that contain other nodes inside them. If you have experience with Photoshop, you can think of them like smart objects. When you see an icon with a square and an arrow, you can click it to enter the subgraph. Once inside, you can see the red node that is missing. If we now open the manager and click install missing custom nodes, Comfy UI detects that node and offers to install it. For many nodes, this works perfectly. However, some nodes, like Nunchaku, require additional dependencies and extra setup. We will talk about those in a future episode. The important thing to know is that for many workflows, install missing custom nodes can quickly fix the problem. Let us close this for now. If we open the manager again, you will also see a models manager. This lets you browse and download models by type. Personally, I
rarely use this because a model without a workflow is not very useful. In my
tutorials and on my Discord server, every workflow comes with notes explaining exactly which models you need and where to put them. The Comfy UI templates also clearly list required
models and folders. So, let us do a quick recap.
Use update Comfy UI.bat to update only Comfy UI. Use update Comfy UI and nodes.bat to update Comfy UI and all custom nodes. Use update easy install.bat to update the easy install system and helper scripts. The update folder exists for users with other Comfy UI versions. The add-ons folder only exists in the easy install version. Inside add-ons, the tools folder includes the version switcher, which lets you downgrade or upgrade Comfy UI if needed. This is useful when a new update breaks a node you rely on. Inside the Comfy UI folder, the custom nodes folder contains all installed custom nodes. If you delete a folder from here, you uninstall that node. Sometimes, if a node fails to install correctly, deleting its folder and reinstalling can fix the issue. I
know this is a lot of information. Do
not worry if it does not all stick right away. Practice, experiment, and come
back to this tutorial in a month. You will be surprised how many things suddenly make sense that you missed the first time. Regarding the TeaCache node, after a few days, Comfy UI was updated again and the problem was still not fixed. There is now version 0.8, and even if you downgrade to version 0.7, it is still not fixed. Comfy UI keeps adding updates, and at some point some custom nodes will stop working. If that node is not important for you, you can delete it or uninstall it. You can also just disable it from the manager, or drag the TeaCache folder into the disabled folder so it is disabled. You can move it back out of the disabled folder anytime you want to try it again. In this chapter, I will try to simplify this complex world of diffusion and AI a little. Do not worry if you do not understand everything that is happening. Like I said before, you do not have to be a mechanic and know all the engine parts to know how to drive a car. This is the core idea behind diffusion image generation. The model
does not draw an image all at once. It
starts from pure random noise. This
noise looks like static on a television.
The model then runs a sequence of small refinement steps. At each step, a small amount of noise is removed. Early steps reveal very rough shapes. Later steps reveal clearer forms. Final steps add fine details and texture. Image generation is therefore a gradual process. It goes from noise, to less noise, to recognizable shapes, and finally to a finished image. This slide is a simplified visualization. The real process is more complex. In practice, most diffusion models work in a compressed latent space rather than directly on pixels. A neural network predicts what noise should be removed at each step. Even though the real math is more advanced, this simplified view is enough to understand how diffusion works. It's like sculpting: you start with a rough block and remove material until the shape appears. Or like a foggy window clearing up step by step. You
don't instantly get a sharp scene. It
resolves gradually. Let's open comfy UI.
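As a side note, the gradual refinement described above can be sketched as a toy loop in plain Python. This is only an illustration of the shape of the process, not the real diffusion math: the "image" here is just a list of numbers, and the "denoiser" simply shrinks the remaining noise a little at every step.

```python
import random

# Toy illustration of reverse diffusion: start from pure noise and
# remove a fraction of the remaining noise at every step.
# This is NOT the real math, just the shape of the process.
random.seed(0)
steps = 35
image = [random.uniform(-1.0, 1.0) for _ in range(8)]  # pure noise

for step in range(steps):
    remaining = steps - step
    # each step removes 1/remaining of the noise still left,
    # so the final step removes everything that remains
    image = [value * (1.0 - 1.0 / remaining) for value in image]

# after the last step the "noise" is fully removed (all zeros here);
# in a real model, what remains would be the generated image
```

In a real sampler, the amount removed at each step and the direction of the update come from the neural network's noise prediction, not from a fixed formula like this.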
Go to workflows and, from the getting started folder, pick workflow 1, which is the basic text to image example. Even if we cannot fully see what is happening inside the KSampler step by step, we can still get a good idea of the overall process. Remember, what we see here is a simplified representation of what is actually happening under the hood. First we want a fixed seed. We will see later that each seed starts with different noise. Right now we are using 35 steps, which is enough for this model to produce a clear image like this robot. If we change the steps to one, you can see that the model does not have enough time to remove the noise, so the image is very unclear with these settings. If we add another step, the change is subtle. Adding another one, you can start to see something forming. By step four, we can almost see a face. We can automate this process to see the changes faster. Double click on the canvas and add a primitive node. Like you saw in an earlier chapter, we can adapt this node for different fields. Drag a connection from the primitive node and connect it to steps. Now we have control over the steps, including what happens after each generation. Instead of fixed or random, choose increment. After each run, the value increases by one. So now we have five steps. If we run it again, we get six steps and the image starts to change more. As more steps are added, more noise is removed and the image becomes clearer. Next to the run button, there is a small down arrow. From here, select run instant. This means we can click run once and it will keep running until we stop it. You can see the workflow now runs automatically. On each run, more steps are added and the image keeps refining. You may also notice that as the number of steps increases, it becomes harder for the computer. Just like climbing many stairs, more steps mean more effort, so generation becomes slower and slower. Soon we reach around 35 steps, which is recommended for this model to get a nice clear image.
Although some results already look good around 20 steps. Now we want to stop this. Click the arrow again and switch back to run. After the current generation finishes, it will stop. There is also another way to see a small preview of what is happening inside the KSampler. From the menu, you can go to settings, but it is faster to access the settings from here. In the settings search bar, type preview. You will see an option called live preview method. By default, it does not show anything, but if we set it to auto, we can see a small preview during generation. Let's delete the primitive node, then change the seed to random. Now when we run the workflow, we can see a small preview of what the image might look like before it even finishes generating. Let us change the steps to 30 and run again. You can now quickly see what is happening in the diffusion process. Even though this preview is low resolution, you can clearly see how the image becomes more and more defined as noise is removed. Now, let me try something more drastic. I will use a very large image size. On some computers, this might crash ComfyUI or take a very long time to generate. I will run it again with these settings. You can see that generation is now very slow, but the preview lets us observe how the image slowly starts to appear. This is a bit too slow, so I will cancel the generation here. Instead, I will try a slightly smaller image, still larger than what the model is comfortable with, just so we can see the preview updating more slowly. Now we can clearly see the diffusion process updating every few seconds. The speed of this preview is also influenced by the sampler and the scheduler. As you may remember, models are trained on specific image sizes. If a model was not trained on large images, it treats them more like multiple smaller images stitched together. For example, our Juggernaut model was trained on 512 pixel images only.
Personally, I prefer not to keep the live preview enabled all the time because it can slightly slow down generation. So I will go back to settings and set the preview option back to default. I will also reset the image width and height. You may notice that the preview is still visible. This can happen because something remains in memory. To fix this, I will press F5 to refresh the browser. Keep in mind that refreshing the browser will reload only the current workflow. If you had other workflows open and did not save them, they will be lost. Now everything is back to normal without the preview.
There are still more useful things to learn. This slide explains how a diffusion model is trained. This is not image generation yet. During training, the model is shown millions of images paired with text descriptions. For example, images of cats, people, objects, lighting styles, and environments. The training process uses something called forward diffusion. Forward diffusion means gradually adding noise to a clean image. At first, only a small amount of noise is added. Then more noise is added step by step. Eventually, the image becomes almost pure noise. At each step, the model is trained to predict what noise was added. In other words, it learns how images break down as noise increases. By repeating this process across millions of images, the model learns patterns. It learns what shapes look like. It learns what objects look like. It learns how lighting and structure behave. The goal of training is not to memorize images. The goal is to learn how to reverse this process later. Training a diffusion model requires massive datasets and powerful hardware. In ComfyUI, we are only using the result of that training.
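As a rough sketch in plain Python, with made-up numbers rather than real training code, forward diffusion looks like this:

```python
import random

# Forward diffusion sketch: gradually mix noise into a "clean image"
# (here just four numbers) until almost nothing of it remains.
random.seed(0)
clean = [0.5, -0.2, 0.8, 0.1]

noisy = list(clean)
snapshots = []
for step in range(10):
    # keep 80% of the current values and blend in 20% fresh noise
    noisy = [0.8 * v + 0.2 * random.uniform(-1.0, 1.0) for v in noisy]
    snapshots.append(list(noisy))

# During training, the model sees snapshots like these and learns to
# predict the noise that was added, so it can later reverse the process.
```

The real process uses a carefully designed noise schedule rather than a fixed 80/20 blend, but the idea is the same: clean image in, progressively noisier snapshots out.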
Now that the model has learned how noise works during training, we can use that knowledge in reverse to generate images. This slide shows the difference between training and image generation. During training, the model starts with a clean image. Noise is added step by step until the image becomes pure noise. This is called forward diffusion. This process teaches the model how images break down when noise is added. During generation, the process is reversed. We start from pure random noise. The model removes noise step by step to create an image. It is important to understand this clearly: during generation, we do not add noise like in training. We only remove noise, using what the model learned before. This slide explains an important concept that is often misunderstood. The model does not store images in memory. During training, the model never saves photos that it has seen. Instead, it learns patterns and relationships.
It learns what shapes look like. It learns what objects look like. It learns how parts of an image relate to each other. For example, it learns that faces usually have eyes in a certain position. It learns that animals have specific structures. It learns how lighting, shadows, and perspective usually behave. All of this knowledge is stored as probabilities inside the model, not as pictures, but as learned rules. You can think of it like learning a language. You do not memorize every sentence you read. You learn grammar and structure. The model works the same way. It learns visual grammar, not individual images. When the model generates an image, it is not copying anything it has seen before. It is using learned patterns to guide the noise removal process. That is why results can look familiar but are still new images. This is why changing the prompt changes the result. The prompt activates different learned patterns inside the model. That is also why the same model can generate many different images even though it was trained only once. So far we talked about diffusion in a simplified way, as if it happens directly on images. In reality, most modern diffusion models do not work directly on pixel images. Instead, they work in something called latent space.
Pixel space is the image as we normally see it. It is made of pixels with width, height, and color values. Latent space is a compressed representation of that image. It keeps the important structure and information but removes unnecessary detail. You can think of latent space as a simplified version of the image that is easier for the model to work with. To move between pixel space and latent space, the model uses a VAE. VAE stands for variational autoencoder. The VAE has two main jobs. First, it encodes a pixel image into latent space. Second, it decodes a latent image back into pixels. During image generation, diffusion happens in latent space. After the denoising process is finished, the VAE decodes the result back into a visible image. Working in latent space makes diffusion much faster. It also uses less memory and less computing power. This is why models like Stable Diffusion can run on consumer graphics cards. Without latent space, image generation would be much slower and more expensive. In ComfyUI, this is why we see nodes like VAE Encode and VAE Decode. When we generate images from text, the model works in latent space, and VAE Decode converts the result into pixels we can see and save. This also explains why image resolution and VAE selection can affect results.
Now we look at how text prompts influence image generation. The prompt does not act only once at the beginning. During diffusion, the prompt is used at every denoising step. At each step, the model checks whether the image is moving closer to what the text describes. You can think of the prompt as guidance. It gently nudges the image in the right direction while noise is being removed. This happens repeatedly, step by step, until the final image is formed. CFG stands for classifier-free guidance. CFG controls how strongly the prompt influences the denoising process. With a low CFG value, the model follows the prompt loosely and allows more randomness. With a high CFG value, the model follows the prompt more strictly and forces the image to match the text more closely. Here is a quick example. You can find CFG here in the KSampler. Too low a CFG can produce images that ignore the prompt. Too high a CFG can produce images that look unnatural or oversharpened. CFG is like telling the model how strict it should be about your instructions. The prompt does not generate the image by itself. The prompt only guides the noise removal process. The image is still created by diffusion in latent space. As you can see with CFG 1, the cat is still a cat, but it is not red like we asked. With CFG 7, the result is much closer to the prompt. That said, this also depends on the model we are using. Smarter or better trained models tend to follow the prompt more accurately. In fact, there are some models where we intentionally use a fixed CFG value of one, which effectively ignores the negative prompt. However, pushing CFG too high can damage the image. It can introduce artifacts or make the result look unnatural. Because of that, we always try to find a balance. The goal is to use settings that give us the quality we want in the shortest amount of time without hurting the final image.
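The usual formula behind classifier-free guidance is simple enough to sketch. At each step the model makes two noise predictions, one conditioned on the prompt and one not, and CFG blends them. The numbers below are toy values for illustration only:

```python
# Classifier-free guidance sketch: blend the unconditional and the
# prompt-conditioned noise predictions.
def apply_cfg(uncond_pred, cond_pred, cfg_scale):
    # cfg_scale = 1.0 returns exactly the conditional prediction,
    # which is why a fixed CFG of one effectively ignores the
    # negative (unconditional) prompt; higher values push harder
    # toward the prompt and can overshoot.
    return [u + cfg_scale * (c - u) for u, c in zip(uncond_pred, cond_pred)]

uncond = [0.1, 0.4]  # toy noise prediction without the prompt
cond = [0.3, 0.2]    # toy noise prediction with the prompt

low = apply_cfg(uncond, cond, 1.0)   # follows the conditional prediction exactly
high = apply_cfg(uncond, cond, 7.0)  # exaggerates the difference between the two
```

Pushing the scale very high exaggerates the difference so much that values shoot past their normal range, which is the numerical version of the oversharpened, unnatural look we saw.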
Now we talk about seeds. Seeds are very important for understanding consistency and variation. A seed defines the starting noise used to generate an image. You can think of it as the initial random pattern the model starts from. When diffusion begins, the model always starts from noise. The seed decides exactly what that noise looks like. If you use the same prompt, the same settings, and the same seed, you will get the same image every time. If you change the seed, you change the starting noise, and the final image will be different. The prompt guides the process, but the seed decides the starting point. Different starting noise leads to different results, even when everything else stays the same. You can think of the seed like rolling a die before starting. If you roll the same number, you start from the same situation. If you roll a different number, the outcome changes. This is a simplified explanation. The seed controls a random number generator used internally by the model. You do not need to understand the math behind it. You only need to know that seeds control repeatability. Let us put it into practice. The seed is this number here. It can start from zero and go up to a very large number, so each seed can produce a slightly different result. If you also change the prompt and settings, you can get millions of different results. We can control the seed behavior. If we set it to fixed, we generate once and the result will never change. To generate something new, we need to change other settings. If we choose increment, after each generation the seed number will increase by one. If we choose decrement, after each generation the seed number will decrease by one. So let us change it to fixed and set the seed to 10. When I generate, I get this robot. Now let us change the seed to 15. You can see that I get a different robot this time, in profile. If I change the seed back to 10, I get the previous robot again, because we used the same prompt, the same settings, and the same seed.
In prompts, the order of the words matters. With this prompt, I got this image because house was first, so the model focused on the house and mostly ignored the car. With newer models, this happens less often. But this is an older model, so the effect is more noticeable. Now, look at what happens if I put car first and then house. This time, we clearly get both a car and a house. Words that appear earlier in the prompt usually have more influence than words that come later. You can think of the prompt as a list of priorities. The model pays more attention to the beginning and gradually less attention as it moves toward the end. On top of that, some words can carry more weight, either because of how the model was trained or because we explicitly give them extra emphasis. Because of this, two prompts with the same words but in a different order can produce noticeably different results. Think of the prompt like giving directions to someone. If you say a red cat sitting on a chair in a room with soft lighting, the most important idea is red cat. Everything after that adds detail, but the core idea comes first. We can also add more weight to a word by using round brackets. Right now, house has more weight, so the model pushes the car into the background and it is no longer the main focus. If I add even more brackets, the influence of house becomes even stronger and now the car disappears completely. If I instead add more weight to the word blue, you will see more blue appear in the generation.
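The bracket behavior can be sketched numerically. A common convention in ComfyUI-style prompts is that each pair of round brackets multiplies a word's weight by about 1.1, and the explicit form (word:1.5) sets the weight directly. The helper below only illustrates the nesting rule, not the real prompt parser:

```python
# Each level of round brackets multiplies the word's weight by ~1.1.
def nested_weight(depth, base=1.1):
    return round(base ** depth, 3)

print(nested_weight(1))  # (house)     -> 1.1
print(nested_weight(2))  # ((house))   -> 1.21
print(nested_weight(3))  # (((house))) -> 1.331
```

So every extra pair of brackets gives a modest boost, which is why stacking several of them can make one word dominate the whole generation.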
One more thing you might notice is that there is no spell check by default. Sometimes it can be useful to turn it on. To do that, go to settings, search for spell, and enable text area widget spell check. Now, words that are misspelled or not part of the dictionary will be underlined.
Now, we talk about denoising steps. Steps control how many refinement passes the model performs during generation. Each step removes a small amount of noise. The image is not created in one action. It is refined little by little, step by step. When you increase the number of steps, the model has more chances to clean up noise and add detail. When you decrease the number of steps, the process is faster, but the image can look rough or incomplete. More steps means slower generation and more refinement. Fewer steps means faster generation and less refinement. There is always a balance between speed and quality. You can think of steps like polishing an object. More polishing gives a smoother result. Less polishing is faster but rougher. In ComfyUI, steps are set inside the KSampler node. For most models, a good starting range is between 20 and 30 steps. Going much higher often gives diminishing returns. Going much lower is useful for fast previews. Steps work together with the seed and the prompt. The seed decides the starting noise. The prompt guides the direction. Steps decide how far the refinement goes. Now we are ready to look at a real workflow in ComfyUI.
This is called text to image, often shortened to txt2img. Text to image means we start from pure noise and generate an image only from text instructions. There is no input image involved. This is usually the first workflow people learn, and it is the best way to explore ideas and styles from scratch. We start by loading a model. This model contains everything the AI learned during training. Next, we give the model instructions using a text prompt. This describes what we want to see in the image. We also define the image size using an empty latent image. This decides the resolution before the image is generated. Then the KSampler runs the diffusion process. This is where noise is removed step by step, guided by the prompt. After that, the VAE decodes the latent result into a visible image. Finally, the image is saved to disk. Use text to image when you want to explore new ideas, you want to test prompts and styles, or you are starting from nothing. This workflow is ideal for concept art and experimentation. But we can also start from an image, not just from pure noise. In that case, instead of beginning with random noise, we use an existing image as the starting point and apply denoise on top of it. You can think of denoise as how much freedom the model has to change the image. With a low denoise, the model stays very close to the original image. With a higher denoise, it moves further away and behaves more like text to image. So rather than generating everything from scratch, we are guiding the diffusion process using an image as the base and then controlling how much it changes using the denoise value. Image to image is like starting with a rough sketch and deciding how much you want to redraw it. You can see that in the text to image workflow we have the Empty Latent Image node. That node generates the noise. In
this workflow, we have an image that is encoded to latent so it can go to the KSampler. Let me show you how I did it. I removed the Empty Latent Image node. Then I double-clicked on the canvas and added a Load Image node. From here we can load an image, and I will choose this robot. Now you can see it does not have a latent output, so we cannot connect it to the KSampler yet. So we need a VAE. If we look, we have decode and encode. We already have VAE Decode, which converts from latent to pixels. Now we want to encode. An easy way to find the right node is to drag a link and release it, and you will see a suggestion for VAE Encode. Now we have a latent output, which means we can connect it to the KSampler, which is what we want. If we try to run it like this, something is missing. It says missing VAE. You can see a big red outline around the node with the problem and a small circle around the input, which means we need a connection there. So let us connect it to the VAE. In this case, the VAE is included in the main model, so we connect it from there. Now we encode it and then we decode it. Let us run again, and now it works. But the result is still different from my input. We have the right prompt, but something is influencing it. Remember this: every time you use an image as input, we need to adjust the denoise, because that controls how much the image changes. With the default value of one, it is at the maximum, so it changes the image too much. Let us change it to 0.2 and see how that affects it. Now, you can see it is very similar to the original. It is hard to tell what parts changed. Let us increase it to 0.5. Now, we can see more changes in the robot face. There is an easy way to compare these images. Double click on the canvas and search for Image Comparer. This is part of the rgthree node pack. You can see it has two inputs, image A and image B. I want to compare the original image, so I will connect the Load Image output to image A. For the second image, remember the Save Image node is only for saving to disk. The image we want to compare is the one coming out of VAE Decode, so we connect that to image B. Now let us run the workflow. We get this small preview. Let me make it larger. It is still too small, so I will move some nodes to make space so you can see it better. By default, it shows image A, the original. When we move the mouse over to the right, it shows the second image. Now it is much easier to compare before and after. If I change the denoise to 0.1, we get a very similar result because the amount of denoise is small. If I change it to 0.9, we get a big variation. All of this is also influenced by the sampler, the scheduler, and the model itself. But in general, this is how it works. I prefer to start with values around 0.7.
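A common way to think about what denoise does under the hood is that it decides how much of the step schedule actually runs. This is a simplification, not ComfyUI's exact implementation, but it is a useful mental model:

```python
# Denoise sketch: denoise scales how much of the step schedule is used.
# denoise = 1.0 runs every step from pure noise (like text to image);
# lower values skip the early steps and start from the encoded input.
def steps_executed(total_steps, denoise):
    return round(total_steps * denoise)

print(steps_executed(30, 1.0))  # 30 -> full run, mostly ignores the input
print(steps_executed(30, 0.7))  # 21 -> a good starting point to adjust from
print(steps_executed(30, 0.2))  # 6  -> stays very close to the input
```

This is why a low denoise barely touches the image: only the last few refinement steps run, and they can only make small changes.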
If that is too much, I reduce it to 0.5 and keep adjusting until I like the result. Another thing you should know is that the input image size influences the result size. Since we do not have an Empty Latent Image node where we set width and height, the loaded image decides the size. ComfyUI will also round the size to a multiple of 8. For example, if your image is 511 pixels, it is rounded down to the nearest multiple of 8, which is 504.
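That rounding is easy to sketch:

```python
# Snap a size down to the nearest multiple of 8, since latent images
# work on an 8-pixel grid.
def snap_to_multiple_of_8(size):
    return (size // 8) * 8

print(snap_to_multiple_of_8(511))  # 504
print(snap_to_multiple_of_8(512))  # 512 (already a multiple of 8)
```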
You can also control the input size by resizing or cropping it, like you saw in the earlier chapters. For example, I can add an Upscale Image node here, then redo the connections so the image passes through it. I can upscale to a bigger size with the same ratio. Now when I run it, the final image should be larger because it follows the input image size.
Now we are going to talk about samplers and schedulers, which you can find here in the KSampler. This is one of the most confusing parts at first, but the idea is actually simple. Everything begins with the same initial noise. The seed defines that noise, but once the noise exists, two different systems control what happens next. The sampler decides how noise is removed. It defines the strategy the model uses to go from noisy to clean. Different samplers use different mathematical paths to denoise. Some remove noise more directly. Some refine the image gradually. Some are more random and creative. Some are more stable and precise. Even with the same prompt, the same seed, and the same number of steps, changing the sampler can change the final image. So the key idea is this: the sampler controls how each denoising step is calculated. Or in simple terms, sampler equals how noise is removed. The scheduler does not change how denoising works. It changes when denoising happens during the steps. A linear scheduler spreads denoising evenly across all steps. Each step removes roughly the same amount of noise. A nonlinear scheduler removes noise faster at the beginning and slower near the end. This allows fast structure early and fine detail later. Both approaches can reach a clean image, but they feel different in how detail is introduced. So the key idea here is: the scheduler controls when noise is removed, or simply, scheduler equals when noise is removed. Sampler and scheduler always work together. You never choose one without the other. The sampler chooses the denoising method. The scheduler chooses the timing of that denoising. The same noise plus a different sampler or a different scheduler can produce different results.
Let us do a little experiment in ComfyUI. From workflows, I open this text to image workflow again and I change the seed to fixed. Then I run the workflow. With this sampler and scheduler, we get this robot. Here we have a lot of samplers and schedulers. Depending on the model we use, some work better than others. Let us say I pick the Euler sampler. Now when I run it, even if the seed and prompt are the same, the result is slightly different, because the sampler influences how the denoising is applied. Let us say I also change the scheduler to simple. Now the result will again be different, because the scheduler changes when the denoising happens during the steps. Because the model we use is quite small, we can actually preview multiple results at the same time. So I hold the Ctrl key and drag over these three nodes. Then I use Ctrl + C to copy them and Ctrl + Shift + V to paste them with the links connected. Now this workflow will generate two images and has two KSampler nodes. Let me use Ctrl + Shift + V again to get a third one. Now this workflow uses the same seed and prompt with three different KSampler nodes. And I want to change the samplers and schedulers for each one. You can play with these all day and try many combinations. I will choose something random for this example. Now when I run it, you can see it generates an image for each sampler. Some results are quite similar, but some details are different. For example, parts of the robot may change from one image to another. Let me now put the same sampler on all of them and use different schedulers only, so we can see how the timing of denoising influences the result.
Again, the differences are subtle, but they are there. Sometimes this can mean one image has five fingers and another has six. So having options is useful, especially when you want small variations. Now let us double click on the canvas and add a primitive node. I want to control the steps value for all three KSamplers, but I do not want to change it manually on each one. So I drag a connection from the primitive node to the steps input of the first KSampler, then do the same for the second and the third one. Now from this single node I can control all three. If I change steps to one, you can see we get very similar results. If I change steps to three, you can already see differences. Some schedulers are faster. For example, with one, the image is still very noisy, while with another, you can already see a shape forming. If I change to four steps, the differences become more visible. At five steps, some start to form clearer shapes. At six steps, some images already show eyes and a main structure. At eight steps, the middle one is almost fully formed. At 10 steps, almost all of them have something that could work for certain concepts. And at 20 steps, most of them have enough detail to be usable in a project.
Usually, the people who create AI models suggest specific samplers and schedulers, or the community tests them and shares which ones work best. This way, you do not have to test everything yourself for every model. But if you do find good settings, it is always a good idea to share them with the community so everyone can improve their image generation results.

Let's talk a little about subgraphs in ComfyUI. Go to workflows and open the Juggernaut text to image workflow. Here you can see a bunch of nodes. Just like before, hold the control key and drag to select most of the nodes except the export node, which in this case is the save image node. Now that the nodes are selected, look at the icons at the top. One of them says convert selection to subgraph. When you click it, all those selected nodes are combined into a single node. If you right click on this new node, you will see an option called unpack subgraph. When you click it, the nodes go back to how they were before. Let's do it again. Select two or more nodes, then use the subgraph button to create a subgraph. Resize it so it is easier to see.

A subgraph is a way to group multiple nodes into a single reusable block. Instead of showing a long chain of nodes every time, you collapse them into one node that represents an entire process. You can think of a subgraph like a function or a macro. Inside it there can be many nodes, but from the outside it looks simple. It is very similar to smart objects in Photoshop, which can contain multiple layers inside a single object.

Subgraphs solve three main problems. First, they reduce visual clutter. Large workflows can become messy very quickly, and subgraphs help keep things readable. Second, they help reuse logic. If you repeat the same setup many times, like a prompt encoding chain or an image pre-processing step, you can reuse it instead of rebuilding it every time. Third, they make workflows easier to explain and share. People understand a few clean blocks much faster than dozens of individual nodes.

At the time I recorded this tutorial, subgraphs were still being improved and may still have some bugs. A subgraph does not make a workflow faster by itself. It is about organization, not performance. Performance depends on the nodes inside the subgraph, not on the subgraph wrapper. A subgraph is like putting many Lego pieces into one box and labeling the box with what it does. All the pieces are still there. You just do not need to see them all the time.
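The function/macro analogy can be sketched in plain Python: several internal "nodes" hidden behind one call, with only the exposed widgets as parameters. The node and parameter names here are invented for illustration, not ComfyUI internals:

```python
def text_to_image(model: str, positive: str, negative: str,
                  steps: int = 20, seed: int = 0) -> dict:
    """Acts like a subgraph: many internal steps live inside,
    but from the outside only a few parameters are visible."""
    checkpoint = {"name": model}            # load checkpoint "node"
    cond_pos = {"clip_text": positive}      # positive prompt "node"
    cond_neg = {"clip_text": negative}      # negative prompt "node"
    latent = {"width": 512, "height": 512}  # empty latent "node"
    # KSampler "node": combines everything into one result
    return {"model": checkpoint, "pos": cond_pos, "neg": cond_neg,
            "latent": latent, "steps": steps, "seed": seed}

# From the outside, it looks like a single node with a few widgets:
result = text_to_image("juggernaut", "a robot", "blurry", steps=8)
print(result["steps"])
```

Just as with a subgraph, callers never see the internal wiring; they only interact with the exposed inputs.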
You can see the title says new subgraph. Let's double click on that and rename it to something that makes sense, like text to image, and maybe also include the model name. So, Juggernaut text to image. Now it looks like a simple workflow with only two nodes. I do not like the order in which things appear in the node, so let's right click on the node and select edit subgraph widgets. Here you can choose which parameters to show in that node and which to hide. Let me hide all of them so we have a clean subgraph that does not show any parameters. You can enable them one by one later if you want only the ones you need. But we will build those manually so you understand them better. Let's close this panel.

Now let's go inside the subgraph. You can see that all the nodes are there, plus some input and output. On top you can see a new tab next to the workflow name. If I click on the main workflow name, that is how we exit the subgraph. From there we can go back inside, and from inside we can go back outside. You can see the output where the image is saved. If we go inside the subgraph, that image output appears here as a link. From this dot we can drag a connection to where it says checkpoint name. Now that field becomes gray, just like when we added a primitive node before. If we go back outside, you can see that the checkpoint name appears here. Let's go back inside again, double click on that name, and rename it to model to see what happens. Now, when we go back to the main workflow, you can see it says model instead of checkpoint name. So, this is very customizable.

Let's go back inside and drag another connection, this time to the positive prompt, and rename it so we know what it is. Do the same for the negative prompt. Now, when we go back outside, we have positive and negative prompt visible. Go back inside again and drag connections to width and height, and maybe do the same for all the parameters from the KSampler. Now we can see all those parameters exposed here. And when we go back outside, we have this single node that acts like a mini interface that can control everything we need.

You might say that it looks nice, but does it actually work? Let's try it. And the answer is yes, it works. If you right click on it, you can see it still has other options like node color, bypass, and so on. With that subgraph selected, right click on the canvas this time, and you will see an option called save selected as template. It asks for a name. I will name it Juggernaut text to image. Then press enter or confirm. It
looks like nothing happened, but where was that template saved? Let's open a new workflow. Now right click on the canvas and go to node templates. You can see that name there now. And you also have the option to manage templates and remove them. When I select that template, it is added to the canvas with all the nodes, connections, and settings it has inside. Now we can just drag a link from the image output and add a save image node, or connect it to other nodes to create more complex workflows. Over time, this simplifies workflows because we can organize them into pieces and group them by category or function. Let's go back to the first workflow just to show you that any node or combination of nodes can be saved as a template. There are cases where some connections can break when some nodes are inside a subgraph and others are outside, so keep that in mind. For example, I use this Pixaroma note node a lot. I want to save it as a template so I can access it easily next time. This might not be useful for everyone, but as a workflow and tutorial creator, I use this a lot. I will save it as a template and give it a name. Now I can go to any other workflow and quickly access that template from anywhere. You can also have subgraphs inside other subgraphs, like boxes inside boxes.
You can disconnect or remove links at any time. I could select two nodes here and combine them into another subgraph, or go outside and combine all these nodes, even if some are simple nodes and one is already a subgraph, and it will still let me create a new subgraph. If we go inside, all those nodes are there. If we go back outside, we can unpack it using the icon, or right click and choose unpack subgraph. These things will make more sense as you work with them in practice. So play with them and have fun. When you see that icon on a node, you know it is a subgraph. It also has the icon that lets you go inside the subgraph, which is another indicator that it is not a simple node. Remember that you can also use the interface to edit subgraph widgets. One thing I forgot to show is that you can use those dots to rearrange the order of the parameters shown in the subgraph node. This way you do not need to go inside it. Most of the time you can control things directly from the outside. Now we
are going to talk about LoRAs. LoRA stands for low-rank adaptation. In simple terms, a LoRA is a small add-on that modifies how a base model behaves. A LoRA does not replace the model. It works together with the model. You can think of the base model as the main photographer we hired earlier. A LoRA is like giving that photographer extra experience in a specific style or subject.

Why do LoRAs exist? Training a full model is very expensive. It requires a lot of images, time, and powerful hardware. LoRAs exist to solve this problem. Instead of retraining a full model, we train a small adapter that teaches the model something new. This could be a specific art style, a character, a face, a pose style, or a lighting style. LoRAs are much smaller than full models. That is why they are easy to download and experiment with. So remember, a LoRA does not work by itself. It always needs a base model and a compatible architecture. For example, a Stable Diffusion 1.5 LoRA needs a Stable Diffusion 1.5 model. An SDXL LoRA needs an SDXL model, and so on. If you mix incompatible models and LoRAs, the results will be broken or random.
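The "low-rank adaptation" idea can be sketched with NumPy: instead of replacing a big weight matrix, a LoRA adds a small low-rank update on top of it. This is a conceptual sketch with made-up sizes, not real training or inference code:

```python
import numpy as np

rng = np.random.default_rng(0)

# One "base model" weight matrix (real models have many, much larger ones).
d = 512
W = rng.standard_normal((d, d))

# A LoRA stores two thin matrices A and B with a small rank r.
r = 8
A = rng.standard_normal((r, d)) * 0.01
B = rng.standard_normal((d, r)) * 0.01

strength = 0.8  # the strength widget scales the LoRA's contribution

# Applying the LoRA: W' = W + strength * (B @ A).
# The base weights themselves are never overwritten.
W_adapted = W + strength * (B @ A)

# The LoRA needs far fewer numbers than the matrix it modifies:
print(W.size, A.size + B.size)  # 262144 vs 8192
```

This is also why a LoRA must match its base model: the update only makes sense added onto the exact weight shapes it was trained against.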
Let's open ComfyUI to test it, because what is theory without practice, right? Open workflow 3, the one that has LoRA in the name. As you can see, the workflow is very similar to what we had before. That is one of the reasons I am using this older model instead of a newer one. It is easier to learn the basics first, and then we can make things more complex as we move forward. Compared to the first text to image workflow we used earlier, we now have this LoRA loader node that loads a LoRA model. In our photographer analogy, this means the photographer took some classes on how to take photos of cakes and is now specialized in that subject.

Let's look at the note node first. We need to download the LoRA model. Remember, the workflow comes with nodes and settings, but since it is just a text file, it cannot include the actual models. We have to download those separately and place them in the correct folder. In this case, we are using a LoRA called cake style. It is a small model trained on images of cakes, so it understands cakes better than the base model alone. A few years ago, when Stable Diffusion 1.5 models first appeared, they could not handle many subjects very well, and LoRAs were often used to fix those limitations. So, we need to download this LoRA and place it inside the loras folder. Click where it says here. Then we need to place that file in the loras folder. Go to your ComfyUI folder, open the models folder, and then find the loras folder. If we place it directly here, it will work perfectly. But this time, I want to keep things organized. I want to create a folder that tells me which base model this LoRA is compatible with. So, I will create a folder called SD15. This way I know it works with that model and I do not mix it with others. Save the LoRA inside that folder. If you look at the file now, you can see that the LoRA is much smaller than the base model. All LoRAs should go into this folder, and it is best to organize them by base model name like SDXL, Flux, Qwen, and so on, just like we did with checkpoints in an earlier chapter.
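You would normally create these folders in Explorer, but the same layout can be sketched in a few lines of Python. The install path here is an assumption; use your own ComfyUI location:

```python
import os

# Assumed ComfyUI install location; replace with your own path.
base = os.path.join("ComfyUI", "models", "loras")

# One subfolder per base model family, so compatibility stays obvious
# and SD 1.5 LoRAs never get mixed with SDXL or Flux ones.
for family in ("SD15", "SDXL", "Flux", "Qwen"):
    os.makedirs(os.path.join(base, family), exist_ok=True)

# A downloaded LoRA file then goes into the folder matching its base
# model, e.g. ComfyUI/models/loras/SD15/cake_style.safetensors

print(sorted(os.listdir(base)))
```

The folder names are only a convention for yourself; ComfyUI scans the whole loras folder either way.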
Now go back to ComfyUI. We have everything we need to run this workflow, but because ComfyUI was already open when we downloaded the model, it cannot see it yet. We need to press the R key to refresh the node definitions. Now it appears in the list and we can select it. You can also see a note here with a trigger word. I like to add these notes so I remember them. Many LoRAs are trained using specific trigger words. These are words the LoRA learned during training. If you do not include the trigger word in the prompt, the LoRA may have little or no effect. Some LoRAs work without trigger words, but many require them. Always read the LoRA description from the place where you downloaded it. If we look at the positive prompt, we first added the trigger words so we do not forget them. It is not required to be first. It can also be placed after a few words, but I like to put it first. Then we have the prompt for a robot. I did it this way so we can clearly see how the LoRA and a simple trigger word affect the result.
Now if we run this workflow, we get this robot cake. You might think this model could do that without a LoRA, but it depends on the prompt and the model. Let me change the seed to fixed so we can get a consistent result. So this is how it looks with the LoRA applied. Now what I want to do is run the workflow without the LoRA, without changing anything else. Same prompt, same settings, same seed, just disable the LoRA. To do that, I right click on this node and choose bypass. Now, when I run the workflow, the LoRA is bypassed, and you can see we get a normal robot instead of a cake robot. If I enable the node again and run it, you can clearly see the effect the LoRA has on the image.

Now that you see how it works, let's adapt a normal text to image workflow and add the LoRA ourselves for practice. Open workflow 1, the basic text to image workflow. Now I want to add the LoRA between the model and the KSampler. Double click on the canvas, search for LoRA, and add the node called LoRA loader model only. Let me resize it so the text is easier to see. I also like to color these nodes blue so I can spot them faster in big workflows, but that is optional. Now, we need the model connection to go through this node. If you look now, the model is connected directly to the KSampler, but we want the extra knowledge from the LoRA. Drag a connection from the model output to the LoRA loader, and then from the LoRA loader to the KSampler. The workflow is now complete. Let's set the seed to fixed so we can clearly see how different settings affect the result. It runs without errors, so everything is connected correctly. Even if a LoRA sometimes works without a trigger word, it is best to include it when one is provided. So let's add the trigger word cake style to the positive prompt. Now when I run it, we get a different result even though the seed is fixed. That shows the LoRA is doing its job. If I change the seed, we get another variation. To avoid forgetting trigger words, I like to add a note node. I write the trigger word there, change the note title so it is clear what it is for, and often change the color to match the LoRA nodes so I know they are related.
One important thing I have not mentioned yet is that you can use multiple LoRAs. If I want, I can clone this node by holding alt and dragging, then connect them one after another. You can stack several LoRAs this way. I personally do not use more than three or four at once. In this setup, the base model is combined with the first LoRA, then the second LoRA, and all that information goes into the KSampler. In the prompt, you add the trigger words for all the LoRAs you use. If I run this now, some strange things can happen. First, I used the same LoRA twice, which makes its effect too strong. Second, when using multiple LoRAs, it is usually a good idea to reduce their strength so they blend better instead of overpowering the image. If I lower the strength values, the result becomes much more stable and usable. If your result looks too weird, one of the first things to try is reducing the LoRA strength. Let me delete the extra LoRA and keep only one, then set its strength to one. Each LoRA has a strength value. This controls how strongly the LoRA affects the model. Low values give subtle influence. High values give strong influence. If the value is too high, images can break, faces can deform, and styles can become unstable. A good starting range is usually between 0.6 and 1.0. There is no universal best value. Each LoRA behaves differently.
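Stacking can be pictured as applying each LoRA's update in sequence, each scaled by its own strength. This is a conceptual sketch with made-up numbers, not ComfyUI's internal code:

```python
import numpy as np

d, r = 64, 4
rng = np.random.default_rng(1)
W = rng.standard_normal((d, d))  # base model weights

def apply_lora(weights, strength, seed):
    """Add one LoRA's low-rank update, scaled by its strength widget."""
    g = np.random.default_rng(seed)
    B = g.standard_normal((d, r)) * 0.05
    A = g.standard_normal((r, d)) * 0.05
    return weights + strength * (B @ A)

# Chaining loaders: base -> LoRA 1 -> LoRA 2 -> KSampler.
stacked_full = apply_lora(apply_lora(W, 1.0, seed=10), 1.0, seed=20)
stacked_soft = apply_lora(apply_lora(W, 0.6, seed=10), 0.6, seed=20)

# Lower strengths keep the result closer to the base model, which is
# why reducing them often stabilizes the image when LoRAs are stacked.
drift_full = np.abs(stacked_full - W).mean()
drift_soft = np.abs(stacked_soft - W).mean()
print(drift_full > drift_soft)  # True
```

Each extra LoRA pushes the weights further from the base model, so the combined drift is what you are tuning when you lower the strength sliders.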
Let's delete this LoRA loader node, and let me show you another node you can use, this time from a custom node. Search for Power Lora Loader. This one comes from the rgthree node pack. What is different compared to the previous one is that it has two inputs and two outputs. Because of that, we need to route both the model and the clip through this node. First, connect the model output to the Power Lora Loader model input. Then connect the clip output to the clip input on this node. After that, the clip output from the Power Lora Loader goes to both the positive and negative prompt nodes. Let me select all these nodes and move them a bit so you can see more clearly how both the model and the clip go through the LoRA loader. Now we can add the LoRA we want directly inside this node. We can also add multiple LoRAs here. You can see that I can add a second one and even a third one. If I right click on a LoRA entry, I can remove it. I can do the same for any of them. If we right click on a LoRA and choose show info, we get more details. There is also a button called fetch from Civitai. Civitai is a website that hosts models, and you will see it later. If the LoRA is public and available on Civitai, this will fetch useful information about it, including examples and trigger words. We also have toggle buttons here. We can toggle all LoRAs on or off, or toggle them individually. Let me add another LoRA so you can see how that works. This way we can load multiple LoRAs but enable only the ones we want at any moment.

After playing a bit with this LoRA, I found that a strength value around 0.55 to 0.6 works better for this specific one. I tried it with the trigger word, then added an orange goldfish, cute and adorable, and got this result. It is not bad for such a small model. For a second example, I tried a marzipan cake shaped like a woman and got this result. For a third one, I tested a marzipan castle. Again, this is just for practice. Later, you will see better models that can produce much higher quality images with fewer errors.

LoRAs are lightweight. They do not increase VRAM usage very much. Common beginner mistakes are using the wrong base model, using strength values that are too high, forgetting trigger words, and expecting a LoRA to fully replace a model. LoRAs enhance models. They do not replace them. Stacking many LoRAs can slow things down slightly, but not dramatically. For beginners, it is best to start with one LoRA at a time. The base model is the photographer. The LoRA is a specialty training course that photographer took. The photographer still uses the same camera. They just learned a new style.
Now that you understand diffusion, prompts, image to image, LoRAs, and workflows, we are ready to talk about ControlNet. ControlNet is one of the most powerful features you can use in ComfyUI. In simple terms, ControlNet lets you guide image generation using an extra image, not just text. Instead of saying what you want only with words, you can also show the model what you want.

What is ControlNet? ControlNet is an additional neural network that works alongside the main diffusion model. It does not replace the model. It does not replace the prompt. It adds extra control. The base model still does the image generation. The prompt still guides the style and subject. ControlNet adds structure and constraints. You can think of it like this. The prompt says what the image should look like. The seed decides the starting noise. The sampler and scheduler decide how noise is removed. ControlNet tells the model where things should go.

Why does ControlNet exist? Text prompts are powerful, but they are also vague. If you say a person standing, the model decides the pose. If you say a city street, the model decides the layout. If you say a face, the model decides the proportions. ControlNet exists for cases where you want more control. For example, you want a specific pose. You want a specific composition. You want to follow a sketch. You want to preserve the structure of an input image. ControlNet makes results more predictable and repeatable. This is a simplified explanation. In reality, ControlNet works by injecting additional conditioning into the diffusion process at every denoising step. But you do not need to understand the math for learning and practical use. This mental model is enough. ControlNet guides structure while diffusion fills in details. Let's
open ComfyUI. Go to workflows and select workflow number four, the one that has ControlNet in the name. The workflow is similar to the text to image workflow. It is still a text to image workflow, but it is guided by an image using ControlNet. You can quickly tell it is text to image because of the empty latent image node. I highlighted in yellow the nodes that we usually use for ControlNet. Let's go to the note node to see what we need. The checkpoint model was already downloaded earlier. We also need to download some ControlNet models and custom nodes. We need this specific custom node, which comes with the easy install version. But if you are using a different ComfyUI version, you need to install this node first. We have a Canny model, another one called depth, and another one called open pose. There are more types available, but these are the most popular and commonly used ones.

Let's download all three so we can test them. First, download the Canny model. Then, go to your ComfyUI folder. Open the models folder and think about where this model should go. If you guessed the controlnet folder, you are right. We place it there because different base models can have different ControlNet models. Just like with LoRAs, a ControlNet is only compatible with the base model it was trained for. So, let's organize them properly and create a folder so we know which base model these ControlNets are compatible with. Save the model in that folder. Next, download the depth model and save it in the same folder. Then download the open pose model and save it in the same folder as well. Wait for all downloads to finish. If ComfyUI was open while downloading, we need to refresh it. Press the R key to refresh the node definitions. Now we have everything we need to run this workflow.

In the workflow, we have an apply ControlNet node with some settings. We also have a node that loads the ControlNet model we downloaded, and a preprocessor node that converts the input image into a format that ControlNet understands and was trained on. Let's run the workflow. In this example, we are using the Canny model. We loaded a bunny sketch, and with the help of the preprocessor, it generates a Canny map, which is an image that detects the edges of the input image. With the prompt, we influence what we want to generate. And with apply ControlNet, the model interprets that Canny map and uses it to guide the generation to get this image. You can imagine that without ControlNet, it would be very hard to get something this complex using only a prompt, especially with the small model we are using today. Now, let's build this workflow ourselves so you understand it better. Open again the first workflow that you already know how to build, and we will adapt it to use ControlNet.
We know ControlNet comes before the KSampler, so let's move some nodes to make room for it. Double click on the canvas and search for apply ControlNet. Add the node and change its color to yellow so it is easy to recognize. Now let's connect the parts that are obvious first. Positive goes to positive, negative goes to negative. For the outputs, there is only one place where they make sense, so we connect those as well. At this point, the node still has missing inputs. One of them is the VAE. We already know where the VAE comes from. In our case, it is included in the checkpoint model, the same VAE we already used for encode and decode. The next missing input is the ControlNet model itself. Double click on the canvas, search for load ControlNet, add the node, and color it yellow as well. Now connect it to the apply ControlNet node. The last missing input is the image. Add a load image node. Then we can select an image. In this case, I will use a bunny sketch.

You might be tempted to connect this image directly to ControlNet, but that usually does not work. ControlNet expects a very specific type of image, because it was trained on that type of data. Our sketch is just a normal image. So, we need a preprocessor to convert it into something ControlNet understands. Double click on the canvas and search for AIO, which stands for all-in-one. Add the preprocessor node and color it yellow. Connect the load image node to the preprocessor, then connect the preprocessor to the apply ControlNet node. Right now, the preprocessor is set to none, so we need to choose one. Since we plan to use a Canny ControlNet model, select a Canny edge preprocessor.
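What an edge preprocessor does can be sketched with a tiny gradient-based edge detector. Real Canny also involves smoothing, non-maximum suppression, and hysteresis; this simplified stand-in only thresholds the gradient magnitude:

```python
import numpy as np

def simple_edge_map(gray: np.ndarray, threshold: float = 0.2) -> np.ndarray:
    """Very rough stand-in for a Canny preprocessor: mark pixels where
    brightness changes sharply. Input is a 2D array in [0, 1]; output
    is white (1.0) edges on a black (0.0) background."""
    gy, gx = np.gradient(gray.astype(float))
    magnitude = np.hypot(gx, gy)
    return (magnitude > threshold).astype(float)

# A toy "sketch": a dark square on a light background.
img = np.ones((8, 8))
img[2:6, 2:6] = 0.0

edges = simple_edge_map(img)
print(int(edges.sum()))  # nonzero: edges found along the square's border
```

The output is exactly the kind of white-on-black map we will see in the preview node: only the outlines survive, which is all the Canny ControlNet was trained to read.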
To better understand what is happening, add a preview image node after the preprocessor. This allows us to see the control image that is actually being sent to ControlNet. When we run the workflow, we get a Canny map, white edges on a black background. This shows exactly what ControlNet will use as guidance. You can also adjust the resolution here if you want more detail in the map.

Now, let's look at the result. It does not look very good yet. This happens often when working with ControlNet, and there are a few things to check. First, look at the prompt. We are still using a robot prompt, but the image is a bunny. So, let's change the prompt to something like a watercolor painting of a bunny. Next, reduce the ControlNet strength and the end percent slightly. Run again. The result is a bit better, but it still does not follow the sketch very well. If changing the seed does not help, the next thing to check is the ControlNet model itself. If you look at the load ControlNet node, you may notice that the selected model is not a Canny model, but a depth model. That is the problem. The preprocessor and the ControlNet model must match. Select the correct Canny ControlNet model. Now run the workflow again. The result is much better and follows the sketch closely. Let's try another
example. Load a 3D text image. Since this image has depth information, we can try a depth ControlNet instead. Change the ControlNet model to depth and update the prompt to something like golden text in snow. When you run it, you may notice the preview still looks like a Canny map. That means we forgot to change the preprocessor. Switch the preprocessor to a depth preprocessor. The first time you run a new preprocessor, ComfyUI may take longer because it downloads a small model automatically. This only happens once. If you get a long path error on Windows, close ComfyUI and run the long path enabler from the tools folder.

Now we see a depth map. Dark areas represent parts that are farther away. Lighter areas represent parts that are closer. ControlNet uses this information to understand spatial structure. The generated result now follows the depth and composition of the original image very closely. If for some reason you get an error saying the model is incomplete or something similar, you can close ComfyUI, go to the tools folder, and run the batch file called long path enabler. This should fix the long path issue and allow ComfyUI to download the model it needs even when the file path is longer. You can also try the same image with a Canny ControlNet. Switch both the model and the preprocessor back to Canny and run again. Even if some edges are missing, it can still guide the generation. Then try switching back to depth and compare results. Often, one will work better than the other depending on the image. Now, let's talk about the key ControlNet parameters.
ControlNet does not replace diffusion. It only guides it during certain parts of denoising. Strength controls how strongly ControlNet influences the image. Low values make ControlNet a soft suggestion and the model can drift away. High values strongly enforce structure and make the output closely follow the control image. Typical values are between 0.5 and 0.7 for natural results and 0.8 to 1 for strict structure matching. Start percent controls when ControlNet begins influencing the denoising process. A value of zero means ControlNet starts from the very first step, locking structure early. Higher values allow the model to form rough shapes first before ControlNet takes over. End percent controls when ControlNet stops influencing denoising. A value of one means ControlNet stays active until the end, locking structure even in fine details. Lower values allow ControlNet to stop earlier, letting the model finish on its own and add more style. In simple terms, strength is how hard ControlNet pulls. Start percent is when it starts pulling and end percent is when it lets go. That is why ControlNet is so powerful. You can guide structure without killing creativity.
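To make the three parameters concrete, here is a toy sketch in Python. This is not ComfyUI's actual sampler code, just an illustration of how strength, start percent, and end percent together decide whether ControlNet is pulling at a given denoising step:

```python
def controlnet_influence(step: int, total_steps: int,
                         strength: float = 0.6,
                         start_percent: float = 0.0,
                         end_percent: float = 1.0) -> float:
    """How hard ControlNet 'pulls' at a given denoising step.

    Outside the [start_percent, end_percent] window it contributes nothing;
    inside the window it contributes `strength`.
    """
    progress = step / total_steps  # 0.0 at the first step, near 1.0 at the last
    return strength if start_percent <= progress <= end_percent else 0.0

# End percent 0.6 means ControlNet lets go for the last 40% of the steps,
# leaving the model free to finish style and fine detail on its own.
print(controlnet_influence(step=2, total_steps=10, strength=0.8, end_percent=0.6))  # 0.8
print(controlnet_influence(step=9, total_steps=10, strength=0.8, end_percent=0.6))  # 0.0
```

So a setting like strength 0.8 with end percent 0.6 enforces composition early and then hands the last steps back to the model.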
It is time to test the pose ControlNet as well. But first, let us add a pose image as reference. Let us say I add this woman. Again, it is kind of hard to put into words the exact same pose. So this is a good use case for ControlNet. Let us change the prompt to something else. Maybe woman in a sumo yoga pose. Not sure what to call it. Then do not forget to change the model to open pose. I know it might seem complex at first, but newer models have union ControlNet models that include everything in one. So you only load one model, which makes things easier. That said, this is how we used to do it. And there are still cases where we need specific models. Then we can try either DW pose or open pose. Type pose and select this open pose. And then let us run it again. It will take a bit since this is the first time I use it. In fact, if you look at the command window, you can see it is downloading that model from Hugging Face. That is why it takes so long. After it finishes, it gives this pose image that looks like a skeleton with each color representing a bone. That is how it knows which side is right and left and so on.

So it captured the pose and now let us see the result. Holy sumo, what is this? Okay, let us adjust the prompt. Maybe a fit woman will help. That did not help much. I bet the word sumo has too much weight. Like we talked about before, some words have more power than others. So if I try without that word, I get a better result. Even the face is not so great. And that can happen with people in the distance. Usually with portraits, we get better faces. Newer models fixed most of that. Let me try to change the resolution to see if something changes. Now I get a better resolution for the skeleton. And the pose is okay. Just the face. That face does not invite me to do yoga. Okay. Let us try a different pose.
Something for a portrait. Let us say I use this portrait photo. Let us change the prompt to a businesswoman and run it to see what we get. The pose looks okay. Even if it is missing an arm, it should still work. The results are much better now that the face is closer. Let us try a warrior woman as well. That works well too.

So with ControlNet, you have to continuously search for balance. Make sure you select the right ControlNet model for the job. Then choose a pre-processor that matches the model. As you saw, it is easy to forget to change something. I usually play with strength and end percent. Also, do not forget ControlNet models made for SD 1.5 only work with SD 1.5 base models. If you use SDXL, you need SDXL ControlNet models. In a later episode, we will check some advanced models that do not even need ControlNet and can do everything from prompts. Beginner mistakes to avoid: using the wrong ControlNet model for the base model, forgetting to install or download ControlNet models, using very high strength values, expecting ControlNet to fix bad prompts, and using ControlNet when it is not needed. ControlNet is a tool, not a magic fix.

When you start using Comfy UI, you will notice there are many different model types: AIO models,
FP16, FP8, GGUF, and others. This can be confusing at first, but the reason is actually simple. At their core, all
diffusion models are just very large collections of numbers. Those numbers represent what the model has learned.
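Before looking at each format, a rough back-of-the-envelope illustration helps: the same collection of learned numbers takes very different amounts of memory depending on how many bytes each number uses. The parameter count below is made up for the example, not any real model's size:

```python
# Rough VRAM footprint of the same weights stored at different precisions.
params = 2_000_000_000  # an illustrative 2 billion learned numbers

bytes_per_value = {"FP32": 4, "FP16": 2, "FP8": 1}
for fmt, nbytes in bytes_per_value.items():
    gb = params * nbytes / 1024**3
    print(f"{fmt}: ~{gb:.1f} GB")
# FP32: ~7.5 GB, FP16: ~3.7 GB, FP8: ~1.9 GB -- same knowledge, less space.
# GGUF quantization goes further, averaging around 4 bits per value or less.
```

Halving the bytes per number halves the file and the memory needed to hold it, which is the whole story behind these format names.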
The knowledge itself does not change, but the way those numbers are stored can change. Different model formats exist to balance memory usage, speed, and hardware compatibility. Some formats are larger but more precise. Others are smaller and faster, but slightly less accurate. FP32 is the highest precision and is mostly used for training. It uses a lot of memory and is rarely used for image generation. FP16 is the most common format for stable diffusion. It offers a very good balance between image quality and VRAM usage. This is the safest and most recommended choice for most users. FP8 uses even less memory and can be faster on newer GPUs that support it. The trade-off is that it can sometimes reduce image stability or detail slightly. AIO stands for all-in-one. AIO models bundle the main model, VAE, and sometimes clip into a single file. They are designed to be easy to use and reduce setup mistakes. The downside is that they give you less flexibility if you want to swap components later. GGUF models come from the language model world. GGUF stands for GPT-Generated Unified Format. They are optimized for very low memory usage and can run on CPU or low VRAM systems.

It is important to understand that these formats do not make the model smarter or more creative. They do not change what the model knows. They only change how efficiently that knowledge is stored and processed. You can think of it like the same video saved in different resolutions. The content is the same, but the file size and playback requirements are different. For most users, FP16 models are the best starting point. AIO models are great for beginners. FP8 is useful if your GPU supports it. GGUF is best when memory is very limited. Once you understand this, choosing models becomes much easier.

You saw in the workflows that I include links to models, but you might wonder where I find those models, right? One of
the sites is Hugging Face, but it is not the most beginner-friendly one. At the top, we have a models tab. And here you can find a lot of models, but not all of them are diffusion models or used for generating images. Some are for video, some for audio, some for large language models, and many are not compatible with Comfy UI. Some require different interfaces to run or they are so large that you cannot even run them on your computer. For example, I can sort them by text to image. And here you can see some popular ones like Qwen, Z Image, or even the Flux model. If I click on one of these, you will see that some models require you to sign in and accept certain terms. Each model has a license. Some are open-source, some are free with conditions, and others are available only in certain countries. By default, you are on the model card. This is basically an info page about the model. On another tab, you have files and versions, where the files are usually available in different formats like in this example. And there are a few more files inside those folders. Let us go back to the homepage. Here you can also search for a model if you know the name or browse popular ones like Z Image. Always check the tabs at the top to find more information about the models since they can be quite large and you want to make sure you can actually run them on your system. I know this is a lot of information, but as I said before, I usually include the model link directly in the workflow, so you do not have to stress about it. Still, it is good to understand how models work and where they come from.

Another site that is more beginner-friendly and better organized is the Civitai website. The downside is that recently they removed access for some countries like the UK. If you are from one of those countries, you will need a VPN to access it and download models. If you click on the models tab, you can find all kinds of models for different interfaces like Comfy UI, Forge UI, and others. Most of them are compatible with Comfy UI. On
the right side, you have filters. These let you sort models by when they were added. You can also filter by model type. For example, checkpoints are the main AI models. You can also filter by LoRA or ControlNet since we talked about those model types in previous chapters. Of course, you can also filter by base model so you know what is compatible with your workflow. The first workflow we used was based on an SD 1.5 model, but I can also sort by other ones like the Flux Dev model or an older one like SDXL. By the way, SDXL is newer than SD 1.5 and Flux is even newer than SDXL. So, use these buttons to sort models. If you already know the name, you can just search for it. For example, I can search for Juggernaut. Here you can see multiple versions of that model based on different base models like SDXL or SD 1.5. If I click on SD 1.5, I will only see those versions.

If I click on the one that says Juggernaut, it opens the model info page. At the top, you can see different versions. We used the Reborn version, but you can try other versions as well. Below that, you have details about the model. It clearly says what type it is. It can be a checkpoint, a LoRA, or a checkpoint merge. In this case, it is a checkpoint merge, which means the main SD 1.5 model was mixed with other SD 1.5 models to combine the best parts of each one. It also clearly states that this is a base SD 1.5 model. You can see the publish date as well, which shows that it is quite old. At the top you have the download button and the file will go into the correct folder. In this case, it goes into the checkpoints folder. As I mentioned before, the author sometimes includes recommended settings. You can see them here. This is how I knew what settings to use in the K sampler for the workflow. At the top, you also have a gallery with images generated using that model. This helps you understand what the model is capable of. Some images also have an info button that shows the prompt and settings used to generate that image. So, explore Civitai if you have access to it and see what models and LoRAs are available. Once you are signed in, you also get more options to control what type of models are visible since some are disabled by default. So,
now that we played a little with that old SD 1.5 Juggernaut model, it is time to try a better, newer model to see how far AI has come in just 2 years. Let us go to workflows again and this time open the workflow named 5A, the one for Z Image Turbo with the all-in-one model. The workflow is quite similar to the others we tried. We just have two extra nodes this time. One of them is this conditioning node that we use instead of the one for the negative prompt. And the other one is this model sampling node. Since we are using a new model, we need to download it because we do not have it yet. The model is called Z Image Turbo.

Juggernaut and Z Image Turbo are very different types of models built with different goals in mind. Juggernaut is based on Stable Diffusion 1.5. It uses the classic diffusion architecture that has been used for years. The model file itself is relatively small, usually around 2 GB. Juggernaut was created by the community by fine-tuning and merging Stable Diffusion models. Z Image Turbo is a newer type of model created by the Tongyi team from Alibaba. It uses a more modern architecture designed to generate images more efficiently. Even though Z Image Turbo is much larger in file size, it is optimized to produce good results in very few steps. One important difference is how the models understand prompts.
Juggernaut relies on the classic clip text encoder. It understands prompts if they are short, but it often requires careful wording, sometimes keyword-like prompts. Z Image Turbo uses a more advanced text understanding system inspired by large language models. This allows it to understand prompts in a more semantic and natural way. Because of this, Z Image Turbo can often follow instructions better, even with shorter or more loosely written prompts. So, in simple terms, Juggernaut is smaller, very flexible, and highly compatible. Z Image Turbo is a larger, newer model, and smarter at understanding what you ask for.

So, we have here an all-in-one model. And there are two types, a smaller one, FP8, and a bigger one, BF16. It depends on your graphics card. If you can run the big one, use that one. For this first episode, I want to run it on a low VRAM card, so I will use the FP8 small version. All-in-one means it has everything it needs included, the clip and VAE model in this case, so we do not need to download those models separately. That is why it is easy to use for beginners. The models go into the checkpoints folder and there we can create a special folder for the Z Image model. Also, if you want to learn more about the model, I included an info link here. So click on it. Now we are on the Hugging Face page and you can learn more about this specific version, from workflows to different model versions. If we go to files, we can see different model versions that you can try depending on how good your graphics card is. So let us test the small version. Click here. Then go to Comfy UI, go to models, then checkpoints, and create a folder called Z Image so everything is more organized. Inside this folder, place the model. Since this is a big model, you need to wait for it to finish downloading. Because Comfy UI was open, you can see that it does not appear in the list yet, only Juggernaut. So I press the R key to refresh node definitions. And now we can see both models nicely organized in folders. First is Juggernaut and second is Z Image. So let us select the Z Image model. That is all for the model download.

And now we can run the workflow. The first time you run a workflow, it is slower because it needs to load the model. The second time you run it, it should be faster. For me, it took about 10 seconds because I have a lot of RAM and VRAM. The result looks pretty good compared to the robots we used to get with the SD 1.5 model. We have much nicer details. For the image size, I used a smaller size so it runs faster. This model was trained with bigger images, not like SD 1.5. So, we can even use larger sizes like 1,600 pixels if we want. Even if you go bigger than the size it was trained on, it does not produce many errors like SD 1.5 did. It just becomes a bit more diffused. Usually, for most newer models, a good place to start is around 1,024 pixels. So let us say I try a landscape image this time using these sizes. The result looks pretty good. I like it.
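A quick note on why sizes like 768, 1,024, and 1,600 keep coming up: diffusion models in this family work on a latent that is typically 8 times smaller than the image in each dimension, so widths and heights that divide cleanly by 8 are the safe choice. A minimal sketch of that relationship, assuming the usual downscale factor of 8:

```python
# SD-family VAEs typically compress the image by a factor of 8 per dimension,
# so the model actually denoises a much smaller latent grid.
def latent_size(width: int, height: int, factor: int = 8):
    assert width % factor == 0 and height % factor == 0, "pick multiples of 8"
    return width // factor, height // factor

print(latent_size(1024, 1024))  # -> (128, 128)
print(latent_size(1600, 896))   # -> (200, 112)
```

This is also why generation in latent space is so much faster than working on raw pixels: a 1,024 pixel image becomes a 128 by 128 grid internally.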
Let us go back to workflows again and open the first workflow to see what is different and how we can recreate the Z Image Turbo workflow. We already have the right node to load the model in this case, so I just select the model from the list. For the empty latent, this one is used more for older models with a different architecture. Many newer models use a different empty latent node. If we look at the nodes and search for empty, we have one empty latent and one empty SD3 latent. In this case, we want the one with SD3. On the surface, they look identical. It is just a different latent representation internally. If we make it purple, it looks like the previous one. If you do not have enough VRAM to run this, you can use sizes like 768 for width and height. I will use 1,024 pixels since it is the most popular size and my system can handle it. So let us delete the old empty latent and reconnect the new node. This model does not use a negative prompt, only a positive one, so I will remove the negative text. You can also collapse it if you want. That way you know not to add a negative prompt. Then we have the settings, which as you remember are different from model to model. If we look here, we only have five steps. So, it can generate with fewer steps, and the CFG is one. Let us change the steps to five and the CFG to one. When the CFG is one, it ignores the negative prompt. We also need a sampler and a scheduler. So, let us add the DPM++ SDE sampler.
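The point about CFG one ignoring the negative prompt falls straight out of the standard classifier-free guidance formula: the output blends the prediction with the prompt and the prediction with the negative, and at CFG one the negative branch cancels completely. A tiny numeric sketch with made-up numbers:

```python
import numpy as np

# Classifier-free guidance:  output = uncond + cfg * (cond - uncond)
cond = np.array([0.8, 0.2, 0.5])    # toy "prediction with the positive prompt"
uncond = np.array([0.1, 0.1, 0.1])  # toy "prediction with the negative prompt"

def guided(cfg: float) -> np.ndarray:
    return uncond + cfg * (cond - uncond)

# At cfg = 1 the uncond term cancels out entirely, so whatever you put in
# the negative prompt has exactly zero influence on the result.
assert np.allclose(guided(1.0), cond)
```

Higher CFG values push the output further away from the unconditional prediction, which is why they make the model follow the prompt more aggressively.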
And for the scheduler, we use beta. Let us see what else is missing. We have this extra node called model sampling AuraFlow. It has a long name. Not sure why it cannot be something simpler, but anyway, let us search for that node. We change the shift to three and we make the connection go through that node just like we did with LoRA. The model sampling AuraFlow node is a special node that modifies the model sampling behavior before it goes into the K sampler. It is designed to work with models that use the AuraFlow sampling method, which is an advanced sampling technique used by some modern models for better stability and quality. What this node does is apply a sampling adjustment or patch to the model itself, so the sampler works in the best way for that model. The node takes the current model and a shift value as inputs and outputs a modified version of the model with the AuraFlow sampling logic applied. The shift parameter controls how strong that sampling adjustment is. Changing the shift value can subtly affect contrast, sharpness, and how the generation behaves internally. So, we changed the empty latent to the SD3 version, we added a node to shift the model values, and we adjusted the settings to work better with the Z Image model that we loaded in our workflow. Let us run it and see if it works. As you can see, it works just fine and we get a nice robot.
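To give some intuition for what a shift of three does, here is the shape of a common flow-matching time shift. This is an illustration of the idea, not necessarily the exact formula ComfyUI's node implements internally:

```python
# One common flow-matching time-shift curve (an assumption for illustration):
# it bends the sampling schedule so the sampler spends more of its steps in
# the high-noise region, shaping large structures before fine details.
def time_shift(t: float, shift: float = 3.0) -> float:
    """Remap a sampling timestep t in [0, 1]. shift = 1 leaves it unchanged."""
    return shift * t / (1.0 + (shift - 1.0) * t)

print(time_shift(0.5, shift=1.0))  # 0.5  -- no change
print(time_shift(0.5, shift=3.0))  # 0.75 -- the midpoint is pushed later
```

With only five steps to work with, that kind of rebalancing is part of how turbo models still manage coherent compositions.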
If we look at the previous Z Image workflow, you can see that it does not use a negative prompt, but instead it has a conditioning zero out node. So let us go back to our workflow and search for that conditioning node. As I mentioned before, this model does not use a negative prompt. So you might wonder why we do not just delete the node. We could do that, but then we would have a missing input and the workflow would throw an error. To fix this, we use the conditioning zero out node. You can make space for it and place it between nodes if you want. This conditioning does not come from clip like the negative prompt did before. We connect it directly to the negative input on the K sampler. You can place it wherever you want to make the connections clearer, but I like to put it under the positive conditioning to save space.

The conditioning zero out node does exactly what its name suggests. It removes the influence of a conditioning input without breaking the workflow. In simple terms, it takes a conditioning signal, usually text conditioning, and replaces it with a neutral zeroed version, so the model still runs normally, but that conditioning contributes nothing to the generation. Why does this exist and when is it used? In diffusion models, the sampler always expects both positive and negative conditioning inputs. If you want to remove or disable one side, you cannot just unplug it. That would break the workflow. Conditioning zero out is a safe way to say use this conditioning but make it have no effect. So if we run the workflow, everything works fine without any errors. Now the good part about Z
image Turbo is that it is very good at realistic images, but it is also very good at understanding prompts. For example, if I want to create a portrait of a cat with a hat, I can easily get an image like this. But you can also create more complex prompts by using a large language model. Maybe you use ChatGPT, Gemini, or even a local LLM. I will use ChatGPT for this example. I ask it for a detailed photo prompt and give it the details of what I want. ChatGPT then gives me a long detailed prompt that I can copy and paste directly into Comfy UI. So, let us test it again. Now, we get a different cat, but it is still a bit too simple. Let us make it more complex. I go back to ChatGPT and ask for the cat to hold a rose in her mouth and wear a t-shirt that says Pixa. Again, we get a long detailed prompt. And from that prompt, we get this image. Sometimes the model can take things very literally. So, you need to explain details clearly if you want more control. For example, you might need to mention that you want a full rose held horizontally in the mouth and not something else.

Let us create something different now. This time a cartoon bunny, since this series is full of bunnies. Anyway, again we get a nice prompt and the result looks like this. It is pretty cute. Maybe now I want the bunny to be a ninja. Let us see what this prompt generates. And we get our ninja bunny. If we generate again, we get another one. As you can see, compared to older models, the results with different seeds are quite similar. You do not get a huge variation from seed to seed. That is why I recommend using longer prompts and adjusting each prompt carefully. This model is very good at following instructions. So the more precise you are, the more control you get over the result.

Let us open the first workflow again so we can compare it with workflow 5A. Now let us say I use the same long prompt and the same fixed seed for both workflows. If I generate with Z Image, I get a robot like this one, which looks nice and detailed. Now if we try the old Juggernaut model using the same prompt and the same fixed seed, the result looks like this. It is smaller and much less detailed. Let us copy this image and paste it into this workflow so you can clearly see the difference in quality and also how well the image follows the prompt. But maybe this single test is not enough to fully see the difference. So let us try something else. Let us test text generation. Newer models can generate readable text, but older models usually cannot. We normally put the text we want inside quotes. So, let us test that. Look at this result. It looks very good. And it understood the assignment. Now, let us go back to the Juggernaut model and use the same prompt. We get something like this. What is this? What does it even say? Gold gola or something like that. It clearly cannot do text. Let us go back to Z Image and try another test. A red sphere on top of a green cube placed on a black car. We get this realistic result. Z Image is more specialized in realism, but it can also do 3D, paintings, and other styles. Now let us see what Juggernaut does with the same prompt. It gets the red sphere, since that was mentioned first, and then it gets lost and forgets what it needs to do next. So clearly Z Image is a very good model to have, and you will probably spend more time playing with this model. Still, keep an eye on new models because they keep getting smarter and better as they get more training. You have now seen how an all-in-one model works and how we load checkpoints. In the next chapter, we will use models that are split, where clip and VAE are loaded separately, so we can have more control.

Let us talk a little bit more about diffusion models. Open Comfy UI and then open workflow 5A and also workflow 5B so we can compare them.
In the first workflow, Z Image is loaded as an AIO model. AIO means all-in-one. You can see that we used a load checkpoint node to load that model. The checkpoint already contains the diffusion model, the text encoder, and the VAE. Everything is bundled into a single file. Advantages: very easy to use, fewer nodes, less setup required. Good for quick testing and simple workflows. Disadvantages: less flexible. You cannot swap the text encoder. You cannot change the VAE. Harder to customize or optimize. This format is designed for simplicity and convenience.

Now, let us check the second workflow, the 5B version. You can see that we have three nodes now instead of one. We have the load diffusion model node that loads the main model. Then we have the clip loader node that loads the text encoder. And then we have the load VAE node that loads the VAE. So it is like we split the previous checkpoint into separate pieces. And now we have more flexibility. Even though the final result is still Z Image Turbo, the pipeline is modular. Advantages: more control. You can change the text encoder and experiment with different VAEs. Better for advanced workflows and optimization, and easier to update individual components. Disadvantages: more complex setup, more nodes, and a higher chance of misconfiguration if you do not fully understand what each part does.

However, this is actually one of my favorite workflows. The reason is flexibility and efficiency. With a modular setup like this, you save disk space. For example, this VAE is the same VAE used by the Flux model. So, if I already use Flux, I do not need to download the VAE again. With an all-in-one model, every new version means downloading the entire model again, even if only one part changed. In a modular setup, I can update or swap individual components. I can test different text encoders without downloading the main diffusion model again. So while modular workflows require more understanding, they are more efficient, more flexible, and better for experimentation.
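The disk-space argument comes down to where the pieces live on disk. Here is a small sketch of the modular layout this chapter uses, created in a throwaway temp directory so it does not touch a real install. The z_image subfolder is the custom one created in this course; the top-level names follow ComfyUI's usual model folders:

```python
import os
import tempfile

# Sketch of the modular ComfyUI model layout (paths are typical defaults;
# adjust to your own install location).
root = os.path.join(tempfile.mkdtemp(), "ComfyUI", "models")
folders = {
    "diffusion_models/z_image": "main diffusion model only",
    "text_encoders": "clip / text encoder files, shared across models",
    "vae": "VAE files, this one shared with Flux",
}
for sub, purpose in folders.items():
    os.makedirs(os.path.join(root, sub), exist_ok=True)
    print(f"{sub:26} -> {purpose}")
```

Because the text encoder and VAE sit outside the checkpoint, a new model version usually means re-downloading only the diffusion model, not everything.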
That is why I personally prefer this approach. But we did not download these
approach. But we did not download these models yet. I suggest that when you
models yet. I suggest that when you follow this tutorial, you test everything to see what is better or faster on your computer and then keep only the ones you like. There is no
point in keeping all types of models if they do the same thing unless you have a lot of space on your hard disk. So let
us start with the main diffusion model.
This long name is actually describing how the model is built and optimized. Z
image turbo. This is the model family and architecture. FP8. This means the
and architecture. FP8. This means the model uses 8 bits floatingoint precision. FP8 models use much less
precision. FP8 models use much less memory than FP16.
I did not include a link in this tutorial for the FP16 version, but you can find those online if you have more VRAM and want to try them. Scaled refers
to the FP8 format being calibrated for better precision. This improves quality
better precision. This improves quality and stability compared to a raw unscaled FP8 format. You can think of it as FP8
FP8 format. You can think of it as FP8 with tuning for better accuracy. E5M2.
This is the specific FP8 encoding variant used. KJ. It is usually a
variant used. KJ. It is usually a variant tag or builder ID added by the person or team that exported or repackaged the model. It does not change
the model itself. It just helps distinguish between different builds.
Safe tensors. This is the file format.
Safe tensors is a safe and efficient format and is recommended over older formats like CKPT for better stability and speed. We can download this model
and speed. We can download this model from here. And I also added more info
from here. And I also added more info about the model so you can check different versions. You can also see the
different versions. You can also see the author. So now you know what that KJ in
author. So now you know what that KJ in the model name stands for. So let us click here and see where we place it.
Navigate to the comfy UI models folder.
You should already know this by now.
This time we do not use the checkpoints folder because that is usually for complete models that already include most of what they need. Instead we place this one in the diffusion models folder.
To keep things organized, we create a folder called Zimage and place the model inside. Next, we have the text encoder.
I used one recommended by ASD from the Discord server, but there are other text encoders you can try, made by different people. For this one, we again go to the models folder and this time we place it in the text encoders folder. Here I do not create a Zimage subfolder because many text encoders work with multiple models. I usually create subfolders only for main models, LoRAs, and ControlNets, when it is important for the workflow that they match the same base model.
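If you prefer to prepare these folders from a script, here is a minimal sketch. It assumes the default ComfyUI folder names and a relative base path; adjust the path to your own install:

```python
from pathlib import Path

base = Path("ComfyUI") / "models"  # adjust to your ComfyUI install location

# Subfolders used in this chapter: a Zimage subfolder keeps the diffusion
# models organized, while text encoders and VAEs stay shared between models.
for sub in ("diffusion_models/Zimage", "text_encoders", "vae"):
    (base / sub).mkdir(parents=True, exist_ok=True)

print(sorted(p.as_posix() for p in base.rglob("*")))
```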
Then we have the VAE. This is the same VAE that we might also use later for the Flux model. So again we go to the models folder and this time we place it in the VAE folder. Some of these models are large, so wait for them to finish downloading. Once everything is done, press the R key to refresh the node definitions. Now let us check that all models are visible and selected correctly. The Z Image diffusion model is here, the CLIP text encoder is here, and the VAE is also here. That means we have everything we need to run this workflow. So let us click run and see if
we get any errors. Everything works fine and we get this image. What I usually do next is compare the results with the first workflow. When I have multiple models available, I download all of them, test them, keep the ones I like the most, and delete the rest. When I do testing, it can get confusing which model generated which image. So, here's a small trick. Double-click on the canvas, search for iTools add, and select the node called iTools Add Text Overlay. This node comes with the easy installer, but if you have a different ComfyUI version, you can install the iTools nodes from the Manager. We add
this node right after VAE Decode and before the Save Image node. This way,
the final image goes through this node.
The text overlay is added and then the image is saved to disk. For example, we can add the model type in the text overlay. You can also add more text like the model name or other info. Let us say I add FP8 scaled diffusion. So I know this image comes from this workflow. Now
when I run it, text will be added on top of the image. We can control the text, the background color, the font size, and whether the text overlays the image or
is placed under it. Let us disable overlay mode and try again. Now the text is under the image. This way we know exactly which model generated it. Next,
I select this node and press Ctrl + C to copy it. Then I go to the first workflow and press Ctrl + V to paste it. Now we
connect the node the same way as before.
We need a name that represents this model. So let us name this one FP8 all-in-one.
Now I can test it and you can see the text under the image. To make a fair comparison, we use the same settings for both workflows. Let us also enable the bottom panel to see how much time it takes to generate. As you remember, the first time you run a model, it is slower because it loads the model. We can
unload the models using this button and clear the cache using this one. This
lets us compare which model loads faster and which one generates faster. I run it once and you can see the first run took around 8 seconds. The second and third
runs are faster around 3.57 seconds. Now
let us go to the second workflow.
I unload the models and clear the cache.
Then run it again a few times.
This one loads slower, but the second and third runs are faster. On my older PC, the all-in-one model was faster, so it really depends on your system. Test
it yourself and see what works best for you. Now, let us look at quality. We use
a fixed seed with a value of 50 and run the workflow. Then we do the same for the first workflow, same fixed seed, and run it.
Right-click on the image result and copy the image. Then create a new workflow where we compare the two images. I press Ctrl + V to paste the image and you will see it adds a Load Image node with that pasted image. I do the same for the image from the second workflow. Now we have both results. Let us add an Image Compare node so we can compare them.
Connect the first image to image A and the second image to image B. Then run
the workflow. Now we can enlarge and compare them. The results are quite similar but still slightly different.
This happens because I used a text encoder that is different from the one included in the all-in-one workflow. If
I had used the same text encoder, the results would have been much more similar. Let us try again with a different prompt. Maybe we do a portrait photo of an old woman. We get a result like this one. Now let us do the same for the second workflow, and we get another woman for this one. Let us copy
both images and go to the compare workflow. Select the Load Image node and use Ctrl + V to paste the image into that node. Now we can compare the two results. Again, because the text encoder is different, the comparison is a bit harder. Still, I kind of like the FP8 scaled version more. You can see that we use the same settings for both workflows. One has everything included and the other has everything separated. If I searched for the same CLIP used in the AIO workflow, I could get much closer results. This Load CLIP node is something we will use in other workflows as well. As you can see, it has a type option that lets you select different types of models to match the diffusion model you loaded. Do not stress too much if you do not understand everything yet.
It will make more sense as you practice.
If we look at the VAE, you can see where it goes. As you remember, we use it to connect to nodes like VAE Decode and VAE Encode. If we go back to the first workflow, that VAE is coming directly from the one included with the main model. Different model formats do not
change how that knowledge is stored. So
choosing the right format is about balancing quality, speed, memory, and flexibility for your hardware and
workflow. GGUF stands for GPT-Generated Unified Format, and it is a model format designed to run large models efficiently on systems with limited memory. In Comfy
UI, let us go to workflows again and this time open workflow 5B and 5C so we can compare them. The workflow we saw in the previous chapter had three nodes.
Load Diffusion Model, Load CLIP, and Load VAE. And you can see that it was loading safetensors files. If we go to the GGUF workflow, you can see that some nodes are different. We now have a UNet loader that has GGUF in the name, and the file format is GGUF instead of safetensors. We use this node to load GGUF-type diffusion models. For the CLIP, we could have used the previous node to load an existing text encoder, but I wanted to show that you can also use a CLIP Loader GGUF node to load text encoders in GGUF format. The last node is the same Load VAE as before. So, compared to the previous workflow, we only changed two nodes so we can load GGUF models, but we do not have those models yet. So let us go to the notes
and check which node loads which model, and also look at the download links. If we go here, you can see there are many GGUF model versions. Most of the time you will see something with a Q version
in the name like Q2, Q4, Q6 or Q8. Most
of the time I use Q8 models. If that is too big, I switch to Q6. And if that is still too big, I use Q4. The lower the Q
number, the lower the quality of the generation, but the models are smaller and can be faster on limited hardware.
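The Q8 → Q6 → Q4 fallback just described can be written down as a tiny helper. The file sizes and the 2 GB headroom here are assumptions for illustration, not measured values:

```python
# Approximate GGUF file sizes in GB for this model family (assumed values).
QUANT_SIZES = {"Q8": 7.0, "Q6": 5.5, "Q4": 4.0}

def pick_quant(vram_gb: float, headroom_gb: float = 2.0) -> str:
    # Try the highest-quality quantization first, then fall back to smaller ones,
    # leaving some headroom for activations and other processes.
    for name, size in QUANT_SIZES.items():
        if size + headroom_gb <= vram_gb:
            return name
    return "Q4"  # smallest option; may still be slow or spill into system RAM

print(pick_quant(24))  # plenty of VRAM -> Q8
print(pick_quant(8))   # -> Q6
print(pick_quant(6))   # -> Q4
```

The real decision also depends on speed, as the tests later in this chapter show, so treat this only as a starting point.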
Let us look at what this model name means. Z Image Turbo. This is the core model family and variant name. Q4. This means the model is quantized to 4-bit precision. Lower-bit quantization reduces file size and VRAM usage. The K indicates a specific quantization method, usually a block-based or K-quant method, which helps preserve model accuracy even at low bit precision. The
S usually means small or standard variant within that quantization type.
It trades a bit more quality for a smaller footprint compared to M versions which stand for medium. GGUF. This is
the file format. Let us download this model and give it a try. We go to the ComfyUI folder, then to the models folder. This main model, just like in the previous workflow, goes into the diffusion models folder. Since we already have the Z Image folder from the previous chapter, I will place it in the same folder because it is the same base model, just a different quantization. So,
we save it there. Now, let us do the same for the text encoder. We click here to download it. Then again go to the models folder, find the text encoders
folder and place the model there. For
the VAE, if you followed the previous chapters, you should already have it. If
not, download it and place it in the VAE folder. These models are big, so wait for them to finish downloading. After
the download is finished, press the R key to refresh node definitions so Comfy UI can see the new models. Now we go back to the nodes and make sure we can
select the models. The Z image model is there. The text encoder is also there. I
am using the one with GGUF in the name, because if you use a safetensors version here, even if it does not give an error, the results will not be what you expect. For the VAE, we already have it. So now we have everything we need.
Let us run the workflow. The result
looks pretty good for a Q4 version. Let
us open the bottom panel and run it again. You can see that the first time it loads the model, it is slower, but after that it takes around 5 seconds to generate. In my case, this was slower than the all-in-one model or the FP8 scaled version. That does not mean it will be the same on your system. On some systems, it might be faster. That is why I keep saying you should test everything and then keep the best model for your setup. What is best for me will not necessarily be best for you, because we have different video cards, different VRAM amounts, and probably different
drivers. Now I am curious how a larger Q version will perform. So let us go back to the model list. This time I want to test a bigger one. The biggest available
here is Q8 which is around 7 GB in size.
I have 24 GB of VRAM, so I can easily fit this model in memory, and even larger ones. Sometimes, if a model is larger than your available VRAM, it will be slower because it tries to load the model in parts. You lose time during that process, and generation can be slow, or it can even crash ComfyUI and force you to restart it. So let us download this one. We place it in the diffusion models folder, inside the Z Image folder,
right next to the Q4 version. Again,
wait for it to finish downloading.
Luckily, I have a fast internet connection. After that, press R to refresh. So now, in the UNet loader, we can see both models. By the way, UNet is the main neural network inside a diffusion model that predicts what noise
should be removed at each step to turn random noise into an image. First let us change the seed to fixed so we can compare the models properly. I get this
image for Q4. I copy the image, create a new workflow and paste it there. I will
rename the node to Q4 so I know which model was used. Now let us go back to the workflow and select the Q8 model.
Everything stays identical. Only the
model changes. Let us see what we get.
It looks similar at first glance. I copy
this image, go back to the new workflow, and paste it there as well. I rename
this one to Q8.
At first glance, the Q8 version seems to have fewer mistakes and looks clearer.
Let us add an Image Compare node to compare them properly. Connect the two Load Image nodes to the Image Compare node.
The first image shown is image A, which is the Q4 version. As we move the cursor to the right, we see the Q8 version. In
my opinion, Q8 has better details and fewer errors. For example, some bolts seem to be missing in the Q4 version, while the Q8 version looks more complete. In most cases, Q8 will be better than Q6, and better than Q4, in terms of quality. But now, let us check the speed.
The first time, Q8 took longer to load because it is a 7 GB model. Let us
change the seed and try again. Now the
second run takes under 4 seconds. Let us
try once more. We change the seed again and once more it takes under 4 seconds.
Now let us switch back to the Q4 model.
This one is lower quality but also smaller. You can see that the first time, it loads faster. Let us change the seed and try again. The second run takes more than 5 seconds. Let us try one more time. And again, it takes more than 5 seconds. This is why I keep saying you should test all of them and then decide.
For me, Q8 is faster and gives better quality than Q4, but that is because my video card probably works better with that quantization. On your system, especially if you have an older card, it might be the opposite, and Q4 could be faster. So please test them yourself and then keep the one that gives you the best quality and the best speed on your system.
So let's go to workflows again. And now it might make sense why I named all three of these workflows workflow 5: they are workflows for the same model, just different model types. So let's open workflow 5A and you will see how we can adapt the workflow. Let's move this to the side. So, this has an all-in-one model with everything included. We want to change it into a workflow where the models are split. So, let's start with the model. Instead of Load Checkpoint, we search for Load Diffusion Model. This one only loads the model, without CLIP and VAE. And we select the Z Image model from the list. Then, we need a node that has that CLIP output, that loads the text encoder. So, we search now for a node called Load CLIP. Let's make it bigger so we can see the parameters.
We first select the text encoder. Then
we select the type. Z Image uses Lumina 2. Lumina means light. You can think of it like reaching the end of a tunnel: Z is the last letter of the alphabet, and at the end you see the light. Lumina 2 represents a newer, clearer way for the model to understand prompts and guide image generation. It simply means more advanced guidance compared to older models. Then what is left is the VAE. So, we use the Load VAE node and we select that VAE model. So,
now all that's left to do is to redo the connections. We drag a link from model to model. Pretty easy, right? Now, we need the CLIP. So, let's drag another link. And all that's left is the VAE. This one will connect to the VAE Decode node. And if the workflow is image to image, it will go to VAE Encode also. Now, we can get rid of the Load Checkpoint node. So now we successfully
replaced all the models, and basically we have the workflow version 5B that we used before. So let's run it, and it all works okay as it should. Let's say the model we use now is too big and our video card doesn't have enough VRAM. Then we can try a GGUF model to see if it works faster or better. So let's search for a node again, and this time search for the UNet loader, the one that has GGUF in the name. So in this node we can select a GGUF model. You can see I downloaded two versions before. So let's say Q4 is smaller in size than the FP8 version in this case. So it has better chances to run faster than a bigger model. But as you saw before, on my computer Q8 was faster. So maybe I will use that to get better quality instead.
Let's connect the model. And now we can remove that node. So we replaced an FP8 safetensors model with a GGUF version, and if we run this workflow, you can see it works just fine and we got a nice result. If for some reason you are not happy with the text encoder, maybe it is not so accurate or it is too big, we can try a GGUF version of the text encoder also. So let's delete that node and search for CLIP Loader, the GGUF version. You can see it has CLIPLoader in one word. So now we can select the GGUF model. And of course, we need to adapt the type, since it is not Stable Diffusion. It is Lumina 2 instead. Remember that light at the end of the tunnel. Then link the CLIP to the text encode prompt. And basically now we have the workflow 5C. So you saw that having a modular version allows you to change models and have more freedom, just like
on your computer. If you're not happy with a mouse or your printer, you can change it with a smarter or faster version. Now, if you do have enough VRAM, you can try to increase the size for width and height to get more details. For example, at this size, I got this image, and now we can see more details on those cables and overall. But usually for Z Image, I use values between 1024 and 1280 pixels. So, at
the moment of this recording, Zimage is a pretty good model to have. It is free and you can generate all kinds of stuff with it. Let's compare a few of these
models to see what the difference is. So, for this one, I compared the FP8 all-in-one version with the FP16 all-in-one version, which is double the size. The results are quite similar. Maybe the FP8 is a little more desaturated compared to FP16, and FP16 might be a little bit clearer, but it is not a huge difference. Both are good quality. For the Viking image, the FP8 version has fewer details in some areas. In FP16, it added some extra things, like more ornaments. Again, FP8 looks a little more desaturated. For the bunny, both look good. So, I would say if FP8 is faster, has half the size, and the
results are very close, you can get away with FP8 and keep that. Now, let's
compare the FP8 version with the FP8 scaled version. Keep in mind that the text encoder is also different in this case compared to the one included in the first model, but the results are still quite similar. Sometimes FP8 does it better, sometimes the scaled version does it better. So if you do more tests, you can decide which one is better for you. Since the results are very close, again, it makes sense to keep the one that is faster on your system. Now, let's compare the FP8 version with the GGUF version. Instead of the Q4 version, I will use the Q8 version, downloaded from here, so we can see the difference.
For the portrait, it looks a little clearer on the Q8 version. For the
Viking, some details are more defined on the Q8 version. For the Bunny, it is pretty similar for me. For other models we will test in the future, the difference might be bigger and more
obvious, but in this case, the difference is quite subtle. So, which
one will I keep for my video card? Maybe
the FP8 scaled or the Q8 version, mainly because they are modular and I can save space and time when I use the same models for other workflows. In this
chapter, we explore batch generation and styles. So, let's open another workflow,
this time workflow 5A since it has fewer nodes and you can see things better. But
the methods I show work with any workflow. Right now, each time you press the run button, the workflow runs once and you get a single image. But what if you want more images and you do not want to click run every time? In this node, we have an option for batch. By default, it is set to one. You can change that, but keep in mind it will use more VRAM, because it is like running multiple workflows at the same time. If your video card can handle it, it will be faster than generating one image at a time. So now we get two images. If I
change it to four and run again, we get four images. If we toggle the bottom
four images. If we toggle the bottom panel, we can see the time it took for one image, for two images, and for four images. If we multiply 3.77,
images. If we multiply 3.77, which is the time for one generation, by four, we get over 15 seconds. But
because we used batch, it only took 13 seconds. So, you need to see what batch
seconds. So, you need to see what batch size works best for your video card. I
might be able to use a bigger batch, but you might need a smaller one. Now, from
these four images, we can click on any of them to open it bigger. These images
are saved in the output folder as well.
To close the big preview, you can use the X in the top right. Let's open
another one. You can also navigate using the buttons in the bottom right corner, so you can check all generations. The
bigger the image, the more VRAM it will need. Let's say I set the batch to eight. Since the image size is quite small, the result is eight images. You can check the results and pick your favorite. You can right-click on an image and save it in any folder you want. Now
let's change the batch back to one. So
we only get one image. Next to run, we have an arrow that shows multiple options. Here we also have batch count.
This is not the same as the batch we used before. Think of this like a counter where you tell it how many times to run the workflow. So if I set the value to four and hit run, it will run
once, then again, and again until it has run four times. This is a bit slower than the previous batch method, but it uses less VRAM. If we add these values,
we get over 14 seconds. With the batch and empty latent, we got 13 seconds. You
might say 1 second is not much, but if you use bigger images and longer workflows, seconds can quickly turn into minutes. Let's change the batch back to one. And let's explore more run options.
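The batch versus batch count trade-off boils down to simple arithmetic. The single-run and batch times are the ones measured above; the batch-count total is rounded to the "over 14 seconds" mentioned, so treat the exact values as illustrative:

```python
single = 3.77          # seconds for one image, generated one at a time
batch4 = 13.0          # seconds for a batch of 4 in one latent (more VRAM)
batch_count4 = 14.0    # approx. seconds for batch count 4 (sequential runs)

sequential = 4 * single
print(f"4 single runs:  {sequential:.2f} s")
print(f"batch of 4:     {batch4:.2f} s")
print(f"batch count 4:  {batch_count4:.2f} s")
# Batch wins on speed but needs the most VRAM; batch count trades a little
# speed for much lower VRAM use.
```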
Run on change will run the workflow when we change a value. So if I change the seed, it will start running. It should
stop after the run. But I am not sure if this is a bug or if this is how it is supposed to work. Because the seed is random, it keeps generating continuously after the first change. But if the seed
is fixed, it only runs the workflow when I make a change and then it stops. So I
will stop it manually by switching back to run. If I change it to run instant, it will generate forever until you stop it. So do not forget to stop it by going to the arrow and selecting run. After it
finishes that workflow, it will stop.
But what if we want to run multiple prompts? Until now, we only had one prompt, and the seed was different. But for the Z Image model, for example, the seed variation is not that big compared to other models like Flux or Stable Diffusion. So let's search for a node called iTools Line Loader. This node
loads each line as a prompt. If we drag a link from this line loader output and connect it to the text encoder, a small dot will appear in the top left corner
of that text input. By default, you have three prompts here, cat, dog, and bunny.
Let's say the first prompt is a cat photo. The second prompt is a bunny with a flower. And the third prompt is a lion logo. We have a seed here that decides which prompt will generate. And we also have control after generate. Randomize means that after each generation, the seed will change to a different random value. So let's run it. We got a cat, and now we have a different seed.
For this seed we got a bunny. Now let's
change the seed to fixed so we can understand better how this works. For
the seed we put zero. In computer
programming, lists usually start with zero, not with one. So instead of 1, 2, 3, it is 0, 1, and 2. So 0 corresponds to the cat prompt. If I run the workflow, I get a photo of a cat. If I change the seed to one, it corresponds to the second prompt, which is the bunny. So the
result is a bunny. And for seed 2, we get a lion logo. Now we know the order in which this node uses the prompts. Can
you guess what we will get for seed 3?
It will start over with the first prompt. So the result will be a cat photo. Let's add another prompt, like a rose, and maybe a house with a car in front. I will start with zero so it starts with the first prompt. Then for
control after generate I will use increment so it starts with the first prompt and continues with the next one and so on. This way it is more controlled and not random. Now that we
have five prompts, I can change the batch to five so it runs the workflow five times. You can see it will generate all those images, one prompt at a time, in order, and it will stop after five generations.
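The way the seed maps to a prompt is just zero-based indexing with wrap-around. A sketch of the behaviour we just observed (the node's real internals may differ):

```python
prompts = [
    "a cat photo",
    "a bunny with a flower",
    "a lion logo",
]

def prompt_for_seed(seed: int) -> str:
    # Zero-based index that wraps around when the seed exceeds the list length.
    return prompts[seed % len(prompts)]

print(prompt_for_seed(0))  # a cat photo
print(prompt_for_seed(1))  # a bunny with a flower
print(prompt_for_seed(2))  # a lion logo
print(prompt_for_seed(3))  # wraps around -> a cat photo
```

With control after generate set to increment, the seed simply counts up, so the prompts cycle in order.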
You can also put 10 if you want to get two generations for each prompt or you can let it run continuously and stop it when you get something you like. We saw
how we can load prompts line by line, but we can also load prompts from a text file. Let's search for iTools Prompt Loader. As the name says, this node loads prompts from a file. Let's drag a link from it to the positive prompt. You
can see here it says file path. We
already have an example with a prompts.txt file. So let's run it. It will pick a random prompt from that file. And the result is this cat. Now
let's find that prompts text file. Let's
go to the ComfyUI folder, then go to custom nodes, and here look for the ComfyUI iTools folder. These are all the files used by that node. Basically, the node itself looks like this. If we go to the examples folder, we have a text file with prompts. Let's open it. You can see we have a few prompts here. And the image that was generated corresponds to one of these prompts. You can delete everything and add your own prompts here one by one. Or you can ask ChatGPT to generate a bunch of prompts. So maybe I
will add one for a dog, maybe one for a cat, and one for a rose. Now I can save that file and close it.
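Conceptually, the prompt loader just picks a random non-empty line from that file. A minimal sketch of that behaviour (the node's actual implementation may differ, and the file contents here stand in for the prompts we just typed):

```python
import random
from pathlib import Path

# Write an example prompts.txt like the one edited above.
Path("prompts.txt").write_text(
    "a photo of a dog\n"
    "a cat playing with a mouse\n"
    "a red rose\n"
)

# Load non-empty lines and pick one at random, once per run.
lines = [line.strip() for line in Path("prompts.txt").read_text().splitlines() if line.strip()]
print(random.choice(lines))
```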
Let's go back to ComfyUI and generate again. It should pick prompts from the same text file. Now we got that cat playing with a mouse. Let me remove this node and try again to see if we get another prompt. And now we got a rose.
If the text file is in a different location, you just add the path to that file here so it knows where to load it from. And of course, you can change to run instant and let it run, then stop it when you have enough images generated.
Let's delete this node and I will show you more things you can do. Let's search
for a node called iTools Prompt Styler and select this one. This node picks prompts or art styles from a file. We
have positive and negative prompts. But
since our workflow only uses the positive prompt, we drag a link from there and connect it to the positive prompt input. Here we have an area where we can type our prompt. Let's say I type a white bunny holding a rose. Then we
can select the style file. These files
that contain different prompts are stored locally. Let's go to the custom nodes folder again, then to iTools, and this time go to styles. You can see here a few example style files, which
are actually YAML files. If we go to more examples, we have even more. Now,
if we look back here at the file list, we can see exactly those files. Among
them, there is one called Pixaroma. I
asked the creator to add my file there so you can access it easily. Thanks,
Mikotti. Once you have the file selected, you can choose a template from that file. You can see here different templates. For example, I can select a 3D icon or something else. What is important to remember is that you select the file first and then the template inside it. For example, let's open one of these files with Notepad so we can see what is inside. Each template looks like this.
You have the template title, the negative prompt, and the positive prompt. As you saw in our workflow, we only use the positive prompt this time, so it will only pick that part. In the positive prompt, you can see the word prompt inside brackets. That is where it takes your prompt and combines it with the rest of the template prompt. So, if I do not have anything selected here for the template, it will use something like landscape photography of the prompt, and instead of the word prompt, it will insert a white bunny holding a rose.
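That placeholder mechanic is simple enough to sketch in Python. The template text here is illustrative, not copied from a real style file:

```python
# Sketch of how a style template combines with your typed prompt: the
# template's positive prompt holds a {prompt} placeholder that is
# replaced by whatever you entered in the node.
template_positive = "landscape photography of {prompt}, golden hour, high detail"
user_prompt = "a white bunny holding a rose"

final_prompt = template_positive.replace("{prompt}", user_prompt)
print(final_prompt)
```

Swapping the template swaps everything around the placeholder while your short subject prompt stays the same.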
Basically, we recreated what these styles do. This system saves you time by letting you write a short prompt and combine it with a ready-made prompt from a template. This was created back in the days of Stable Diffusion models, when we did not have access to AI prompt generators. It still works today with most models that recognize these prompts, even though you have much more freedom using a custom prompt made with ChatGPT. So if the prompt is a white bunny holding a rose, and for the file I select the Pixaroma file, then for the template I can filter by landscape and select photography landscape. Now when I run it, it should combine my bunny prompt with the landscape photography prompt.
And the result is this one. So everything works quite nicely. Let's change the template. Let's say I select 3D icon and run the workflow. And we get this 3D icon of a bunny holding a rose. Now let's try an ancient Egyptian mural. We run it again and we get this mural. And you can clearly see our bunny in the image. Let's say I select the rococo art style. The result is this decorative style illustration of the bunny. Now let's open the Pixaroma styles file with Notepad. You can see all the templates and prompts for each style, and you can edit them if you want. Just keep the same format. Otherwise, it will not work. Let's say I want to use the template for surreal toy. If I use it in the workflow, it will take this prompt and replace the word prompt with my bunny holding a rose. So, let's test it. From the templates, I search for surreal and select that toy style. Now, let's run the workflow. And we get this 3D surreal bunny with a rose. Pretty cool.
Let's scroll down and see what else we can use. Let's say Afrofuturism art. That means it will use that specific prompt. Let's change the style and test it. And the result is this one. Keep in mind each model will interpret these prompts differently depending on how it was trained. There are over 300 styles or prompts saved in this file, from 3D to art styles, painting, photography, design, all kinds that I use most often.
Let's say I select the vector coloring book page style. Now, when I run it, I get this clean coloring page design. Of course, if you want it to be more unique, give more information in the prompt, like how the bunny looks, how it is dressed, how the environment looks, and maybe make it fit your story. Let's say I want to do a cartoon illustration. Let's search the list to see if we have something like that. For example, I can select a soft 3D cartoon environment and see what we get. And the result is this one. Let's search for cute and test this cute cyberpunk style. And we get this illustration. These are good for discovering art styles you might not have thought to try yet. Now, let's remove this node and search again. This time, we look for iTools Prompt Styler Extra. It is called Extra because it has slots for multiple files and templates. Let's connect it to the positive prompt.
For the base file, let's select the Pixaroma file, since that one has the most styles. For the second file, I will use the same one. Let's set both to random, so we get a random combination of two styles. If I run it now, I get something like this. There is no bunny, because we added a new node and we did not add a prompt yet. Let's drag a link from the output called used templates. This outputs the actual styles that were used. Then search for preview and add a Preview as Text node. Now when I run it, you can see what styles it combined.
Reflection with fantasy. Now, let's add the prompt, "A white bunny holding a rose," and generate again. This time, it combined a propaganda art style with knitting art, something you probably would not think to combine. Let's select a third file, again the Pixaroma file. For the third style, let's select random, or any other style you want. Now, if we look again, it combined Japanese traditional sticker and fine art. Let's run it once more and we get this gilded fantasy bunny. Pretty cool. Let's try again. And this time we get a cute minimal line art style. Of course, you can also manually select which styles to combine. For example, let's choose a game asset style combined with low poly. And for the third one, select atompunk. And the result is this one. You can also run it multiple times to get different seeds. By now, you should start to get an idea of how styles work. Let's try one last combination. Change low poly to a steampunk style. We get this image, because we used a game asset style. If I change the game asset to cute cartoon, I get this cute bunny in a steampunk environment. So, create your own styles for the things you use most often, or use ChatGPT or other large language models to generate longer prompts that describe exactly what you need. In the
previous chapter, we saw how we can use different prompts to change the style of the image. But if we want a style that the model did not learn, we cannot generate that style. For that, we have LoRA files, which add extra information to the main model. We talked more about this in episode 13. Let's go to workflows, and this time, let's select workflow number six, the one with LoRA in the name. This is a simple Z Image text-to-image workflow. In fact, if we remove these nodes, we get exactly workflow 5A that we used before. So, let's undo that. What is different here is this LoRA Loader Model Only node, which allows us to load a LoRA from our computer. I just changed the color to blue. That is all. The node with trigger words is just a simple note. Again, revisit chapter 13 for more details. So, let's go and download a LoRA. I created a LoRA for a girl with white hair, and you can download it from here. After that, navigate to ComfyUI, go to models, and then open the loras folder. Here, we already have one from chapter 13, the SD 1.5 LoRA. Now, we create a new folder called zimage, since this LoRA only works with the Z Image model, and we save the LoRA inside that folder. After the LoRA is downloaded, press the R key to refresh the node definitions.
Now, if we go to the LoRA loader, we can select that LoRA. You can see the folder and the LoRA name there. Just like for all other LoRAs, this is the trigger word that I used when I trained that LoRA. I use that in the prompt together with more words to describe what I want to generate. Now, when I generate, I get this girl with white hair. The LoRA I am using here is a character LoRA. There are also LoRAs for styles, objects, or functional ones that speed things up. A character LoRA also allows you to keep a character consistent, so even if you change the prompt and keep the trigger words, you get the same character, which is very useful. There are many LoRAs trained by people online on sites like Hugging Face or Civitai. Over time, you can also learn how to train them yourself, either online or locally, if you have enough VRAM. Let's search for a LoRA on the Civitai website. Again, if you are from the UK, you will need a VPN to bypass the restrictions they set for your country. Let's go to models. Then we can filter them. Set the time period to all. For model type, select LoRA. For base model, select Z Image Turbo. Now we should see only LoRAs compatible with our base model. We can sort by highest rated. We have quite a few here. Let's pick one at random, maybe this one that lets us create character design sheets. Now we are on the LoRA page. At the top, you can see this LoRA is available for different models, but we want the Z Image version. We check the type to make sure it is a LoRA and that the base model is correct. We also check if it has trigger words or other settings. Then we can download it from here. You must be logged in to download models. We save the LoRA in the same loras folder, inside the zimage folder. After the download finishes, press the R key to refresh. Now we should be able to see that LoRA in the list and select it.
Let's see what else it says about this LoRA so we can learn more. Here it shows the trigger words. I can copy those and paste them in a note so I have them for later. They also give an example prompt showing how to use it. Let's copy that and paste it into the positive prompt. I will remove the beginning and ending quotes. Now, let's copy the trigger words and place them here instead of the previous trigger words. We also have a subject, so let's say a white bunny warrior. For art style, maybe I add a 3D render style.
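It helps to know what the strength setting does conceptually. A LoRA stores a small learned delta that gets added on top of the base model's weights, scaled by the strength. A toy sketch with made-up numbers, not real model code:

```python
# Toy illustration of LoRA strength: the LoRA's learned delta is scaled
# by the strength and added to the base weights.
# strength 0 = base model only; strength 1 = full LoRA effect.
def apply_lora(base_weights, lora_delta, strength=1.0):
    return [w + strength * d for w, d in zip(base_weights, lora_delta)]

base = [0.50, -0.20, 0.10]   # pretend base model weights
delta = [0.10, 0.30, -0.05]  # pretend LoRA delta

print(apply_lora(base, delta, strength=1.0))  # full effect
print(apply_lora(base, delta, strength=0.5))  # halved effect
```

This is why lowering the weight softens the LoRA's influence instead of switching it off.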
The rest looks fine. For the model strength, I will use one. If it is too strong, I can reduce the weight. For the size, let's make it bigger so we get more details. Now let's run the workflow. We get this character sheet, which is not bad. This could be useful for concept artists to see different angles. Let's run it again with a different seed. And we get this result. Now let's change some things in the prompt. Maybe it is a medieval bunny. For art style, I try a vector art style. Let's run the workflow again. Now we get a different image. This seed does not look that good, so maybe I try another seed to see if I get something better.
Again, not perfect, but at least it gives some ideas. Let's go back to Civitai and look again at models. You can see there are LoRAs for all kinds of things. That does not mean all of them are great. They are trained by people like you and me and shared for free. Depending on the training, some are very good and some are not so good. Training is never perfect. If you want to see how much the LoRA influences the result, we can test that too. Change the seed to fixed and generate once to see the result. In my case, I got this image. Now, let's go to the LoRA loader node. Right-click on it and select bypass. Then, we run the workflow again with the same settings and prompt. You can see that without the LoRA, we do not get a character sheet anymore. So, this LoRA clearly helps with creating multiple characters on a sheet. Remember, a LoRA is like an add-on to the main model. It adds extra training to that model, like the model took a new course and learned how to do character sheets. Hope that
helps. I explained ControlNet basics in chapter 14, but there are models like Z Image that need different nodes to run ControlNet. Let's go to workflows. Now I want to open workflow 4 and also workflow 7, since both are using ControlNet, and you can see the difference. So let's go to the Juggernaut workflow, which is a Stable Diffusion 1.5 model. Here we use a Load ControlNet Model node, and it is the same node used for SDXL models or Flux models. Then for ControlNet we have different models, like depth, canny, pose, or other types that control the image generation. Now if we go to the Z Image Turbo workflow, we have a different node here called Model Patch Loader. Here we load a ControlNet model, and it is called union because it has depth, canny, and pose integrated into one single model. So we do not have to keep changing the model. It is one model that does everything it needs. Back in the Juggernaut workflow, we had a preprocessor node that converted our image into a format the ControlNet model understands. For the Z Image workflow, that part remains the same. We can try different preprocessors like canny, depth, or DWPose, and they will work with this model. For the last part in the Juggernaut workflow, we had an Apply ControlNet node between the prompts and the KSampler, with different parameters. For Z Image Turbo, the node is different. It is called Qwen Image DiffSynth ControlNet. Here we only control the strength, which I set to 0.8, so it is not too strong.
Now let's download the required models. By now you should already have the main models downloaded, either FP8, FP16, or BF16. The principle is the same even if you use a GGUF version or other types. We also need to download the ControlNet model, because we are not using the Load ControlNet Model node but the Model Patch Loader. We need to place this model in the model_patches folder. So let's click here to download it. Go to the ComfyUI folder, then to models, and here you will find the model_patches folder. Let's save the model here. Wait for it to download, since it is around 3 GB. Also, keep in mind that over time more versions can appear, like version two or three, so always check if there is a newer version available. After the download is finished, press the R key to refresh the node definitions so the model appears in the list. Then, select that model from the drop-down. Now we should have everything we need to run the workflow. I have a robot image loaded here in the Load Image node. For the preprocessor I can use depth or canny, but let's start with canny. For the resolution, I will make it bigger so the canny map has better details. Then for the prompt, we describe what we want to get, and then we run the workflow. Now we can see that we got a canny map that ControlNet understands.
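What a canny-style preprocessor produces can be illustrated with a toy example. Real preprocessors run the full Canny algorithm on a 2D image; this one-dimensional gradient threshold only shows the core idea of marking where brightness changes sharply:

```python
# Toy 1-D "edge map": mark positions where neighboring brightness
# values differ by more than a threshold. ControlNet receives a map
# like this (in 2-D) instead of the raw photo.
def edge_map(row, threshold=50):
    return [1 if abs(row[i + 1] - row[i]) > threshold else 0
            for i in range(len(row) - 1)]

pixels = [10, 12, 11, 200, 201, 199, 20, 22]  # toy brightness values
print(edge_map(pixels))  # 1s appear where the values jump
```

The generated image then only has to respect where those jumps are, which is why the robot's outline survives while everything else can change.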
Look at the result. It looks much better, and it follows the edges of the original robot. Let's try with a different image. As you remember, in the input folder I added some images you can use. So, let's say I load this sphere and cube image. For the preprocessor, let's use depth this time. Then, let's adjust the prompt to fit. Maybe a green sphere on top of a golden cube in the desert, golden hour, alien. Now, when I run the workflow, I get a depth map for that image. For the most part, it got it right, except the ground. Let's help it understand what I want. So, I will add to the prompt: the sphere and cube levitate in the air. Let's run it again and see if it understands it better. Now, we got exactly what we asked for. You can also give the image to ChatGPT and ask for a prompt, together with instructions on how you want it to look. Let's try something else.
This time, let's upload that woman in a yoga pose that the Juggernaut model struggled with, to see how much the models advanced in the last two years. For the preprocessor, I use the DWPose preprocessor. For the prompt, I will add a photo of a woman dressed in white doing yoga on top of a mountain. Maybe I add photo taken with a DSLR camera. Not sure if it will take that too literally. So now we got our pose skeleton, which looks correct. We also got an Asian woman, which Z Image tends to generate when you do not specify what kind of woman it is. It also added a DSLR camera on the ground, which I do not want. So let's go back to the prompt. I remove the DSLR part, and for the woman I add that she is European. Now let's test again. The result is actually great.
Same pose, the clothes I asked for, and on the mountains. A perfect result. What do you think? Let's try to recreate this ControlNet workflow so you can practice. Go to workflows and let's open workflow 5A, since this is a simple text-to-image workflow for the Z Image model. Search for qwen image, written as one word. Then select the DiffSynth ControlNet node. Now we need to connect this between the model and the KSampler, so let's add the links so everything goes through this node. Let's see what other inputs we have here. It says model patch, so let's search for that node. We add the Model Patch Loader.
And here we select the union ControlNet model. Now we drag a connection from this node to the ControlNet node. We also need the VAE, and we already know where it is in this workflow, so we connect that as well. All that is left now is an image. To load an image, we use the Load Image node, so search for that node and add it. If we try to connect the image directly, it will not work correctly, because this model is trained with canny, depth, and pose. So we need something to convert the image into those formats. Search for AIO and add the AIO Aux Preprocessor node. Now our image goes through this preprocessor. From the list, we can select one, for example, the Depth Anything preprocessor. For the resolution, we can increase it a bit to get more detail. Now, we connect the output of this node to the ControlNet node, since this is the correct format that ControlNet understands. We can also add a preview node to see how the processed image looks. All that is left now is to adjust the prompt. Let's say the prompt is a modern house in winter. We can also increase the width and height to get more details. Now, we are ready to test the workflow. We can see the depth map of the building. We can enlarge it to see it better. The result looks like this. It is similar, but not exactly the same building shape. You could try a more detailed prompt or a different preprocessor. Let's add an Image Compare node to see the differences. I want the original image before processing, so I connect it to image A. Then, just after the VAE Decode, I connect that output to image B. Now let's run the workflow again and make the Image Compare node larger. We can see that it shares some building edges with the original image, but not all of them. If we want more accuracy, we can change the preprocessor. Let's select a canny preprocessor instead. Now when we run it, you can see it captures all the edges in the canny map. The result should be more accurate.
And this is the result we get. Now we can see many things in common with the original image. Keep in mind this is controlled mainly by edges, so it will not be exactly the same building. We can get more control later when we cover edit models like Flux 2, Qwen Edit, or Nano Banana Pro. Up to now, everything we did in ComfyUI happened locally inside the interface. We loaded models, connected nodes, ran workflows, and generated images on our own machine. API nodes are different. They allow ComfyUI to communicate with external services. An API is simply a way for one program to talk to another program over the internet. Instead of doing everything locally, we can send data out, let another service process it, and then receive a result back. Think of it like this: local nodes are tools on your desk; API nodes are tools you rent remotely. You send instructions and you get results back. In ComfyUI, you can click on the plus to add a new blank workflow. Then double-click on the canvas and search, for example, for ChatGPT. You can see that it says API node under the node name. Let's select this node. Now, this node looks different compared to others. It comes already colored in gold, like a VIP version. On top, it tells you how many credits this node will consume depending on the settings. Those credits change based on what you use. Here we have a list of models from OpenAI that are accessible through the API. The letters API stand for application programming interface.
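Under the hood, an API node does something like the following sketch: package your inputs as structured data, send them to a remote server, and read back the reply. The endpoint URL and payload fields below are hypothetical, purely for illustration; they are not ComfyUI's or OpenAI's real API:

```python
import json
import urllib.request

# Hypothetical payload builder: the request is just structured data.
def build_payload(model, prompt):
    return json.dumps({"model": model, "prompt": prompt}).encode("utf-8")

# Hypothetical call: the heavy computation happens on the remote
# server, not on your GPU. (Not executed here; the URL is made up.)
def call_remote_model(model, prompt):
    request = urllib.request.Request(
        "https://api.example.com/v1/generate",  # made-up endpoint
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

The node hides all of this behind its widgets; you only see the prompt going in and the result coming out.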
For example, if we select a big model, it can cost between 2 and 8 credits, depending on what you ask from it and how long the answer is. If I change to ChatGPT mini, it is almost zero credits. It is not zero, but it is 0 point something. So, it is quite cheap. This node has a string output. So, like ChatGPT, you ask something and you get a text reply back. Let's drag a link and search for a node that displays text. Search for preview, and we have this Preview as Text node that we can add. Here we will get our reply from the ChatGPT model. Let's say I ask it to generate a prompt for a cute cartoon bunny, something 3D. Now when I try to run that, it asks me to sign in if I want to use the API. We could use this login button, or we can cancel and go to the menu, then settings. Here we have the user section in the settings, and again we have the sign-in option. Let's click sign in. If you have a ComfyUI account, you can use that, or you can simply log in with Google, which is a faster option for me. Then you select your Gmail address from the list and you will be signed in. Now you also have the option to log out. So now we are connected, but we need credits to run API nodes. Let's go to credits. Credits are like money. You basically use real money to buy credits that you can spend on a lot of models that are available through the API in ComfyUI. I have here some credits I bought a while back. I can click on purchase credits, and then it asks me how much I want to spend. For example, I have $10 here, but that might be too much for a beginner to spend on a first try. Let's click on minus to see if we can go lower. The minimum you can buy is 1,550 credits, using $5. Then you can click continue to payment. Depending on your country, you have different options to purchase. You can use Link, but you can also choose without Link if you do not have one set up. Here you have options to pay with a card, or you can use Google Pay if you want, and you also have the option to purchase as a business. Back in ComfyUI, I have enough credits to test a few nodes in today's tutorial. Now, when I run the workflow again, this node sends information to the servers, wherever those are located in the cloud, on OpenAI or somewhere else. Depending on the situation, sometimes it is faster, sometimes it is slower. From the workflow point of view, nothing special is happening. Nodes still connect left to right. Data still flows through cables. The only difference is where the computation happens. Local nodes use your GPU or CPU. API nodes use someone else's hardware. This has advantages and disadvantages.
Advantages: you can use very powerful models that you cannot run locally, you save local VRAM and system resources, and some APIs are faster for specific tasks. Disadvantages: you depend on an internet connection, there may be usage limits, it costs credits so it is not free, and you have less control over model internals. So we got the response from ChatGPT, and it gave us multiple prompts and suggestions instead of a single prompt. So let's refine what we asked and tell it to generate a single prompt. Maybe repeat it once again to reinforce that.
Let's run it again. This time we got a single prompt, just like I asked. Now we can copy the prompt and paste it into another workflow if we want. With this node selected, I will use Ctrl+C to copy the node. Then let's go to workflows and open a workflow like this 5A workflow that uses Z Image Turbo, which we know likes long prompts. I will move this node to the side, then Ctrl+V to paste that node. To connect this node, we just drag a link to the positive prompt. Now we have a mix of local models that take the prompt from an API node. We can also drag a preview here. Let me search for Preview as Text. Now we can see what prompt it gave us. I can rename it prompt, so I know this is the prompt. Let's run the workflow. You can see it generated a prompt for me. Then it continues to the next part of the workflow and generates the image. This can be quite useful. There are free models that can also do this, but we will talk about that in another episode. Let's change the prompt to be a ninja bunny, maybe in an action pose. Generate again. We get a new prompt describing that bunny, and the result is this image.
There are many API nodes and many options to connect them. Let's go back to the previous workflow, where it was just those two nodes. Now let's add a Concatenate node, a node that lets you combine two strings or prompts. Let me remove this prompt, since we want to get the prompt from the Concatenate node. I will use it for string B for now, and then connect this Concatenate node here. I will add a green color so it looks like a positive prompt node. For the first part, I write a cute cartoon bunny ninja. For the second part, I write something like: use the prompt to generate a single detailed prompt, be creative, adapt the prompt to match the prompt style and mood. You can use all kinds of ChatGPT formulas here to get exactly what you want. Let's drag a link to a new Preview as Text node so we can see the result of the Concatenate node. Maybe I name it prompt, but I might change that later. Still exploring what we can do. When we run it, you can see I forgot to add a separator, so it just fused the ninja prompt with "use the prompt to generate." In the end, it still understood and generated the prompt.
But let's fix that delimiter and add a comma and a space. Of course, you can split this into multiple nodes and make workflows more complex, one going into another workflow, and so on. This is the prompt that goes into ChatGPT, and this is the prompt that comes out of ChatGPT, the one we want to use in other workflows. Now we can run it again. You can see the input prompt to ChatGPT is this combined text. I like to use concatenate because I can easily change the first prompt without changing the formula below, so it is easier to edit. The result is this long prompt for the Ninja Bunny.
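What the Concatenate node computes is easy to sketch; this is a simplified stand-in, not the node's actual implementation:

```python
# Sketch of a string-concatenate node: join string A and string B with
# a delimiter. With an empty delimiter the two prompts fuse together,
# which is the mistake the ", " separator fixes.
def concatenate(a, b, delimiter=", "):
    return a + delimiter + b

subject = "a cute cartoon bunny ninja"
instruction = "use the prompt to generate a single detailed prompt"
print(concatenate(subject, instruction))
```

Keeping the subject in string A and the reusable formula in string B is exactly why editing only the first prompt is so quick.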
These two nodes are the same, so I only need one, and I remove the other. What we did here is split a workflow into multiple pieces so we can easily edit the prompt without worrying about the formula. I can quickly change the first prompt, run it again, and get a new prompt. It is quite easy to use, at the cost of a few cents or a fraction of a credit. I probably do not need that preview anymore, since I know how they are combined, so I will leave just one Concatenate node, the ChatGPT node, and the preview of the final prompt. Now that we have this, we can save it. Hold Ctrl and drag a selection over all the nodes. Right-click on the canvas, then use save selected as template. Give it a name, maybe ChatGPT prompt, so we know it generates prompts. Now we can paste it into any workflow. Let me open that 5A workflow again. Since it was already open, I will close it, because I do not want the extra nodes, then open it again fresh with default values. I move this to the side. Right-click on the canvas. Go to node templates. And now we have that template there. We can move it wherever we want. Remember that the ChatGPT prompt comes from this string output here, and we connect it to the positive prompt of any workflow we have. Now when I run the workflow, ChatGPT generates a prompt. That prompt is used in the Z Image workflow, and the result is this cute monk cat.
But the ChatGPT model is also a vision model, which means I can give it an image and it can see what is in that image. Let me remove this Concatenate node. Let's add a Load Image node. I upload this image of a helmet. Now we can connect this node to where it says images. It says images because you can add multiple images if you use a Batch Images node, but maybe we explore that in a future episode. For the prompt, let's say something like: give me a single prompt description for this image, descriptive prompt. There are more complex formulas, but I am just trying something on the spot. Now, let's run the workflow. ChatGPT looks at my image, and after a few seconds, it should give me a prompt based on that image. We got this nice long prompt, and the result is this one. It is not perfectly identical, but with a better formula, we can probably get something even closer. It is still pretty close to what we asked. You can also run the workflow multiple times to get different seeds. Let's see what else we can do.
Let's create a new blank workflow.
Double click on the canvas and search for Nano Banana. We have this first version of Nano Banana that is cheaper.
It is eight credits, so you can probably get something similar for free from Google Gemini. We also have Nano Banana Pro, the more powerful model that can do big images. This one costs 28 credits. If we change to 4K size, it will cost 51 credits. Depending on the model, some can cost over 100 credits, so be careful what nodes you use because you can run out of credits pretty fast. Both accept images, but Nano Banana Pro understands prompts and images better. You can see what model is used for the first Nano Banana. And for Nano Banana Pro, it is actually called Gemini 3 Pro Image.
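Before running paid nodes, it helps to budget those credit numbers. A tiny sketch in plain Python: the credit costs come from the examples in this episode, but the dollar conversion is my assumption for illustration only, not a published rate.

```python
# Rough budget check before running paid API nodes.
# Credit costs are the example values mentioned in this episode;
# USD_PER_CREDIT is an ASSUMED rate for illustration only,
# check the partner node pricing page for real prices.
COSTS = {
    "nano_banana": 8,
    "nano_banana_pro_2k": 28,
    "nano_banana_pro_4k": 51,
}
USD_PER_CREDIT = 0.01  # assumption, not an official rate

def runs_affordable(credits_available: int, node: str) -> int:
    """How many times you could run a node with the credits you have."""
    return credits_available // COSTS[node]

print(runs_affordable(300, "nano_banana_pro_2k"))  # 10
print(runs_affordable(300, "nano_banana_pro_4k"))  # 5
print(round(COSTS["nano_banana_pro_4k"] * USD_PER_CREDIT, 2))  # 0.51 under this assumption
```

A quick check like this saves you from accidentally burning through your credits on a workflow you only meant to test.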
Let's remove the first node and use this one to generate an image. I add a load image node so we can load an image from disk. Then connect the nodes. Now I upload an image, for example, this portrait of a man. Then I add the prompt. We did not talk yet about editing models like Flux 2, Qwen Edit, or Nano Banana Pro, but these can be used to edit or modify an image. I could say change the t-shirt, or replace the background or hair color. Let's try something simple like telling it to change what he wears to a steampunk suit. For the resolution for this test, I go with 2K since it uses about half the credits. We also have aspect ratio. Instead of auto, I set it to 9:16, but you should use whatever ratio you need. When I run it, I get a prompt failed message. Can you guess why? It says it has no output. That is because we did not save the image. So, let's drag a link and add a save image node. I also
see it has a string output. So, let's
add a text preview node to see what it outputs there.
Now, we can run the workflow again. This
one takes longer, over a minute, to generate. You can also check your profile here to see how many credits you have, or sign out, manage your subscription, and so on. You can also check partner node pricing. This opens the Comfy UI website where you can see how much it costs to use any of the models that are not free, the so-called partner nodes.
You have models for images, text, and also a few nodes for video. These usually need a lot of VRAM, and you can generate video even if you do not have that VRAM locally, but at a cost. Back in Comfy UI we got our generation. If we look at it, the result is quite good in 2K size and it is quite similar to the original man. So it is a good model, but expensive. For the text output we also got something like a peek into what the model was thinking, basically a prompt it used to generate that image based on the small prompt I gave it and the image. You can explore more API node workflows created by the Comfy UI team.
If you go to templates here, you have all kinds of workflows, but if you want to see the API ones, select partner nodes. Then you can filter them by model if you know what model you are looking for, or just explore random workflows. It does not cost you anything to open and check a workflow. It only costs credits when you run it. Let's say I like the preview of this workflow. I click on it and I get the workflow. Let's see what it uses. We have a load image node, so it expects an image from our computer. We have a Nano Banana prompt and it says color this image. So, if you upload a sketch, it will color that sketch using the Nano Banana Pro model. By default, it is set to 1K, but you can change the settings to fit your needs. Let's go
again to templates and check another workflow, maybe this one with the shoe. This one is more complex. It expects an image of a product like a shoe. Then it uses a ByteDance model, which is similar to Nano Banana but a cheaper version. Once the image is saved, it goes to different video models. These models cost around 103 credits each. It looks like it generates multiple videos from that shoe, depending on the prompt, and then combines all those videos into one final video. A workflow this big takes some time to run and can cost you maybe around 300 credits, so roughly a couple of dollars. I did not do the exact math, but you can spend credits very fast with video models. One important thing to understand is that API nodes do not make Comfy UI cloud-based. Comfy UI is still running locally. You are simply adding external steps into your pipeline. From a mental model point of view, treat API nodes exactly like normal nodes. The cables do not care where the data comes from. If the output type matches, it works. This also means you can mix local models, GGUF models, diffusion models, and API nodes all in the same workflow.
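The "cables only care about types" idea can be sketched in a few lines. The node and type names below are hypothetical, not the real ComfyUI code: the point is just that a link is valid whenever the output type matches the input type, no matter whether the producing node ran locally or called an external API.

```python
# Toy version of the type-matching rule that ComfyUI cables follow.
# (Hypothetical illustration, not the actual ComfyUI implementation.)
def can_connect(output_type: str, input_type: str) -> bool:
    """A cable is valid when the two port types match exactly."""
    return output_type == input_type

# A local sampler-style node and a paid API node can both emit "IMAGE",
# so either one can feed a save image node that expects "IMAGE":
print(can_connect("IMAGE", "IMAGE"))   # True
# A "STRING" output cannot plug into an "IMAGE" input:
print(can_connect("STRING", "IMAGE"))  # False
```

This is why mixing local, GGUF, and API nodes in one workflow just works: the graph only checks the types on the ports, not where the data was computed.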
Comfy UI is a complex application with a lot of nodes created by different people. And we combine all these things together like Lego pieces. At some point you will get an error, either because you forgot to connect a link, you connected the wrong nodes, or you used the wrong models. In this chapter I will try to explain what you can do when that happens, because it will happen. If we look at the workflows, we have this workflow with the number zero, the first one. This one is for help and resources. And I tried to gather here some information that might help you. Let's start with resources. The best way to learn Comfy UI is to watch tutorials and to practice. You have here a link to the Pixaroma YouTube channel, but there are many other YouTubers who do tutorials for Comfy UI. You can search on YouTube for different tutorials. Try to look for more recent ones, because if a tutorial is two years old, most things are probably different. Now, that is one of the reasons I made this new series. On the top of my YouTube header, I added a link to my Discord. Click on it, then click go to site, and you will get an invitation to join the Pixaroma Discord server. If for some reason it says invalid, try a different browser or the mobile application. You click accept invite and you will land in the welcome channel. I will show you more about how to navigate Discord in a minute. In this note, I also included a link to Discord and some useful info, like where the Pixaroma workflows are and so on. Let's click on this link and we get to the same invite. It goes to the same server and the same welcome channel. This is my server called Pixaroma, but there are many other servers for Comfy UI. For example, if you go to the comfy.org website and then to resources, they also have a Discord link. It is the same process. You accept the invite and you land on their welcome channel. On the left side, you have the servers you joined, like my Pixaroma server or the Comfy UI server. Let's go to Pixaroma and explore a bit more. On the left, we have different channel names so we stay organized. Each channel has its purpose.
There are also categories that contain multiple channels, which you can collapse or expand. For example, if I collapse this category, you might think you cannot find the Pixaroma workflows channel. But if we click on this arrow, we expand the category and now we can see the Pixaroma Workflows channel there. For every server you join, check the rules so you know what you are allowed to do, and so you do not break the rules and get banned. If your Discord account gets hacked and posts spam in your name, you might get banned as well. You can send me a message to remove the ban if you fixed your account and it is not hacked anymore. Here we also have a help channel where you can find what each channel is for and where to post. Some channels are public, some are private and only for members, and some are public but only moderators or admins can post, like news and updates. We also have a daily challenge for people who use AI. You can find more info in this channel and you can participate in the challenge in the daily challenge channel. When you see a number with a red circle, that means someone mentioned you or everyone on the server. For example, when you see that the news and updates channel has a notification, go check it, because I probably posted a new tutorial or shared an update. You can see this post used everyone to mention everyone on the server. In off topic, you can discuss things that do not fit in Comfy UI or other channels, but try to avoid spam and make sure it still respects the rules. The Comfy UI channel here usually has the most active chat. People talk about Comfy UI, so if you post here for quick help, and if members know the answer and have time, you might get help. If not, it might get ignored.
Another channel where you can post is the forum. There you usually post things for longer term discussion. You might post today and get replies in hours, days, or sometimes not at all if people do not know the answer. You can see that I can post in this channel because it lets me type. You can ask for help here and include screenshots and all the details. This is also the channel where EVO posts updates about the Easy Installer, which he continuously improves and adds more scripts to make things easier. Thanks to EVO for all the help. Make sure you check this area for updates related to the Easy Installer. The most visited channel is probably the Pixaroma workflows channel, where people come to get my workflows from tutorials. Here I have a list of older episodes from 2024 to 2025 and also this new series I am doing now. Starting with the first episode, you can see links that lead to specific episodes. For example, if I click on the first episode, I land on this page where I will also add a link to the YouTube video once it is ready. You will find all the chapters of that video plus links to Comfy UI and the workflows. You can download the workflows either as a zip archive that you extract or as individual JSON files. You can also comment on this forum post. Since I post this series as forum posts, you can comment if something does not work so we can try to fix it if possible. Keep the conversation limited to that specific episode. For the next episodes, comment on their respective posts. If it is not related, use the forum or the Comfy UI channel. For off-topic discussions, use the off-topic channel. You can also use Discord to navigate quickly. You can create links to different channels. For example, if I type the hash sign, I can select different channels like Pixaroma Workflows. Or I can type hash and then help, and you can see what happens. It adds a link to that channel. When I press enter, I get a clickable link to the help channel. If I click it, I land in the help channel. Let's go back to the off-topic channel. If you hover over a message, you have different options like edit or add reactions. Some servers also allow you to use emojis from other servers. For example, I can select this Pixaroma bunny emoji. To remove a message, hover again, click the three dots, and you have different options, including delete. If I type hash and then Pixaroma, and select Pixaroma workflows, press enter, then click it, I land in the Pixaroma workflows channel.
I keep getting messages that people cannot find the Pixaroma Workflows channel. So, I hope this tutorial helps you find it more easily. In channels where you cannot comment, you will see a message saying that posting is not allowed. Usually only admins or moderators can post there. Use the Comfy UI channel for discussion and help related to Comfy UI. Use off-topic if you cannot find the right channel. Let's go to the forum. Here you can find all kinds of forum posts. For example, we have this pinned forum post that you should read before you post anything. From here you can create a new post. You can close this if you want to see more of the forum. When you create a post, you can add a title, a message, and screenshots with your workflow that has problems. You can also add tags to your post, depending on what the post is about. Add a clear title and a descriptive message, not something vague. When you are done, you can post it using this button. You can also check other posts to see how they are written. For example, the workflows from the first episode are posted in a forum post.
Besides that, you have more channels for AI video and AI music, for ChatGPT and other AI topics, and a few more channels that I will let you explore in your free time. Keep discussions civilized and help when you can. I visit the Discord every day, but I cannot respond to all messages. Mention Pixaroma if something is important. In the top right, you also have an inbox. On the left side, if you click on the logo, you have direct messages where you can talk with your friends. If you click on unread, you can see notifications, including mentions, so you can quickly see when someone mentioned you or everyone on the server. You can jump directly to that message using the jump button. Always check mentions, especially when you see your username. You also have a search bar, which many people forget exists. Here you can type a model name or a few words that people might have used in discussions. For example, if I search for LTX2, I can see quite a few posts with that search query and I can jump to any of those discussions. You can also search posts from a specific user. For example, I can search for posts from Pixaroma. Make sure the username is Pixaroma and not something else, because some people try to mimic the name. Both the username and display name should be Pixaroma. Now you can see all the posts from Pixaroma. You also have more options and filters that you can use for different channels and searches. You can use these arrows to reply to a message or forward it to someone else.
Okay, enough with Discord. Let's go back to Comfy UI. I included here more resources for Comfy UI, like the official ones and also some unofficial ones like Reddit or Facebook groups that you can try. Let's open the Reddit group, for example. We have this Comfy UI Reddit group where you can see discussions, news, tutorials, and so on, and where you can post your questions. There is also one called Stable Diffusion, which includes discussions about Stable Diffusion, free models, and Comfy UI, but also other interfaces, not only Comfy UI. You can also search for a word on Reddit, like Comfy UI, and sort the results by communities. Then you can check which ones have more members. The two I use the most are these ones. Make sure you also check the other notes I added here, like definitions for beginners, what a model is, what a text encoder is, and so on. There's also more information about performance, common errors and fixes, model locations, custom nodes, and how to update Comfy UI. I also included a link to the Easy Installer in case you want to go back to it and find more info or check what is new in the releases. If you want, I also created an experimental custom ChatGPT that you can try, especially for this easy install Comfy UI version. Like any ChatGPT, it can hallucinate sometimes, but it is still better than a simple chat because it is more specialized for Comfy UI. For example, if I ask where the images are saved, it will think and also search the knowledge database where I added some files. Then it will answer. You can see the answer is pretty good. So I think it will help a lot of beginners. Sometimes, if you think it made a mistake, maybe because something is new and the model was trained months ago, you can ask, "Are you sure? Look online," and it will search the web. This way you can double check and improve your chances of getting a more accurate response. In this case, it knew that images are saved in the output folder. Let's ask something else, like: where are the Pixaroma workflows? Where can I find them? It will tell you they are on Discord and give you the channel name.
Let me try something else. Let's open workflow number one, the Juggernaut text to image workflow, and disconnect this node to cause an error. When I run it, I get this error. Now I take a screenshot of this error, go to that custom chat, paste the screenshot, and ask how to fix this error. You can see that in this case, since it was a simple error, it knew how to answer and told me to drag a wire from the load checkpoint VAE to the VAE decode node. This can save you a lot of time in many cases. So, I hope you find it useful. You can give it more screenshots and more info, even ones without the error, so it can understand the workflow better. Sometimes it asks you to post an error report on GitHub, and you can find here a report that gives more info about the error. You also have a find this issue option, which opens the issue pages on the Comfy UI GitHub page. This is the official Comfy UI GitHub page for the portable version, not the Easy Installer. Even though the Easy Installer installs the same version plus extra scripts, there is an issues tab where people post problems. You can search issues that are open, or include closed ones as well. You can also post a new issue if it is something new and you did not find any information about it.
Make sure it is an issue with a Comfy UI node, not a custom node. For custom nodes, you need to go to the custom node page instead. To fix this error, we just connect the VAE back to VAE decode. But let's say you have your VAE as a separate file for some workflows. So you use load VAE to load a VAE and connect that to VAE. Let's see what happens when I run the workflow. It gives this error, which usually means we used models with different architectures that are not meant to work together. The error is shown in VAE decode, but VAE decode is not really the problem. The problem is the input that goes into that node. In this case, the VAE loaded with load VAE was the issue. Let's go back to the help workflow. I want to remind you that when you ask for help, include screenshots of your workflow. Tell us what video card you have, how much VRAM and system RAM you have, and which operating system you are using. Also, explain what you already tried and what did not work. This helps the community assist you faster. Okay, one more chapter to go.
Are you ready? You have now reached the end of this course. At this point, you understand how Comfy UI works, how workflows are built, how models differ, and how to use tools like LoRA, ControlNet, and advanced diffusion models. But learning Comfy UI does not really end here. This is just the foundation. The most important thing to understand is that Comfy UI is not a fixed tool. It is constantly evolving. New models appear, new nodes are created, new workflows solve problems in better ways. So the best way to continue learning is by experimenting. Open workflows, break them, rebuild them, change one thing at a time, and see what happens. That is how real understanding happens. Another important habit is reading workflows, not just using them. When you download a workflow, do not just press run. Look at the nodes. Follow the connections. Ask yourself why something is there. If a workflow looks confusing, that usually means it is teaching you something new. Next, stay
connected to the community. Use Discord to ask questions, share results, and help others when you can. Very often, answering someone else's question will make you understand things better yourself. Follow model releases, but do not chase everything. You do not need every new model. Find a few that work well for your style and hardware and learn them deeply. As you get more comfortable, start building your own workflows from scratch, even simple ones, especially simple ones. That is how you move from copying to creating. Also, remember that AI tools change fast. What matters most is not memorizing settings, but understanding concepts: noise, conditioning, sampling, structure versus style. Those ideas will stay useful even when models change. Finally, do not rush. There is no finish line here. Learning Comfy UI is a process, not a goal. Take your time, have fun, and keep experimenting. This is just the beginning. So, what comes next? Obviously, I will continue this series and do episodes 2, 3, and so on. But I cannot make them as big as this first episode. They will be shorter videos focused on things we did not learn yet, like other models such as Qwen or Flux, video models, and so on. We still have a lot to cover, and every week or month we see new models and new nodes appearing. My plan for the new series is to show you these new models and workflows in a more easy to understand way so everything makes sense as much as possible. Some of these workflows and models are so new that nobody really knows much about them yet. I will try to post a new episode every week if my health allows it. If not, then at least one episode every 2 weeks. This new series will have bunnies on the thumbnails so you do not confuse it with the old series.
For the new series, as you saw, the workflows on Discord are posted in the forum. This makes it easier for me to see when you find a bug or when something does not work anymore, so I can try to fix it. That is why the old series, even though it still has good tutorials and you can still watch it, especially the last episodes, will not receive updated workflows. I will not go back and try to fix those old workflows. Instead, I will focus on the new series. When needed, I can revisit those older workflows, adapt them to new models, and present that in a new episode in the new series. I wanted you to have the basics in this long episode 1, this course that you probably cannot find somewhere else. I worked one month on this episode, and I wanted everyone to have access to it for free. I could have put it behind a paid course, but I feel better when I can help people. That being said, I do appreciate your support. There are many ways you can help me and this channel so I can create more videos. The easiest way is to press the like button, subscribe to the channel, and leave a comment, even if it is just a simple thank you. This shows activity to the YouTube algorithm and helps the video reach more people. So now, if someone asks where they can start learning Comfy UI, you can share the link to this course. I will also create a new playlist that will host all the new episodes from this series. For those who can afford to buy me a cup of tea, since I do not drink coffee, being a bunny, I already have too much energy, you can use the join button. Here you have four different options, from really cheap, like half a cup of tea per month, to more expensive, like a premium cup of tea. Depending on the option you choose, you get different perks. For example, Legends have a private channel on Discord where they get to know me better. If you do not want to help monthly, you can also help one time. On each video, you can find this heart with a dollar sign called Super Thanks. Super Thanks allows you to select an amount of money that you want to donate and send it. You can use this for videos that really helped you, like this course or any other episode where you learned something useful. Speaking about Legends and those who subscribed to the membership, I want to thank all of you who made this course possible with your support. Together, we can help other people learn new tools, understand this crazy AI world we live in, and maybe even make it a better place. Thank you all. You are the best. Have a great day, and I will see you on Discord.