I Built A Fully Local AI Agent with GPT-OSS, Ollama & n8n (GPT-4 performance for $0)
By The Recap | AI Automations
Summary
## Key takeaways

- **Local AI for free with GPT-OSS & Ollama**: GPT-OSS is a new open-source model from OpenAI that performs comparably to GPT-4-level models, runs entirely locally, and costs $0. It can be integrated with n8n for building AI agents without API keys or cloud dependencies. [00:05], [00:21]
- **Local n8n setup with Docker**: Docker is recommended for setting up n8n locally because it makes instances easy to spin up and tear down, with data persistence handled by volumes. [00:37], [01:25]
- **Ollama for local model management**: Ollama simplifies downloading, managing, and prompting large language models directly on your machine, avoiding complex scripts or code. [00:44], [05:27]
- **Connecting n8n to local Ollama**: To connect n8n (running in Docker) to Ollama, the base URL in the n8n credential must be set to `http://host.docker.internal:11434` (Ollama's default port) to bridge the Docker container and the local service. [09:54], [10:10]
- **Local AI agent performance**: AI agents powered by local models like GPT-OSS in n8n can execute tasks, such as writing a LinkedIn post comparing platforms, and even use tools, with performance dependent on local hardware. [12:14], [13:31]
- **Privacy-focused AI development**: Local models enable AI development in sensitive sectors like defense, healthcare, and legal, where data privacy concerns prevent the use of API-based LLMs. [14:14], [14:21]
Topics Covered
- Simplify local n8n setup using Docker Desktop.
- Install GPT-OSS locally and test its near-instant responses.
- Fix Docker-to-Ollama connectivity for seamless n8n workflows.
- Run private AI agents locally for sensitive industries.
Full Transcript
OpenAI just dropped their very first open-source model since GPT-2, and it's a game-changer. It runs 100% locally, costs $0, and performs on par with GPT-4-level models like o3 and o4-mini. In this video, I'll show you how to spin it up on your own machine, integrate it with n8n, and start using it inside fully automated AI agents. No API keys required and no cloud dependencies. Let's build.
All right, here is the game plan to get GPT-OSS up and running locally inside of n8n. The first step is getting our n8n instance installed and set up to run locally; we're going to use Docker for this to make everything easier. After that's done, the next step is to install a piece of software called Ollama. You can think of Ollama as a tool that makes it easy for us to download, manage, and prompt large language models on our own machine, laptop, or other device. It makes life a lot easier and helps us avoid dealing with extra scripts or other code we'd otherwise have to run. After that's done, we need to install the GPT-OSS model through Ollama so we're actually able to use it in our connected nodes. And finally, we can hook up the Ollama chat model inside n8n so we can start building, prompting, and shipping software against it. Let's dive in with setting up n8n locally.
So, I'm here on the n8n documentation page for getting up and running with n8n hosted locally using Docker. I do recommend hosting it with Docker because it's easy to spin up and tear down if something goes wrong, and I'm a big fan of Docker in general and use it in my day-to-day development. The prerequisite they mention is that you need Docker Desktop installed on your computer before you can get up and running. I already have this, but I'll briefly show you the setup steps. If you come up to the top here and click on Docker, you'll be able to download Docker Desktop for the operating system you're on. I'm on Apple Silicon, but if you have Windows or Linux, make sure you're grabbing the right one. Go through the setup wizard and install steps to make sure it's up and running. On Mac, I can see Docker Desktop is running by the green dot. On any operating system, you can also pull up the terminal and type docker in the command line; if everything was installed correctly, you should see the help output that shows everything you can do with it. If you see something like that, you're all set.
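For a quick sanity check from any terminal, these two standard Docker commands cover both cases:

```bash
# Confirm the Docker CLI is installed
docker --version

# Confirm the Docker daemon is actually running
# (this command errors out if Docker Desktop isn't started)
docker info
```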
The next step is scrolling down into the "Starting n8n" section. The first thing we have to do is open up our terminal and copy the first command, which creates a volume: docker volume create n8n_data. This creates something on our file system that saves our workflows, executions, debug data, and everything else n8n creates, even if we have to restart n8n in the middle of, say, a system reboot. If you don't have this volume and you end your Docker container, which is just an instance of n8n running, you're going to lose that data. So make sure you don't skip this step when you're getting set up. I'll copy this and run it in my command line, and we can see that n8n_data was created. Then, if I run docker volume ls, it lists all of the volumes on our machine, and we can see the n8n volume created, which matches exactly what we need.
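For reference, the two commands from this step (n8n_data is the volume name used in the n8n docs):

```bash
# Create a named volume so workflows, credentials, and execution
# data survive container restarts
docker volume create n8n_data

# List all volumes to confirm it was created
docker volume ls
```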
Let's move on to the next step, which is setting up and running the n8n instance itself. We use docker run with a couple of arguments that spin up our n8n Docker container, which is the n8n server on our own machine, and bind all of the data n8n creates to the volume we made earlier. Make sure you're not skipping anything here.
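At the time of writing, the run command in the n8n docs looks roughly like this:

```bash
# Start n8n, exposing the editor on port 5678 and mounting
# the n8n_data volume so data persists across restarts
docker run -it --rm \
  --name n8n \
  -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  docker.n8n.io/n8nio/n8n
```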
Once again, I'll copy this and paste it in. If everything worked correctly, we'll see a message that says the editor is now accessible via a localhost URL. Let's grab that and see if we can pull up our n8n local instance. And perfect, everything is set up. If you're seeing this for the first time, you'll need to make your local account, and then you should be able to skip through the onboarding to get into the n8n dashboard, just like on the cloud version. That completes the initial local n8n setup. It's pretty easy to get up and running with Docker, and that's the path I suggest you take when building locally with n8n. All right, that finishes up step number one, setting up our n8n instance locally. It's time to move to step number two, which involves setting up Ollama. I'm going to come over to ollama.com.
Click Download on the homepage and pick the operating system you need. I'm on Mac, so I'll download that, which gives me a DMG file I can open and then drag into my Applications folder. Since I already have it installed, I'll stop here, but on your side, continue with that process or follow the install steps for your specific operating system. That's all we have to do for now; a very easy setup step. If you remember from the overview, Ollama is the piece of software we'll use to download, manage, and chat with large language models on our own computer. So, we've just completed setting up Ollama on our laptop.
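If you want to confirm the install from the terminal before moving on, the CLI ships with a version command:

```bash
# Print the installed Ollama version to confirm the CLI is on your PATH
ollama --version
```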
It's time to move to step number three, which is installing the GPT-OSS model inside of Ollama so we can chat against it and power our AI applications inside n8n. Let's go over to this tab I've set up: it's on ollama.com, under Library and then gpt-oss. If you want to find it, you can just search for GPT-OSS, and that gives you the setup steps to get up and running. We want to run the latest version of GPT-OSS, so I'm going to copy the model name, and then inside our terminal we need to run ollama run gpt-oss. Let's do that now: we'll type ollama run, paste in the gpt-oss:latest model version we saw in the table, and hit enter. That starts pulling in the open model weights for GPT-OSS, which we'll be able to run right on our laptop instead of calling into OpenAI's API. Everything runs completely free and locally on our computer.
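For reference, the whole pull-and-chat step is one command:

```bash
# Download the gpt-oss weights (on first run) and then
# drop into an interactive chat session with the model
ollama run gpt-oss:latest
```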
I'm going to let this run for a second, and I'll be back as soon as it finishes. All right, we're back and we can see the download has completed to 100%. If you see the "writing manifest" and success messages at the bottom, you're good to go with the latest version of GPT-OSS installed on your computer. Let's see if we can chat with it here in the terminal. Let's say: write me a poem about n8n. Once again, this is all running locally, and I'm on what I think is an M2 MacBook, so not even the latest MacBook Pro, and we're getting a pretty instant response back from this model. That looks pretty cool. Let's exit out of this, because we're going to need to run it a little differently when it comes time to connect it to n8n. So, we have our GPT-OSS model installed locally through Ollama. It's time to move on to the final step, which is hooking up an Ollama chat model inside of n8n that connects to and talks with the Ollama model we just installed.
The first thing we have to do before diving into n8n is run one more simple command in our terminal. We still have Docker open, and in a separate terminal we need to spin up an Ollama server that hosts the chat model locally on our machine, so n8n can reach out and talk to it. We'll run one more command, ollama serve, which spins up the Ollama software we already downloaded. If you see some local IP addresses appear along with a message that says it's listening on 127.0.0.1, everything is working as expected. Then we just leave this terminal open and don't touch it for now.
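That command on its own; by default the server listens on 127.0.0.1:11434:

```bash
# Start the Ollama HTTP server in the foreground;
# leave this terminal open while n8n talks to it
ollama serve
```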
Let's go back to n8n and start building a brand new workflow that connects to our Ollama chat model. We'll make it a manual trigger, then hook up a Basic LLM Chain node that lets us use Ollama to make a single chat call. We'll pull that in and add the Ollama Chat Model from the right-hand side, which is where we set up the credential that allows us to connect from n8n to the server we just spun up.
To create the credential, we click "Create new credential" and see that the base URL defaults to a localhost value. I could click save, but that would give us an error right off the bat because we're using Docker. There's one other step to follow to get this connected and working to power our automations. I found it in the self-hosted AI starter kit on GitHub, which sets up something very similar with self-hosted n8n and Ollama. If you search for "localhost" there, you'll see that in order to connect from the Docker container hosting n8n to the service running on our computer (Ollama hosting the chat model), we need to change the base URL to http://host.docker.internal:11434, where 11434 is Ollama's default port. Make sure you copy that exact value. When we paste it into the base URL of the Ollama credential we're setting up and click save, the connection succeeds. That tells us n8n is now connected to Ollama, and we're going to be able to chat together.
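If the connection fails on your machine, here's a minimal sketch to test the bridge yourself, assuming the container is named n8n as in the run command above:

```bash
# From inside the n8n container, hit Ollama on the host;
# host.docker.internal resolves to the host machine under Docker Desktop
docker exec -it n8n sh -c "wget -qO- http://host.docker.internal:11434"
# A healthy setup prints: "Ollama is running"
```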
Let's exit out and get started with the rest of this build. We have the connection made and our Basic LLM Chain set up. Let's add a quick Set field for the result and see this automation in action. But before we do that, let's add a prompt and actually tell our LLM to do something: let's have it write a LinkedIn post comparing n8n versus make.com. So, let's execute... and we get "model llama 3.2 not found". It looks like I missed one setup step. Let's click into the node, pull up our gpt-oss:latest model, which is the GPT-OSS we just downloaded, and give it one more shot.
Now we can see this running, which is good. If I pull up my terminal again and go to the Ollama tab, we can see it's actually processing: our n8n instance has connected to the Ollama server and is interacting with GPT-OSS. So this is spinning and working locally. Compared to a typical OpenAI API call, this will take a little longer, and that's something to keep in mind when you're building. It depends on the capabilities of your own laptop or desktop; if you have a GPU, that's going to let it go much faster, so just be aware of that. On our text field, let's grab the text result so we can see what this looks like. And there we go: that's the simple setup for getting an LLM chain call working locally using GPT-OSS.
Let's go a step further and see if this works with the agent nodes inside of n8n. I'll add a new chat trigger, drag that in, and add our AI Agent node to see if we can get it connected and running. I'll drag in my chat model, which looks like it's going to work, add some Simple Memory, and see if we can get a tool working here, too. Let's use just the Think tool for now and see if it's able to call into tools effectively. We'll add another Set field and try to get something running just by opening up the chat: "Think deeply about the differences in n8n and make.com and write me a LinkedIn post breaking down the pros and cons of each." We'll send that in and watch our AI Agent node run, calling into the chat model. If I pull up the terminal for our Ollama server, we can see it's running once again.
Let's see if we can get the Think tool called here. It does look like the memory was used as expected, so let's wait a bit for the result. And there it is: we can see the Think tool was executed in the log of this AI agent's loop. If we pull up the result and drag in the output, we can see what we got: an n8n-versus-Make comparison in a markdown-formatted LinkedIn post with some details. This looks great. Once again, pretty incredible for a local AI model, and honestly it wasn't that slow compared to some of the other models I've tried in the past. I think this is going to open up a lot for local development and really allow AI developers, builders, and agencies to start building, bundling, and selling products to industries where it wasn't always possible before, like defense, healthcare, and legal, where privacy is a huge concern and API-based large language models couldn't be used. So, very excited about this for the future. Hope you found something helpful here. Definitely let me know what you plan on building with GPT-OSS; I'm excited to hear about it.
>> All right, before you go, make sure to hit the like and subscribe buttons. Seriously, do it, because we're going to be breaking down all of the AI automations we use to run our businesses. They're super helpful, and we'll break down exactly how we do them on this YouTube channel. So like and subscribe, and you'll get notified when we publish new workflows that can make your businesses run 20 times more efficiently, just like we're seeing here at The Recap and the other businesses we're running. The other item: join our Skool community for free; the link's in the description. You'll be able to get this template, the automation we just ran through in this video, completely for free. Go to Skool, navigate to the video you want, and download the JSON for the n8n automation. So like and subscribe, and join our free community to get this automation for yourself.