I Built A Fully Local AI Agent with GPT-OSS, Ollama & n8n (GPT-4 performance for $0)
By The Recap | AI Automations
Summary
## Key takeaways

- **Local AI for free with GPT-OSS & Ollama**: GPT-OSS is a new open-source model from OpenAI that performs comparably to GPT-4-level models, runs entirely locally, and costs $0. It can be integrated with n8n for building AI agents without API keys or cloud dependencies. [00:05], [00:21]
- **Local n8n setup with Docker**: Docker is recommended for setting up n8n locally because it makes instances easy to spin up and tear down, with data persistence handled by volumes. [00:37], [01:25]
- **Ollama for local model management**: Ollama simplifies downloading, managing, and prompting large language models directly on your machine, avoiding complex scripts or code. [00:44], [05:27]
- **Connecting n8n to local Ollama**: To connect n8n (running in Docker) to Ollama, the base URL in the n8n credential must be set to `http://host.docker.internal:11434` (Ollama's default port) to bridge the Docker container and the local service. [09:54], [10:10]
- **Local AI agent performance**: AI agents powered by local models like GPT-OSS in n8n can execute tasks, such as writing a LinkedIn post comparing platforms, and even use tools, with performance dependent on local hardware. [12:14], [13:31]
- **Privacy-focused AI development**: Local models enable AI development in sensitive sectors like defense, healthcare, and legal, where data privacy concerns prevent the use of API-based LLMs. [14:14], [14:21]
Topics Covered
- Simplify local n8n setup using Docker Desktop.
- Install GPT-OSS locally and test its near-instant responses.
- Fix Docker-to-Ollama connectivity for seamless n8n workflows.
- Run private AI agents locally for sensitive industries.
Full Transcript
OpenAI just dropped their very first open-source model since GPT-2, and it's a game-changer. It runs 100% locally, costs $0, and performs on par with GPT-4-level models like o3 and o4-mini. In this video, I'll show you how to spin it up on your own machine, integrate it with n8n, and start using it inside fully automated AI agents. No API keys required and no cloud dependencies. Let's build.
All right, here is the game plan to get GPT-OSS up and running locally inside of n8n. The first step is getting our n8n instance installed and set up to run locally; we're going to use Docker for this to make everything easier. After that's done, the next step is to install a piece of software called Ollama. You can think of Ollama as a tool that makes it easy for us to download, manage, and prompt large language models on our own machine, laptop, or other device. It makes life a lot easier and helps us avoid dealing with extra scripts or other code we'd otherwise have to run. After that's done, we need to install the GPT-OSS model through Ollama so we're actually able to use it in our connected nodes. And finally, we can hook up the Ollama chat model inside n8n so we can start building, prompting, and shipping software against it. Let's dive in with setting up n8n locally.
So, I'm here on the n8n documentation page for getting up and running with n8n hosted locally using Docker. I do recommend hosting it with Docker because it's easy to spin up and tear down if something goes wrong, and I'm a big fan of Docker in general and use it in my day-to-day development. The prerequisite they mention is that you need Docker Desktop installed on your computer before you can get up and running. I already have this, but I'll briefly show you the setup steps. If you come up to the top here and click on Docker, you'll be able to download Docker Desktop for the operating system you're on. I'm on Apple Silicon, but if you have Windows or Linux, make sure you're grabbing the right one. Go through the setup wizard and install steps to make sure it's up and running. On Mac, I can see Docker Desktop is running by the green dot. On any operating system, you can also pull up the terminal and type docker in the command line; if everything was installed correctly, you should see the help output that shows everything you can do with it. If you see something like that, you're all set.
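For a quick sanity check from any terminal, these two standard Docker commands cover both cases:

```bash
# Confirm the Docker CLI is installed
docker --version

# Confirm the Docker daemon is actually running
# (this command errors out if Docker Desktop isn't started)
docker info
```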
The next step is scrolling down into the "Starting n8n" section. The first thing we have to do is open up our terminal and copy the first command, which creates a volume: docker volume create n8n_data. This creates something on our file system that saves our workflows, executions, debug data, and everything else n8n creates, even if we have to restart n8n in the middle of, say, a system reboot. If you don't have this volume and you end your Docker container, which is just an instance of n8n running, you're going to lose that data. So make sure you don't skip this step when you're getting set up. I'll copy this and run it in my command line, and we can see that n8n_data was created. Then, if I run docker volume ls, it lists all of the volumes on our machine, and we can see the n8n volume created, which matches exactly what we need.
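For reference, the two commands from this step (n8n_data is the volume name used in the n8n docs):

```bash
# Create a named volume so workflows, credentials, and execution
# data survive container restarts
docker volume create n8n_data

# List all volumes to confirm it was created
docker volume ls
```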
Let's move on to the next step, which is setting up and running the n8n instance itself. We use docker run with a couple of arguments that spin up our n8n Docker container, which is the n8n server on our own machine, and bind all of the data n8n creates to the volume we made earlier. Make sure you're not skipping anything here.
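At the time of writing, the run command in the n8n docs looks roughly like this:

```bash
# Start n8n, exposing the editor on port 5678 and mounting
# the n8n_data volume so data persists across restarts
docker run -it --rm \
  --name n8n \
  -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  docker.n8n.io/n8nio/n8n
```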
Once again, I'll copy this and paste it in. If everything worked correctly, we'll see a message that says the editor is now accessible via a localhost URL. Let's grab that and see if we can pull up our n8n local instance. And perfect, everything is set up. If you're seeing this for the first time, you'll need to make your local account, and then you should be able to skip through the onboarding to get into the n8n dashboard, just like on the cloud version. That completes the initial local n8n setup. It's pretty easy to get up and running with Docker, and that's the path I suggest you take when building locally with n8n. All right, that finishes up step number one, setting up our n8n instance locally. It's time to move to step number two, which involves setting up Ollama. I'm going to come over to ollama.com.
Click Download on the homepage and pick the operating system you need. I'm on Mac, so I'll download that, which gives me a DMG file I can open and then drag into my Applications folder. Since I already have it installed, I'll stop here, but on your side, continue with that process or follow the install steps for your specific operating system. That's all we have to do for now; a very easy setup step. If you remember from the overview, Ollama is the piece of software we'll use to download, manage, and chat with large language models on our own computer. So, we've just completed setting up Ollama on our laptop.
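If you want to confirm the install from the terminal before moving on, the CLI ships with a version command:

```bash
# Print the installed Ollama version to confirm the CLI is on your PATH
ollama --version
```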
It's time to move to step number three, which is installing the GPT-OSS model inside of Ollama so we can chat against it and power our AI applications inside n8n. Let's go over to this tab I've set up: it's on ollama.com, under Library and then gpt-oss. If you want to find it, you can just search for GPT-OSS, and that gives you the setup steps to get up and running. We want to run the latest version of GPT-OSS, so I'm going to copy the model name, and then inside our terminal we need to run ollama run gpt-oss. Let's do that now: we'll type ollama run, paste in the gpt-oss:latest model version we saw in the table, and hit enter. That starts pulling in the open model weights for GPT-OSS, which we'll be able to run right on our laptop instead of calling into OpenAI's API. Everything runs completely free and locally on our computer.
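For reference, the whole pull-and-chat step is one command:

```bash
# Download the gpt-oss weights (on first run) and then
# drop into an interactive chat session with the model
ollama run gpt-oss:latest
```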
I'm going to let this run for a second, and I'll be back as soon as it finishes. All right, we're back and we can see the download has completed to 100%. If you see the "writing manifest" and success messages at the bottom, you're good to go with the latest version of GPT-OSS installed on your computer. Let's see if we can chat with it here in the terminal. Let's say: write me a poem about n8n. Once again, this is all running locally, and I'm on what I think is an M2 MacBook, so not even the latest MacBook Pro, and we're getting a pretty instant response back from this model. That looks pretty cool. Let's exit out of this, because we're going to need to run it a little differently when it comes time to connect it to n8n. So, we have our GPT-OSS model installed locally through Ollama. It's time to move on to the final step, which is hooking up an Ollama chat model inside of n8n that connects to and talks with the Ollama model we just installed.
The first thing we have to do before diving into n8n is run one more simple command in our terminal. We still have Docker open, and in a separate terminal we need to spin up an Ollama server that hosts the chat model locally on our machine, so n8n can reach out and talk to it. We'll run one more command, ollama serve, which spins up the Ollama software we already downloaded. If you see some local IP addresses appear along with a message that says it's listening on 127.0.0.1, everything is working as expected. Then we just leave this terminal open and don't touch it for now.
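That command on its own; by default the server listens on 127.0.0.1:11434:

```bash
# Start the Ollama HTTP server in the foreground;
# leave this terminal open while n8n talks to it
ollama serve
```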
Let's go back to n8n and start building a brand new workflow that connects to our Ollama chat model. We'll make it a manual trigger, then hook up a Basic LLM Chain node that lets us use Ollama to make a single chat call. We'll pull that in and add the Ollama Chat Model from the right-hand side, which is where we set up the credential that allows us to connect from n8n to the server we just spun up.
To create the credential, we click "Create new credential" and see that the base URL defaults to a localhost value. I could click save, but that would give us an error right off the bat because we're using Docker. There's one other step to follow to get this connected and working to power our automations. I found it in the self-hosted AI starter kit on GitHub, which sets up something very similar with self-hosted n8n and Ollama. If you search for "localhost" there, you'll see that in order to connect from the Docker container hosting n8n to the service running on our computer (Ollama hosting the chat model), we need to change the base URL to http://host.docker.internal:11434, where 11434 is Ollama's default port. Make sure you copy that exact value. When we paste it into the base URL of the Ollama credential we're setting up and click save, the connection succeeds. That tells us n8n is now connected to Ollama, and we're going to be able to chat together.
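If the connection fails on your machine, here's a minimal sketch to test the bridge yourself, assuming the container is named n8n as in the run command above:

```bash
# From inside the n8n container, hit Ollama on the host;
# host.docker.internal resolves to the host machine under Docker Desktop
docker exec -it n8n sh -c "wget -qO- http://host.docker.internal:11434"
# A healthy setup prints: "Ollama is running"
```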
Let's exit out and get started with the rest of this build. We have the connection made and our Basic LLM Chain set up. Let's add a quick Set field for the result and see this automation in action. But before we do that, let's add a prompt and actually tell our LLM to do something: let's have it write a LinkedIn post comparing n8n versus make.com. So, let's execute... and we get "model llama 3.2 not found". It looks like I missed one setup step. Let's click into the node, pull up our gpt-oss:latest model, which is the GPT-OSS we just downloaded, and give it one more shot.
Now we can see this running, which is good. If I pull up my terminal again and go to the Ollama tab, we can see it's actually processing: our n8n instance has connected to the Ollama server and is interacting with GPT-OSS. So this is spinning and working locally. Compared to a typical OpenAI API call, this will take a little longer, and that's something to keep in mind when you're building. It depends on the capabilities of your own laptop or desktop; if you have a GPU, that's going to let it go much faster, so just be aware of that. On our text field, let's grab the text result so we can see what this looks like. And there we go: that's the simple setup for getting an LLM chain call working locally using GPT-OSS.
Let's go a step further and see if this works with the agent nodes inside of n8n. I'll add a new chat trigger, drag that in, and add our AI Agent node to see if we can get it connected and running. I'll drag in my chat model, which looks like it's going to work, add some Simple Memory, and see if we can get a tool working here, too. Let's use just the Think tool for now and see if it's able to call into tools effectively. We'll add another Set field and try to get something running just by opening up the chat: "Think deeply about the differences in n8n and make.com and write me a LinkedIn post breaking down the pros and cons of each." We'll send that in and watch our AI Agent node run, calling into the chat model. If I pull up the terminal for our Ollama server, we can see it's running once again.
Let's see if we can get the Think tool called here. It does look like the memory was used as expected, so let's wait a bit for the result. And there it is: we can see the Think tool was executed in the log of this AI agent's loop. If we pull up the result and drag in the output, we can see what we got: an n8n-versus-Make comparison in a markdown-formatted LinkedIn post with some details. This looks great. Once again, pretty incredible for a local AI model, and honestly it wasn't that slow compared to some of the other models I've tried in the past. I think this is going to open up a lot for local development and really allow AI developers, builders, and agencies to start building, bundling, and selling products to industries where it wasn't always possible before, like defense, healthcare, and legal, where privacy is a huge concern and API-based large language models couldn't be used. So, very excited about this for the future. Hope you found something helpful here. Definitely let me know what you plan on building with GPT-OSS; I'm excited to hear about it.
>> All right, before you go, make sure to hit the like and subscribe buttons. Seriously, do it, because we're going to be breaking down all of the AI automations we use to run our businesses. They're super helpful, and we'll break down exactly how we do them on this YouTube channel. So like and subscribe, and you'll get notified when we publish new workflows that can make your businesses run 20 times more efficiently, just like we're seeing here at The Recap and the other businesses we're running. The other item: join our Skool community for free; the link's in the description. You'll be able to get this template, the automation we just ran through in this video, completely for free. Go to Skool, navigate to the video you want, and download the JSON for the n8n automation. So like and subscribe, and join our free community to get this automation for yourself.