Route and control all your AI traffic on ngrok.ai | early access
By ngrok
Summary
Topics Covered
- One Gateway Fixes Fragmented LLM Infrastructure
- Provider Switching Without App Changes
- Failover Lives at the Gateway, Not Your Code
- Gateway Failover in Action
Full Transcript
Building with LLMs usually means juggling providers, SDKs, and API keys, with no org-wide visibility into your own requests and no failover when something inevitably goes wrong.
ngrok AI fixes all of that with one AI gateway for all your LLM traffic, and it's in early access right now.
I've got a really simple app here that sends a request to OpenAI for a story about a unicorn vanishing before dawn's first light. Lovely ending.
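The app's code isn't shown on screen, but a minimal version of it, using the official openai Python SDK, might look something like this (the model name and prompt are stand-ins, not taken from the video):

```python
from openai import OpenAI

# Reads OPENAI_API_KEY from the environment by default.
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # stand-in model name
    messages=[
        {"role": "user", "content": "Tell me a story about a unicorn vanishing before dawn's first light."}
    ],
)
print(response.choices[0].message.content)
```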
But I want to use ngrok's AI gateway to route my traffic, so I'm going to go over to my dash. I'm already in the AI gateway section. I'll create a new AI gateway, and I'm going to use my free dev domain that comes with every account.
As you can see here, I have a public URL for my AI gateway, and all I have to do in my app is drop that in as a base URL, save the file, and send another request.
And the great thing is that that's all I have to do. For the sake of my app, it doesn't know that it's going through an AI gateway. It still thinks it's basically just talking directly to OpenAI's API. And that's great, because now I can do things like change providers on a whim, or add failover for when things inevitably break.
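In code, that base URL swap is the only change. A sketch, assuming a placeholder gateway domain like my-gateway.ngrok.dev (yours will be your own dev domain, and any required path suffix depends on the gateway's docs):

```python
from openai import OpenAI

# Same app, one change: point the SDK at the AI gateway
# instead of api.openai.com.
client = OpenAI(base_url="https://my-gateway.ngrok.dev")  # placeholder domain

response = client.chat.completions.create(
    model="gpt-4o-mini",  # stand-in model name
    messages=[{"role": "user", "content": "Another unicorn story, please."}],
)
print(response.choices[0].message.content)
```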
The OpenAI SDK only takes a single API key, so if I want to add failover for things like the provider going down entirely, hitting a rate limit, or running out of tokens on a single API key, I need to do that at my gateway.
And so what I've done first is move my API keys for both OpenAI and Anthropic into ngrok vaults, which are a secure place to store your secrets and then reference them in your policies later.
I've also added one for my IP, which I'll talk about in a moment here.
So I can go back to my AI gateway, and into the traffic policy, which is where I configure how it behaves. And I will just drop in a policy here, and then we'll talk through it a little bit.
My AI gateway here, and yours as well, is a public URL, which means it's publicly accessible to the whole internet.
And especially once you move your API keys to the gateway, you really don't want people making requests on your behalf. So what I've done here is restrict access to my AI gateway to just my IP address. There are other ways you might want to do this.
For example, you might want to create your own API keys that you use to authenticate to the gateway. But this works for me.
The second piece here to note is that I've added these providers and referenced my keys that are in vaults, so that my gateway will first try OpenAI. And then, if that fails, it will fall back to Anthropic.
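To make that concrete, here is a rough sketch of the policy's shape, written as a Python dict purely for illustration. The restrict-ips action is a real ngrok Traffic Policy action, but the provider and vault-reference fields below are assumptions about the early-access AI gateway schema, not ngrok's documented API; check the docs for the real syntax.

```python
# Illustrative only: the rough shape of a gateway traffic policy as a
# Python dict. "restrict-ips" is a real ngrok Traffic Policy action;
# everything under "providers" is an ASSUMED shape, not documented API.
policy = {
    "on_http_request": [
        {
            "actions": [
                {
                    # Only requests from the listed CIDRs reach the gateway.
                    "type": "restrict-ips",
                    "config": {"allow": ["203.0.113.7/32"]},  # example IP
                }
            ]
        }
    ],
    # Hypothetical provider list: try OpenAI first, then fall back to
    # Anthropic, with API keys referenced from ngrok vaults (syntax assumed).
    "providers": [
        {"name": "openai", "api_key": "<vault ref: openai-key>"},
        {"name": "anthropic", "api_key": "<vault ref: anthropic-key>"},
    ],
}
```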
Now, I only have to change one other thing in my app here, which is to add a models array. And this allows me to specify exactly which models I want to fall back to in case the primary one doesn't work.
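The video doesn't show how that array gets attached to the request, but with the OpenAI Python SDK one plausible way is extra_body, which forwards fields the SDK doesn't model natively. The field name "models" comes from the transcript; the model IDs below are examples, not the ones used in the video:

```python
from openai import OpenAI

client = OpenAI(base_url="https://my-gateway.ngrok.dev")  # placeholder domain

response = client.chat.completions.create(
    model="gpt-4o-mini",  # primary model (stand-in)
    messages=[{"role": "user", "content": "One more unicorn story."}],
    # The gateway reads this fallback list: try another GPT first,
    # then a Claude Sonnet model. Both IDs are examples.
    extra_body={"models": ["gpt-4o", "claude-sonnet-4-20250514"]},
)
print(response.choices[0].message.content)
```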
I've decided that I want to try a different GPT, then go over to Anthropic and Claude Sonnet as a third option. So let's try this again.
And you can see that Claude Sonnet has stepped in where OpenAI, for whatever reason, seems to be failing. And even better, I can go back to the dashboard, over to Traffic Inspector, and see all the details about my requests and the responses from different LLM providers.
Okay, in very short order, those are the foundations of ngrok AI: one gateway for all your LLM traffic. We already have docs available, and I'll link those in the description below if you want to check out other things, like model selection strategies or how to route to local models you might have with Ollama or another local LLM runtime.
And if you want to try out ngrok AI right now, go to ngrok.ai, enter your email address, and we'll get you in right away.