AWS re:Invent 2023 - Amazon VPC Lattice architecture patterns and best practices (NET326)
By AWS Events
Summary
Topics Covered
- Admins Empower Developers Safely
- Services Abstract Compute Platforms
- Service Networks Enable Scoped Connectivity
- Lattice Handles IP Overlaps Natively
- VPC Lattice Replaces Sidecar Meshes
Full Transcript
- Hey everybody and welcome to NET326.
This is Amazon VPC Lattice Architecture Patterns and Best Practices.
My name is Justin Davies and I am a Product Manager here at Amazon focusing on all things application layer networking.
It's in the EC2 Networking organization.
So let me get to know the audience a little bit here, just before we kinda get started.
I like to do this every time because as you'll see as we kind of go through this presentation, it has a lot to do with roles and meeting customers where they're at.
So maybe just with a show of hands, who in this room considers themselves kind of more of an admin?
This could be security admins, network admins, cloud admins.
Okay, so a good portion of you.
All right, what about...
So to the people watching the video later, a good portion.
What about developers?
This doesn't have to be the person writing the code, but someone owning a service perhaps, configuring that service, and I would say it's probably about 50/50.
So that is very good to see.
What about both?
Who is kind of wearing all the hats?
That's good too, so a good portion of you.
Alright, well this is exactly why we wanted to build VPC Lattice because while a smaller company or even a larger company might have people that wear multiple hats, there's a lot of people that might be covering the admin and the developer roles.
Typically what we find is that you usually fall into one or the other, and sometimes there are some conflicting priorities that can kind of get in the way.
So, these are the kind of things I like to say.
I wanna empower my developers.
This is what the admin says, but doing it safely, right?
A lot of the time I've heard the term "the fun police," that's what a lot of developers call the admins.
But it's not true, right?
It's like they don't want to end up on the news.
Their whole charter is to make sure that you can run and operate and move quickly but not end up on Hacker News, right?
Now the developers, even if you are this full stack engineer, they may not be the CCIE or the network expert, and in all reality it's probably their least favorite thing to do.
They'd rather not be focusing on traceroute to try to figure out and troubleshoot their problems. So how do we make this easier for folks, not just with configuration but also with troubleshooting?
That's kind of where a lot of this stuff came from after talking to customers and seeing that this persona problem exists.
So earlier this year, actually last year we announced it as a preview, but earlier this year we took a new product called Amazon VPC Lattice to GA.
That was back in March.
And the whole purpose if you don't know, we will spend a little bit of time going over it, the whole point is to simplify application layer networking and to bridge the gap between admins and developers, right?
How do we make this a little bit easier?
How do we make them become friends again?
(Justin laughing) By the end of this, I want all of you people talking with each other.
All right, so with VPC Lattice, the whole mission was to make developers be able to build their applications faster, simplify networking and connectivity so that they weren't dealing with the underlying network connectivity pieces.
We wanted VPCs to not really be a thing, right?
We wanted applications to be able to live wherever made sense to them, different accounts, different VPCs, and the developer really shouldn't have to think about any of this kind of stuff, right?
From a security perspective, there's a big push to get zero trust architecture patterns and to adopt kind of application layer security, not just network layer security.
We wanted to make it really easy for customers to get the strongest security possible with the application of authentication and authorization.
And then from a visibility perspective, we wanted to make sure that we not only gave them the network level logs and metrics that they were accustomed to, but also give them a whole bunch of new metrics and logs, which we'll give you an example of in just a tiny bit here.
On the admin side of the house, it's all about guardrails, right?
How do I empower the developer to go super fast but still have those tools and controls to be able to audit and enforce my organization's security posture, right?
How do I do that without getting in the way?
And so this is where it covers the same three tasks, right?
Connectivity, security, visibility, but a little bit different, right?
So for them the big focus is on, how do we simplify one of the most common problems?
Network connectivity between VPCs and accounts.
On top of that, what if all of these accounts have literally the same IP address or what if one was IPv6 and the other one was IPv4?
How do we solve all of that kind of stuff all at once?
So VPC Lattice is really kind of solving that for the network admins.
On the security perspective, it is giving an interface.
A centralized interface to kind of enforce those security postures.
And then from the visibility perspective to be able to see if that security posture is actually doing what you're thinking it's doing, as well as if your developer calls you at two in the morning and says it's a network problem, you have the same visibility that they do to actually determine what is this IP address?
'Cause in all reality, like what is an IP address?
It's not an IP address, it's probably a service behind that.
So it gives you that kind of bridge to understand that.
Okay, so what are we gonna talk about today?
We've got an overview of VPC Lattice for those of you living under a rock.
(Justin laughing) We'll talk about kind of what it is, some of the key core components.
Then we'll also talk about some new features and capabilities that we've launched since GA, and then I'm also gonna cover some of the common questions and answers.
You know, we launched in March and it's been really interesting talking to customers and a lot of the time there is some common questions, and I'd like to address some of those today to help out some folks that you probably all have the same kind of questions.
We'll also talk about some top architecture patterns and some popular things for you to think about to help address some of the problems that you're probably seeing with your customers today or your own applications today.
Then we'll follow it with how to get started, how to contribute and stay in touch with us.
Alright, so VPC Lattice has four key components that we're introducing.
This is a VPC feature.
Services, service networks, auth policies and a service directory.
These are four new things that we've introduced.
You don't have to use them by the way.
You can use them if you want to, you can enable this, and we're gonna cover each and every single component, okay?
So first and foremost it's really important to know what a service is, right?
Because everybody's got their own definition, right?
There's Kubernetes services out there, there is a Lambda function of service, like what is a service?
So let me talk about this for two seconds.
So we make sure that we're on the same page.
In VPC Lattice, a service is not a physical thing, it's just a logical abstraction.
You could think of it like a DNS endpoint.
So every service is mapped to a DNS endpoint.
In this example I've got app1.example.com.
Okay, that is the service.
And a service in VPC Lattice is just a logical front end and it consists of listeners, rules, and target groups.
Now if you've been around Amazon for a while, you probably know this is very similar to the elastic load balancing family.
Listeners, rules and targets.
The listener is something like a protocol.
Is it http?
Is it https?
What's the protocol version?
Is it GRPC?
Things like that.
HTTP1, HTTP2, that's the listener.
The rules can be something super simple, like if you see the connection default forward to target group, right?
That's just the catchall for all the 443 or all of port 80.
Or it can be something way more advanced, right?
It doesn't have to be, but you can make it more advanced.
So I've copied a couple of examples down in this slide.
The first one's kinda showing, maybe I wanna send a path, like path one for a certain API, to target group one that's being fed by an instance autoscaling group, an EC2 autoscaling group.
But if I get a header that says user agent equals Justin, I wanna send that over to my target group two that has my Kubernetes services behind it.
If I get a get request, send that to Fargate.
And maybe a catchall, the default action if it didn't meet any of those, send it to a Lambda function, right?
And so it's really powerful because you can now manipulate these things in any way that you see fit, and mix and match the compute platforms for whatever your developers want to use.
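To make the listener-rule idea above concrete, here is a minimal sketch of the kind of rule payloads you might build for the `vpc-lattice` API (for example boto3's `create_rule`). The field shapes are an approximation and the target group identifiers are made up for illustration; check the current API reference before relying on them.

```python
# Hypothetical sketch of the routing rules described above, expressed as the
# kind of request payloads you could pass to a create_rule-style API.
# Identifiers and exact field names are assumptions, not a definitive schema.

def path_rule(path, target_group_id, priority):
    """Forward requests whose path starts with `path` to one target group."""
    return {
        "priority": priority,
        "match": {"httpMatch": {"pathMatch": {"match": {"prefix": path}}}},
        "action": {"forward": {"targetGroups": [
            {"targetGroupIdentifier": target_group_id, "weight": 100}]}},
    }

def header_rule(name, value, target_group_id, priority):
    """Forward requests carrying a matching header (e.g. user-agent: Justin)."""
    return {
        "priority": priority,
        "match": {"httpMatch": {"headerMatches": [
            {"name": name, "match": {"exact": value}}]}},
        "action": {"forward": {"targetGroups": [
            {"targetGroupIdentifier": target_group_id, "weight": 100}]}},
    }

# /path1 -> target group 1 (autoscaling group),
# user-agent: Justin -> target group 2 (Kubernetes services)
rules = [
    path_rule("/path1", "tg-asg-example", priority=10),
    header_rule("user-agent", "Justin", "tg-k8s-example", priority=20),
]
```

With boto3 you would pass each of these dicts, plus the service and listener identifiers, to the corresponding `create_rule` call; the point here is just the shape of match-plus-forward-action rules.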
Something I come across all the time is the fact that I talk to a developer and they'll say, "I really wanna use Lambda, but my security team hasn't really wrapped their head around how they can secure this in an appropriate way and so I'm being forced to do X, Y, and Z."
And the admins, I come from an admin background.
So I'm like, "I get it."
Yeah, if I haven't approved that, I don't want you using that either.
But this gives a little bit more control where I can have this abstraction in front where I can enforce the same security controls regardless of whatever is behind it.
And it's super, super powerful to allow the developers to move fast and choose the platform that makes sense.
It's the same philosophy of like letting your developers use whatever coding language they want to use.
Here's another one.
We also support the ability to weight targets.
So target groups can exist across VPCs and you can weight targets.
Like maybe for /path1 send 90% of the traffic to target group one and slide in 10% of the traffic to a Lambda function.
I'm not showing it in this diagram, but you can have the targets be in different VPCs, which is really cool.
For some customers what they've kind of said to us is like, "Hey actually this helps us with Kubernetes upgrades." Right?
Because now I can actually lifecycle my VPCs, have my old one up and running, keep it going, spin up a new VPC, get the new cluster up and running, get my services going.
It's like the staging environment.
Fail away 10%.
Once I've qualified it and I feel comfortable with it, I can delete the old one.
And what does this do?
Because I have an abstraction in front of it, AKA the service.
I don't need to do anything to my client.
My client, wherever that client is, it could be another service, it could be a user, I don't care, they don't even know you did it, right?
Like they don't even know you changed something.
And so it's a very, very powerful thing to be able to separate those two things.
It's a good abstraction layer.
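The 90/10 canary split described above can be sketched as a weighted forward action. This is an illustrative shape only; the target group identifiers are placeholders and the exact field names should be checked against the current VPC Lattice API.

```python
# A minimal sketch of the weighted "forward" action described above.
# Field names approximate the Lattice rule action shape; identifiers are
# made up for illustration.

def weighted_forward(weights):
    """Build a forward action splitting traffic across target groups.

    `weights` maps target group identifier -> relative weight.
    """
    assert sum(weights.values()) > 0, "at least one non-zero weight required"
    return {"forward": {"targetGroups": [
        {"targetGroupIdentifier": tg, "weight": w}
        for tg, w in weights.items()]}}

# 90% to the old cluster's target group, 10% canary to the new one --
# the target groups can live in different VPCs, as noted above.
action = weighted_forward({"tg-old-cluster": 90, "tg-new-cluster": 10})
```

Because the service name in front stays the same, shifting the weights (and eventually deleting the old target group) is invisible to every client.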
Service network, okay.
So this is another totally made up thing.
It's a logical abstraction.
If you're familiar with the concept, and a lot of you probably are, of like Active Directory and different IAMs, they have this concept of users and groups.
You can do a whole bunch of things with individual users like put permissions on them and all that kind of stuff.
But if you have a whole bunch of users, it makes sense to kinda like make a group and put all the common policies that affect all those users on it.
Then you could put like the really specific policies on an individual user.
It's the same thing here, okay?
So a service network is really just a grouping mechanism that allows you to do a couple things and we're gonna give some examples of what those things are.
But really think of it as a grouping mechanism, okay?
You can put services inside it and you can associate that service network to a VPC.
That's the thing that enables connectivity, right?
Just because a service is in a service network, it doesn't mean those things can talk to each other.
It's a provider and consumer model.
And so you put the services in it, you make this thing everything you want it to be and then you associate it with the VPCs, okay?
Auth policies, this is what I'm talking about with the users and groups story.
(Justin chuckles) You can put auth policies, which are basically IAM resource policies, on service networks and you can put them on services.
And here's the big kicker.
So an IAM resource policy, you're probably familiar with it, you could put it on a bucket and the resource is usually like an AWS thing, right?
Like who's allowed to do something to this resource?
Here's the difference.
That resource can now be your service, your VPC Lattice service, okay?
So it's a very, very powerful way to use IAM authentication.
It gives you all the benefits like hands-off secrets management, ability to attach instance profiles and task roles, hands off credential management for your own service to service authentication and then write super context rich authorization rules of what can talk to what.
And these can be applied on the service network or the service.
So what does this look like if you can apply them in two places?
Is that super confusing, right?
No, this is basically a least privilege model, okay?
So if I write a policy that essentially says allow traffic on the service network and then I write another policy on the service that says allow, you guessed it, you're allowed.
But if I did something a little bit different and I said allow traffic at the service network, don't allow traffic on the service, well you guessed it, it's not allowed.
Okay, so it gives you this kind of flexibility to really control things.
Here's an example of basically a service network policy.
Service network policies and service policies, they're the exact same thing.
You can put fine grain or coarse grain policies on either side.
You can choose not to even enable service policies if you didn't want to use authentication and authorization.
But I encourage it.
And what I typically tell to customers is, don't put something super complex on your service network.
This is your guardrail, this is your defense in depth, right?
This should be human readable that you can quickly understand.
Like I know that every single request that came in here was at least authenticated from my org id.
That's what this policy is that I'm showing here.
So at least it came from me.
That's what I'm basically saying, right?
Which is not where we want you to stop.
It's not like the end goal, but this is your guardrail, this is your protection here, defense in depth.
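The slide itself isn't reproduced here, but a coarse guardrail policy of the kind described, "every request was at least authenticated from my org ID," might look roughly like this. The org ID is a placeholder, and while `vpc-lattice-svcs:Invoke` and `aws:PrincipalOrgID` match my recollection of the service's auth model, verify them against the current documentation.

```python
import json

# A guess at the coarse-grained service network guardrail described above:
# allow any authenticated caller, as long as the request came from principals
# in my AWS Organization. "o-example123" is a made-up placeholder org ID.
service_network_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "vpc-lattice-svcs:Invoke",
        "Resource": "*",
        "Condition": {"StringEquals": {"aws:PrincipalOrgID": "o-example123"}},
    }],
}
print(json.dumps(service_network_policy, indent=2))
```

Note how short it is: that's the point of keeping the service network policy human readable, as the talk recommends.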
So now on the service side, and it still might be an admin configuring the service policy, it's just completely up to you and your organization.
Some people give their developer or service owner more control of their own services, others separate that duty.
Whatever makes sense to you.
But in this model, I say this is where you do the fine grained stuff and you can get really, really fine grained.
So in this example, I'm actually defining individual principals that can call, right?
This supports the typical IAM stuff so you can actually do principal tags.
So like if you wanted to group and say, "Only principals that were tagged prod or PCI."
You could do stuff like that here.
And then this one where all the yellow text is, I'm basically saying only for my service, which is the widget service.
If the query string parameter of the request, you know the actual HTTP request, equals color blue and it's for GET requests.
So I'm talking very, very, very context specific rich policy.
The reason I'm saying do that on the service level is 'cause if you did that on the service network, it wouldn't be human readable, right?
Like it would be really, really tough.
So you want that kind of separation there.
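A fine-grained service-level policy along the lines just described might look like the sketch below. The tag name and values are placeholders, and the request-level condition keys (`vpc-lattice-svcs:RequestMethod`, `vpc-lattice-svcs:RequestQueryString/<key>`) are written from memory, so confirm them in the current condition-key reference.

```python
# A sketch of the fine-grained service policy described above: only callers
# with a "prod" principal tag, only GET requests whose query string contains
# color=blue. Keys and values are illustrative assumptions.
service_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "vpc-lattice-svcs:Invoke",
        "Resource": "*",
        "Condition": {"StringEquals": {
            "aws:PrincipalTag/environment": "prod",
            "vpc-lattice-svcs:RequestMethod": "GET",
            "vpc-lattice-svcs:RequestQueryString/color": "blue",
        }},
    }],
}
```

This context-rich detail lives on the service, while the service network above stays short and auditable.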
Okay, service directory.
This is really powerful actually, even though it's the most simple, simple feature we have.
This is just a list of all of the services that you created in your account.
It's an account level view of all the services you created and all the services that were shared with you.
Okay, so admins love it because it's like, I can log into that account and see here's all the things that you technically have access to.
And the developer loves it because they can go in, copy the URL of the service they're trying to get and hit the ground running, right?
So very, very powerful.
Even though it's the simplest, it's nothing super fancy, right?
You go to the console and it's a great thing to have there.
Alright, so this is what the roles in my mind make sense to me.
You can change these if you want, give different IAM permissions.
But typically what I see when I'm talking to customers, the admin is the person that creates the service network, defines the access controls with the auth policies and is usually responsible for not only spinning up VPCs, but associating the service network with VPCs.
The developer on the other hand is kind of like the person that would traditionally be setting up the application load balancer and everything behind it, right?
And so that could be somebody different in your role.
But they create the service, they define traffic management.
It could be as simple as just default forward to me and I'll figure it out, or it can be that more advanced kind of blue-green type canary style deployment stuff.
They may or may not be allowed to associate services to service networks.
Some companies say, "Whoa, whoa!
No way, I'm the only one that's gonna be able to do that."
Sorry. (Justin laughing)
So it's just up to you.
You can kind of figure that out.
But this seems to make sense for most of the customers I'm talking to.
Alright, so let's walk through an example here with my fruit and vegetable salad company.
(Justin laughing) We're gonna walk through what it looks like to create a service, create service networks and associate service networks to VPCs.
This is mostly gonna just be an architecture talk.
I'm not gonna really give a demo or anything like that.
I hope that's okay and I'm happy to do that later if anybody wants to grab me in the hallway.
But here we've got four VPCs, a bunch of VPCs all over the place.
Some of these are shared services, some of them are services that only can talk to each other within their own application, and other ones are clients only.
What I mean by that is that, when I talk to customers a lot of the time they have applications that have dependencies, AKA meaning that they're a service provider and a bunch of applications are calling them, but they also have applications that are like batch jobs that nobody is ever gonna call them.
They're just gonna call others, right?
So that's what I call clients, right?
Those are clients and some applications can be both.
Okay, so let's walk through an example of where we kind of see each of these in action.
If I go ahead and I create the services, you'll see that I created services for blueberry, kale, tomato, carrot, cucumber. In VPC three I didn't really do anything because they're just consuming things.
Nobody needs to access them.
So why would I even create a service for that?
Okay, I don't need to.
They can still participate in Lattice, but I don't need to create a service for them.
Then I got cherry, apple, and pineapple.
Okay, so this is the fruits and vegetables.
So to start out I wanna create a network where all my fruits can talk to each other.
So I primarily just wanna share blueberry, cherry, and tomato.
So I create a service network, I call it fruits and I go ahead and I share fruits, AKA associate fruits to my service network...
Sorry to my VPCs four and three.
At this point, as long as I don't have some crazy auth policy denying things or I don't have security groups or network ACLs blocking things getting there, raspberry, banana, the two batch jobs, cherry, apple, pineapple, they can all talk to each other.
Right, they can all access each other.
They can automatically discover each other, connect to each other, and there's no load balancers here.
There's no load balancers, there's no network connectivity, VPC Lattice is both of those, right?
It's handling all of that functionality and this just works, right?
So that's the three steps.
But now what do I do about vegetables?
So vegetables needs to also connect to things like the shared service kale.
I'm sorry, kale is just one of the vegetables that carrot and cucumber need to access.
But what do I do with tomato?
Because sometimes people tell me tomatoes are fruit.
Sometimes people tell me tomatoes are vegetable and so I don't know, that's my shared service, right?
It needs to be accessed by both of them, just depending who you're talking to.
I think it's really a fruit.
But so this way, this is basically to show you that you can put services in multiple service networks, right?
And so it gives you this flexibility of kind of just like you define whatever your connectivity pattern is for that VPC, and it will figure out that connectivity, okay?
So you can attach service networks to multiple VPCs, you can attach services to multiple service networks and it gives you this hierarchical approach to kind of manipulate it as you see fit.
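The provider/consumer rule the fruits-and-vegetables example illustrates can be captured in a few lines: a client VPC reaches a service only when some service network both contains the service and is associated with that VPC. This is a toy model for reasoning, not an API; the VPC and network names are illustrative.

```python
# A toy model of VPC Lattice scoped connectivity as described above.

def can_reach(client_vpc, service, svc_assoc, vpc_assoc):
    """svc_assoc: service network -> set of services in it.
    vpc_assoc: service network -> set of VPCs associated with it."""
    return any(service in svc_assoc.get(sn, set())
               and client_vpc in vpc_assoc.get(sn, set())
               for sn in svc_assoc)

# tomato sits in BOTH service networks, like the talk's shared service.
svc_assoc = {"fruits": {"blueberry", "cherry", "tomato"},
             "vegetables": {"kale", "carrot", "cucumber", "tomato"}}
vpc_assoc = {"fruits": {"vpc-3", "vpc-4"},
             "vegetables": {"vpc-1", "vpc-2"}}
```

So `can_reach("vpc-4", "tomato", ...)` and `can_reach("vpc-1", "tomato", ...)` both hold, but a vegetables-only VPC cannot reach cherry, because no shared service network links them.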
Okay, so here's kind of like a overall kind of view of how to think about these things.
By default out of the box, your services and your service network, even your VPCs if you really cut off all the boundaries, it's designed to be secure by default.
Services only have access to the services in their service network.
This is really important because, not even talking about specific security group filtering, network ACL filtering, network level controls, application level authentication and authorization, you get very strong protection just by scoping your network correctly, right?
Just by scoping what that VPC has access to in the first place, right?
So that's a very important part.
In this same design you can enforce encryption.
You can say only HTTPS traffic's allowed, you're not allowed to do HTTP.
That's a very strong one that can be an easy button to say, well, check mark, at least I know everything is HTTPS and encrypted.
You can enforce authentication authorization on every single request between your services and anything that leaves your VPC or comes into your VPC.
And again, I wanna emphasize like start with simplified policy just by scoping your network appropriately.
How we did the fruits and vegetables, right?
I needed something new.
So instead of making my policy crazy, I just spun up a new service network.
And then services can be associated with many service networks as we talked about.
Alright, so now that we've got some of the baseline type stuff together, these are the core things we launched with.
These are the core features and kind of functionalities, capabilities, whatever the word is that VPC Lattice supports.
So we don't care if your services are across VPCs or accounts, we're integrated with resource access manager.
So you can share services, individual services, you can also share service networks.
So it gives you this flexibility of, do I wanna share onesie twosie things and let my consumer put it in their own service network or do I wanna share a collection of services and just let them associate to their VPCs?
The other part on the connectivity piece is, this is doing network address translation.
In this example, and I'm gonna go through a packet flow in a couple minutes here.
Everything talks to VPC Lattice on something called a link local IP address range.
When you receive a packet it came from Lattice, right?
And that's the kind of protection that is made there.
And so we also handle with this process all of your network address translation.
If literally everything has the same IP address, that's okay.
Because you're always talking to Lattice and we do that translation as necessary.
If one side's IPv4, one side IPv6, you're going through a transition where not both of your sides can do it at the same time, totally fine.
The protocol versions we support are HTTP1, HTTP2, HTTPS, gRPC.
So mostly HTTP services.
This is an application layer proxy.
For TCP today, you would combine this with something like private link, right?
So you do your private link stuff for your TCP or transit gateway, that stuff still works with it, but this is really for application layer stuff today.
Request routing, typical stuff that we talked about before: path, header, method, weights to target groups.
Things you would expect from a application layer proxy.
Then from a security perspective, you can put security group referencing on your VPC associations, which is really pretty powerful because you can kind of like pick and choose with security group referencing what instances or what resources in your VPC can talk to VPC Lattice in the first place.
It's a very, very strong kind of network layer control that isn't relying on IP addresses, right?
You're doing security group referencing.
But then again also walk up the stack, we have that IAM authentication on the service and the service network.
So you get that full defense in depth strategy.
The target types we support right now, Instances, Lambda Functions, ALBs, IP addresses, our Kubernetes integration, we have a controller for Kubernetes that goes and provisions this all for you.
It's called the AWS gateway API controller.
It's open source.
That shows up in Lattice as an IP target.
So you know, so that's why it's kind of calling it out there.
It supports autoscaling groups.
And then for ECS today we still require an ALB or NLB to fit in front of your ECS tasks, okay?
Alright, so what is new?
If you were around for our re:Invent talk last year, you're probably curious, okay, what happened since then?
(Justin chuckles) So there's a couple things we've really been trying to move quickly on a lot of these things.
We've introduced a couple new regions, I won't bore you with reading them.
We do have plans for a lot more regions next year, so stay tuned if your region is not there.
We integrated with Resource Access Manager, that was there from GA, but we added a new feature called Customer Managed Permissions.
And I'm not gonna talk too much about this one right now 'cause I'm gonna have a slide that walks through it.
Shared VPC support, you can have services and shared subnet, you can have clients in a shared subnet, there's no funkiness there, it works.
So that's a really popular one for a lot of customers where the admin team owns the VPC and they want to divvy out subnets for their consumers.
VPC level compliance, this is a big one.
I mentioned that we're a VPC feature.
And so all of the VPC level compliances that are show up on our compliance webpages cover VPC Lattice.
The one caveat that I would pull up though is there's a couple of compliance certifications that are not covered with Lattice, things like FedRAMP and these are the services that actually certify your API or SDK individually instead of the feature.
So wherever you see individual SDKs listed, that's typically one that's probably not covered.
But all the major ones that you are typically used to are covered.
New event structure for Lambda and identity header propagation is basically kind of the same feature.
We're gonna go into that in just a second here.
Just one's for Lambda and the other one's for all the other platforms. So we've got support for both and then the gateway controller is now GA, that happened a couple weeks ago.
All right, so identity header propagation.
This is probably one of my favorite features that we've launched.
Super powerful, it's also probably the most thumbs up I've ever gotten on a LinkedIn post when we've launched this feature.
So this actually on every single request, we add a header that includes a whole bunch of metadata for your backends, for your targets.
That includes things like who the caller was?
If it was authenticated?
If it was authenticated, what was the principal?
It includes stuff like what the source VPC was?
What the account was?
Like a ton of information.
And again, there is a slide right after this so you'll see all the different ones, but it also works with IAM Roles Anywhere.
So this is what the kind of header looks like.
This is an actual snapshot from an HTTP server and this is what is covered here.
So there's some that are covered in just the standard X-Amzn identity header.
If you are using IAM Roles Anywhere, there's some really nifty things you can derive out of this.
So we had some customers that were asking us to be able to use their SPIFFE IDs in policy.
So be able to write an authorization policy using a SPIFFE ID instead of just an IAM principal.
And so this gives you that, right?
Like you can actually identify that with some of these things like the X.509 SAN name and URI.
It'll give you that common name.
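The talk shows the header on a slide rather than in text, so this sketch assumes a semicolon-separated list of key=value pairs, which is roughly how the identity header reads in an HTTP server log. The field names and sample values below are made up for illustration.

```python
# Parse an identity header of the assumed "k1=v1; k2=v2; ..." form into a
# dict a backend could use for logging, troubleshooting, or custom
# authorization. The sample string is fabricated.

def parse_identity_header(value):
    fields = {}
    for part in value.split(";"):
        part = part.strip()
        if "=" in part:
            key, val = part.split("=", 1)  # ARNs contain ':' but not '='
            fields[key] = val
    return fields

sample = ("SourceVpcArn=arn:aws:ec2:us-west-2:111122223333:vpc/vpc-0abc; "
          "Principal=arn:aws:sts::111122223333:assumed-role/inventory/task; "
          "Type=AWS_IAM")
identity = parse_identity_header(sample)
```

A backend seeing this at two in the morning can answer "who called me, from which VPC, authenticated how" straight from its own HTTP logs.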
Lambda event structure again, pretty much the same feature.
It's just it'll deliver it to Lambda as an event structure, right?
So you get all of this rich information, which is super powerful, right?
So like, why is this thing important in the first place?
Okay, well, not only do I now have all of the information about the entire identity and the environment the request came from directly in the developer's HTTP logs, two in the morning, they're typing away, they're looking at their logs, is it my service?
They see the whole path and everything that it came from right there.
Without looking at another tool, they're looking at their HTTP logs.
Super powerful, great for troubleshooting, but it also can be used for additional custom authorization on the backend.
It can also be used for personalization.
Maybe you wanna treat things differently, right?
When it gets there based on this information.
So very, very powerful feature that I definitely think you should take a look at if you didn't see the initial launch of that.
Customer managed permissions.
This is a really powerful one, I'll walk through it here.
But this sharing of resources, you share services and service networks, when you share a resource, there's actions that these people can do with these resources.
So like a service network, if I shared a service network, what can the person I shared it with do with it?
Can they put services in it?
Can they associate it to VPCs?
Can they do both?
These are very important questions.
And what this allowed you to do is, without getting into the destinations IAM configuration, I can attach a policy as I share it saying you can only associate this to VPCs.
You're not allowed to put services in it, right?
And so that basically looks like this, right?
So condition string equals source VPC and you're actually describing what that person's allowed to do without having to touch their IAM permissions on their side.
It's a very powerful feature, simplifies things quite a lot.
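A customer managed permission of the kind described, "you can associate this service network to VPCs, but not put services in it," might be shaped roughly like this. RAM permission documents carry Effect/Action (and optional Condition) without Principal or Resource; the action names here are from memory, so verify them against the current `vpc-lattice` action list.

```python
# A sketch of a RAM customer managed permission along the lines described
# above. Action names are assumptions, not confirmed API identifiers.
permission = {
    "Effect": "Allow",
    "Action": [
        "vpc-lattice:CreateServiceNetworkVpcAssociation",
        # Deliberately omitting the service-association action, so the
        # recipient cannot put their own services into the shared network.
    ],
}
```

The key idea is that the sharer constrains what the recipient can do, without ever touching the recipient's own IAM configuration.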
Okay, questions and answers.
These are just the typical ones.
I've picked a handful of questions that we get.
I talk to customers every single day and a lot of the time it's a lot of common things.
A lot of things that you think are unique, your people next to you probably have the same problems. So how does traffic flow through Amazon VPC Lattice?
Great question.
In this one I kind of get tired drawing it up so I wanna show all of you so it's on recording.
(Justin laughing) Alright, so VPC Lattice, we tried to make things as generic as possible so that it could be used with as many systems as possible without a big forklift.
We're not doing any kinda like weird custom, service discovery or anything like that.
I know it looks like it's magic, but we tried to make things as kind of basic as possible so that it was easy to troubleshoot.
So when you create a service in VPC Lattice, this assumes that you've already got your service network and your service set up, we generate a DNS name for it.
You know the region, the service name, a big long hash .on.aws, you know.
You can also define your own custom domain names.
This is what most people do.
They don't usually use our big long, ugly name.
And then you can put aliases to it.
So in this example, I've got an alias for inventory dot... sorry, billing.myapp.com.
The inventory service is calling it.
So just a standard DNS request goes out to the VPC Resolver: where's billing.myapp.com?
It will come back with however you have the Route 53 Resolver configured, the VPC resolver, and it might say billing.myapp.com
is actually an alias record to the VPC Lattice service, and the real IP is 169.254.171.25.
As you'll see, that's not the actual IP of the destination service, that's the same IP as the client.
But what that does is it's a link local address that says, "Cool, I know about this.
It's a VPC Lattice service, you're enabled for VPC Lattice since you're in the prod service network and I'm just gonna send the traffic directly to my ENI.
This service is directly attached to me."
So it basically goes out the ENI and the VPC since VPC Lattice is built into the VPC substrate knows exactly what to do with this now.
So it takes that packet into the proxy for further processing.
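That resolution step is handy to recognize when troubleshooting: if DNS handed the client a link-local address in the Lattice range, the traffic is headed for the Lattice data plane. A quick sketch (the 169.254.171.0/24 range matches the example IP above; confirm the documented range for your setup):

```python
import ipaddress

# IPv4 link-local range VPC Lattice answers from, per the example
# address in the talk (169.254.171.25). Confirm against the current
# AWS documentation for your region before relying on it.
LATTICE_V4 = ipaddress.ip_network("169.254.171.0/24")

def is_lattice_ip(addr: str) -> bool:
    """True if addr looks like a VPC Lattice link-local answer."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and ip in LATTICE_V4

print(is_lattice_ip("169.254.171.25"))  # True - the DNS answer above
print(is_lattice_ip("10.0.0.1"))        # False - an ordinary VPC address
```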
So now traffic is automatically sent.
A note on this is that we do honor security groups on purpose, right?
We do not want to bypass something.
So if your security group is blocking the VPC Lattice link local range, your traffic won't work.
So usually if something happens, it's probably the first place you should check, right?
Is my security group allowing it?
So in this case, as long as your security groups are allowing it, as long as your network ACLs are allowing it, good to go.
You don't need to worry about the VPC Lattice IP addresses.
We have a managed prefix list that will actually be used for this.
So after the traffic gets to the VPC Lattice proxy, we validate traffic, we do the HTTPS transaction, we identify the headers.
If authentication is turned on, we run it through authorization policy, both the service and the service network.
These are not two different hops, it's done at one time.
We've merged them and do this at one time, so it's not some big latency thing where it's having to do it twice.
This is what a typical policy could look like; the first one is: is the request authenticated?
Okay cool, the request was authenticated from my org ID.
Great place to start; again, this is the best way you could do it.
You don't have to have authentication turned on.
You don't actually have to have the client do authentication to use auth policies.
You can still do network level controls and auth policies.
It's just I encourage you to.
(Justin laughing) And then at the service, obviously, doing the fine-grained policy.
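A minimal sketch of the kind of org-scoped auth policy just described, assuming a placeholder org ID (`vpc-lattice-svcs:Invoke` is the Lattice data-plane action; the exact condition keys you use will depend on how fine-grained you want to go):

```python
import json

# Hypothetical auth policy in the spirit of the slide: allow requests
# only when the caller is authenticated from my AWS Organization.
# The org ID is a placeholder for illustration.
def org_scoped_auth_policy(org_id: str) -> str:
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": "vpc-lattice-svcs:Invoke",
            "Resource": "*",
            "Condition": {
                "StringEquals": {"aws:PrincipalOrgID": [org_id]}
            },
        }],
    }
    return json.dumps(policy)

doc = json.loads(org_scoped_auth_policy("o-exampleorgid"))
```

You would attach the coarse version at the service network and a finer-grained variant at the service; as noted above, Lattice evaluates both in one pass.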
So we've gotten to the service network, we've routed the packet there, we've run it through auth policy.
Now we end up going to the traffic management section.
So after we've run it through, we've done the DNS request, we routed the packet to VPC Lattice, we ran it through the auth policies, it's been approved, it's allowed, and so now it's able to do the traffic management stuff.
And this is where it looks at the listener rules and targets, where your developer or the service owner defined how to actually route these packets.
Okay, and then at that point the traffic is automatically routed even across VPCs and accounts to the destination.
It will arrive from another 169.254 address which is the Lattice range, okay?
Your security group will have to allow that traffic in, otherwise it will not be allowed.
And so this is a picture right here of the managed prefix list.
So don't use the IPs, don't worry about the IPs, use the managed prefix list.
And there's one for IPv4 and one for IPv6.
So just use both of those, put them on everything and you're good to go.
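Concretely, the security group rule references the prefix lists rather than raw CIDRs. This sketch builds the payload shape boto3's `ec2.authorize_security_group_ingress` takes for its `IpPermissions` parameter; the `pl-` IDs are placeholders, so look up the real VPC Lattice managed prefix list IDs for your region:

```python
# Shape of a security-group ingress permission that allows the Lattice
# ranges by managed prefix list instead of hard-coded IPs. The pl- IDs
# below are placeholders; there's one managed list for IPv4 and one
# for IPv6, as mentioned in the talk.
def lattice_ingress_permission(v4_pl: str, v6_pl: str) -> dict:
    return {
        "IpProtocol": "-1",  # all traffic from the Lattice data plane
        "PrefixListIds": [
            {"PrefixListId": v4_pl, "Description": "VPC Lattice IPv4"},
            {"PrefixListId": v6_pl, "Description": "VPC Lattice IPv6"},
        ],
    }

perm = lattice_ingress_permission("pl-0123456789abcdef0", "pl-0fedcba9876543210")
```

Because the rule points at the managed lists, AWS keeps the underlying ranges up to date for you.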
Okay, I'm not gonna read this.
I don't want you to read this.
If you wanted a recap of what I just said just for later reference, you could take a picture of this, but don't worry about reading it.
So I'll pause here for two seconds until I see people put their cameras down.
(Justin laughing) Okay, all right, how does pricing work?
Everybody wants to know this.
What is a service?
Am I paying for every single instance of my service?
All this kind of stuff.
So there's three dimensions of pricing.
Typically, most customers only are really looking at two of them.
There's an hourly charge per service, there is a data processing charge, how many gigabytes are being processed?
And then there is a request per hour.
The request per hour is the one that customers usually don't really care about because there's a free tier that covers 80%, 90%, sometimes 100% of your services.
And so that's the first 300,000 per hour are free.
If you go beyond that, it's 10 cents per million requests.
So if you got some super hyperscale service, you might fall into that tier, but then your 80th percentile or your long tail wouldn't hit that.
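The request-tier arithmetic, using only the numbers stated in the talk (300,000 requests per hour free, then $0.10 per million):

```python
# Request charge for a single hour, per the talk's numbers:
# first 300,000 requests in the hour are free, then $0.10 per million.
FREE_PER_HOUR = 300_000
PER_MILLION_USD = 0.10

def hourly_request_charge(requests: int) -> float:
    """USD request charge for one hour of traffic."""
    billable = max(0, requests - FREE_PER_HOUR)
    return billable / 1_000_000 * PER_MILLION_USD

print(hourly_request_charge(250_000))    # 0.0 - under the free tier
print(hourly_request_charge(1_300_000))  # 0.1 - one million billable
```

So unless a service sustains well over 300,000 requests every hour, this dimension rounds to zero, which is why most customers only watch the other two.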
On the hourly charge per service, in IAD, you can think of this as like $18 per month per service.
And again, keep in mind this is handling your network level connectivity, your application load balancing functionality, all that kind of stuff.
And it's a DNS name.
So if you have multiple paths behind a certain DNS name, that's still one service, right?
It's not multiple services.
So there's actually a very flexible kind of way of thinking about that.
There's no attachment or endpoint cost.
So this is slightly different than other AWS services where you pay for the consumer.
Like if you have 10,000 VPCs consuming this thing, you're not paying each time you consume it.
You're just paying for the service, the consumer has no association charge.
The VPC association is free.
Again, a service can consist of thousands of targets, you're not paying for the targets, you're not paying for running a load balancer next to every single target.
It's just the front end, that logical abstraction.
Yeah, on the data processing, something to touch on here.
This is all encompassing, right?
It is the load balancer, it is the network connectivity and it is covering all your data transfer costs and everything else like that.
There is no data transfer cost if you're talking to things across AZs or anything like that.
It's just all encompassing in this data processing charge.
Benefits of a managed service, right?
So this is important to touch on too because I think a lot of people forget that it kind of sounds weird, but it really is kind of important to have a line item for your infrastructure expense.
It's kinda that idea of how do you know how to drive efficiencies into your cost and infrastructure if you don't know what it costs in the first place?
And this is kind of the thing that we discover when a lot of customers are doing a lot of this stuff themselves.
They don't think about their cost that's associated, that might be showing up in an EC2 cost instead of an actual proxy cost or something like that.
And so it's just one of those things where it is important to actually see a line item or deeply understand the true cost of what your infrastructure is, and this kind of solution giving you a pay as you use model is really important 'cause you can actually use it to drive efficiencies.
On the staff productivity side of things, the whole point of a managed service is to spend less time working on the platform and more time using the platform.
As your scale of your applications goes up to meet the demands of your growing business, speed and simplicity matter.
Take advantage of the innovations that are happening behind the scenes with a managed service, right?
This doesn't have to be VPC Lattice, this is just kind of generic information on the benefit of a managed service.
From an operational resilience perspective, at AWS we take operational resilience and availability at scale very, very, very seriously.
VPC Lattice has a 99.9% uptime SLA for single AZ deployments.
And if you deploy your applications across multiple AZs, we've got a 99.99% SLA, the benefit of a fully managed service.
Well, it gives you the comfort of not having to worry about minimizing downtime in the event something happens to your sidecar proxy, or some sort of zero-day security event where you have to figure out: how do I actually get in front of this?
This happens behind the scenes without you doing anything.
And then business agility, VPC Lattice helps you move faster, simple period, end of story.
Let your developers build new customer facing products and features, and become better at what they need to deliver instead of becoming networking experts.
So the whole idea is to kind of just how do we take away undifferentiated heavy lifting, right?
If there's certain things you want to keep and do, obviously do that.
But if there's certain things that you really don't think are providing value to your actual business needs, it's something to evaluate. Okay, enough on pricing.
(Justin laughing) All right, can I have more than one VPC?
Sorry, can I have more than one service network per VPC?
Another super common question I get?
Okay, simple answer, no, but you shouldn't need to, okay?
Services can belong to as many service networks as you want.
And so if you have a VPC that has super unique connectivity requirements, create a new service network and put those services in there.
There are a couple of customers that literally have a one-to-one mapping where they're defining it almost like an application layer network for that VPC.
So in the bottom left top secret example, I'm showing just that, right?
There's a couple of services it still needs.
I can build a service network for just it and then connect it.
Is VPC Lattice only for microservices?
Really good one, TL;DR, no.
We don't care what it is, right?
It's not just for microservices.
It could be a monolith, it can be a combination of the two.
This is the whole point, right?
We wanna abstract all of that backend stuff.
That's the whole point of having that front end abstraction.
So what does this look like?
Everything behind the scenes in VPC Lattice is an IP address.
Like it's not hiding anything, it's not doing anything complicated, but it gives you an abstraction in front of these things.
So parking.com starts the migration process.
I can slowly move things over, modernize at my own pace and say my clients are still calling parking.com, but as I've moved the rates service over, I can load balance it over to the rates service.
When I move the payment service over, I can do that and then I can have everything else still do the default forward rule back to the monolith.
And I can keep it there forever or I can migrate that too, right?
So this is made for whatever application you're using, it's not just for microservices.
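That weighted cutover maps directly to a Lattice listener rule's forward action. A sketch of the payload shape (lowercase keys follow the boto3 `vpc-lattice` client conventions; the target group identifiers are placeholders):

```python
# Sketch of the weighted-forward shape used for the parking.com-style
# migration: most traffic keeps hitting the monolith's target group
# while a slice goes to the carved-out rates service. Target group
# identifiers are placeholders; in boto3 this dict would sit under the
# "forward" key of a listener rule action on the vpc-lattice client.
def weighted_forward(monolith_tg: str, rates_tg: str, rates_weight: int) -> dict:
    assert 0 <= rates_weight <= 100
    return {
        "targetGroups": [
            {"targetGroupIdentifier": monolith_tg, "weight": 100 - rates_weight},
            {"targetGroupIdentifier": rates_tg, "weight": rates_weight},
        ]
    }

# Send 10% of traffic to the migrated rates service, 90% to the monolith.
fw = weighted_forward("tg-monolith-placeholder", "tg-rates-placeholder", 10)
```

Dialing `rates_weight` from 0 to 100 over time is the "migrate at your own pace" story: clients keep calling the same DNS name throughout.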
Is VPC Lattice a service mesh?
This is a big one too.
Okay, so what is a service mesh in the first place?
So typically a service mesh is a control plane and a data plane.
The control plane is typically things like Istio, Linkerd, or App Mesh, and the data plane piece is something like Envoy, Linkerd, or NGINX; there's a whole bunch of them out there.
Typically you either manage or have a managed service to do the control plane piece and then you have a sidecar proxy, AKA, the data plane next to every single workload, right?
So if there's a 1,000 instances of that workload, you have a 1,000 proxies running next to every workload.
Now this is good and bad, right?
There's pros and cons, and it does handle a bunch of really cool application layer tasks.
You know, service discovery, request level routing, encryption, authentication, really neat handy things.
Typically it only works with Kubernetes.
There's some ways to do other things, but it can be challenging.
The other part here that I call out is that it doesn't handle network connectivity.
It really is kind of that overlay that lives in the systems.
From a cost perspective, what you're paying for doesn't really have a line item a lot of the time.
You typically pay for whatever the control plane is, whatever you're running there on EC2 and then you probably pay for the data plane also on EC2 next to your workload.
As your workload scales up, guess what?
So does the proxy, the processing and memory that you're doing is scaling up right beside your workload.
And so the challenge there is you are paying for it in compute cost.
VPC Lattice, a little bit different.
It's a fully managed service.
Both the control plane and the data plane are managed for you.
Doesn't live in user space, it's built directly into the VPC.
It scales completely independently of the workload.
And that really matters from a performance perspective.
Because as your workload scales up, instead of latency increasing with your workload, VPC Lattice has a flat line and stays consistent because it doesn't live next to the actual workload as it's scaling up and needing more memory resources.
Handles all the same kind of common tasks, right?
Service discovery, request level routing, encryption, authentication also does authorization, and it also does the network level connectivity.
So that is a big difference, right?
It's both of those features and functions.
And again, like we covered, the model is slightly different from a pay perspective.
You pay for service usage, not for how many targets running a sidecar that you have.
So TL;DR, the summary there is, no, it's not a service mesh, but some of the features are the same and it's implemented a little bit differently.
It's built directly into VPC instead of your compute environment, your user space, and sidecars are optional.
If you want to continue using a sidecar to do certain tasks, if you have like the need to offload certain features and functionality, you can totally keep doing that.
We're not saying don't do that, it's just you don't have to, right?
And for a lot of customers with basic requirements, you shouldn't need to use one.
So why add that extra complexity?
But if you need it, you could definitely use it.
It's fully managed control plane and data plane and it works across instances, containers, and serverless.
So very, very flexible there.
Alright, top architecture patterns, Starting small.
Okay, everybody's been here at least at some point.
And this could be even big companies that are starting a new application or starting a new product.
Everybody starts here somewhere.
The thing I'd like to call out here is that whenever you hear all of us talk about VPC Lattice, we're usually talking about it as something that solves all your massive connectivity problems across VPCs and accounts, but I wanna emphasize like it actually helps the small user too by reducing complexity because it's just a load balancer, it's just a proxy.
And so you can greatly simplify even your app to app or service to service communication in a single VPC.
You don't have to have multiple VPCs.
It's the quickest way to get connectivity, application level, load balancing, authentication, authorization, so on and so forth without any kind of subnet interaction or anything like that.
It's a very kind of cool use case that just kind of makes that simple.
The other part could be the same application, but maybe this company's acquired another company or something like that.
You're scaling up.
This is where the multi VPC stuff comes in.
When you do add more applications across VPCs and accounts, your operations, your process to onboard things doesn't change.
It's the exact same and you don't have to realize anything different, right?
You get that same benefit without changing your infrastructure.
Without introducing new connectivity patterns between VPCs and having to worry about whether you're getting the subnet routing correct, right?
It just works out of the box.
Why is VPC Lattice a good therapist?
Well it's because it's great at addressing problems. Emphasis on the addressing.
(man laughing) I'm glad at least one person laughed, that makes me feel better.
Okay, so we talked about this earlier.
The overlapping addresses in IPv6 migration is top of everybody's mind for all sorts of reasons.
We would love to see everybody in IPv6, I just have a feeling that it's gonna be 20 years from now, we're still having the exact same conversation we had 20 years ago.
So this really helps out here, right?
If you were talking service, service communication, east-west communication, I don't want you yet to worry about this, I don't want you to think about IP addresses.
This completely removes this need and it also allows you to abstract.
If some services move IPv6 'cause it's really important for them.
Maybe it's a public facing service, they can do that.
Don't make the client do it at the same time.
Right, like it could be some client that hasn't been funded in five years, right?
And like, do you really wanna spend all this effort upgrading that when it's not even a really useful product, right?
Let these teams make their own priorities.
If it makes sense for them to go IPv6, cool.
Let this kind of handle the rest.
Now again, I want to be very clear that I'm not advocating IPv4 is the right way to go, it's just this can ease that migration and you can use this as a strategy to really make you move fast.
If you wanna hear more about this, we did a great podcast with the Cables2Clouds folks, I don't know if you'll be able to pick up this QR code or not, but if you can't just let me know afterwards and we can figure it out.
But that podcast covers it and there's a great demo that is also done where it shows you how to do this through the whole configuration and everything.
Migrate at your own pace.
Okay, this one is killer because at the very beginning everyone was stressing out.
I want to go VPC Lattice but I have an existing infrastructure.
How do I do this?
Or I already have implemented the service mesh, how do I do this?
Really easy actually.
Turns out you can run these things side by side, it's just standard protocols.
So in this example, I've got a travel service living on the right hand side.
I think it's the right hand side for you too, living behind an application load balancer, okay?
And I'm connecting to it through transit gateway.
It's just a standard DNS request for travel.myapp.com,
you get 10.0.0.1, which is the load balancer IP, and I get routed via route table entries and transit gateway over to that destination.
Now you can go and turn on a VPC Lattice service without doing anything.
Like, just making a VPC Lattice service doesn't change anything.
All your other stuff still works.
You can still connect via DNS to 10.0.0.1.
It's just that we've given a new DNS address so that if you want to connect to this new address, you can.
Okay, so what does that look like now?
So I've created the service, I've put it in this service network and I've associated that VPC with that service network.
At this point I'm still using the load balancer not changing anything.
But now I wanna change that, but I only wanna change it for one VPC.
I don't wanna change it for everybody 'cause I'm not ready yet.
And so what I do is I put a private hosted zone out there in VPC two and I tell it that I'm aliasing this DNS record, travel.myapp.com,
to the VPC Lattice service instead of the load balancer.
And that greatly simplifies it.
You can keep both up and running for as long or as little as you'd like, right?
If it's all VPCs you're trying to do, you can just use a public hosted zone.
You don't have to use a private hosted zone either.
This is just an option that gives you a lot of flexibility there.
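The per-VPC cutover boils down to one Route 53 change. This sketches the change batch shape for `change_resource_record_sets` in the private hosted zone attached to VPC two; the Lattice DNS name and alias hosted zone ID are placeholders (and whether you use an alias record, as here, or a CNAME depends on your setup):

```python
# Shape of the Route 53 change batch for the per-VPC cutover: in the
# private hosted zone attached to VPC two, alias travel.myapp.com to
# the Lattice-generated service DNS name. The Lattice DNS name and the
# alias-target hosted zone ID passed in are placeholders.
def alias_to_lattice(record: str, lattice_dns: str, lattice_zone_id: str) -> dict:
    return {
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record,
                "Type": "A",
                "AliasTarget": {
                    "HostedZoneId": lattice_zone_id,
                    "DNSName": lattice_dns,
                    "EvaluateTargetHealth": False,
                },
            },
        }]
    }

change_batch = alias_to_lattice(
    "travel.myapp.com",
    "travel-0123456789abcdef.7d67968.vpc-lattice-svcs.us-west-2.on.aws",
    "Z00000000000000000000",  # placeholder alias-target zone ID
)
```

Every other VPC keeps resolving travel.myapp.com to the load balancer until you're ready to flip it there too.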
So the summary of that is: VPC Lattice works with your existing architecture.
Service discovery is nothing fancy, it's DNS.
The real power of it though is that the DNS is local.
You're not using DNS as like a failover mechanism or something like that.
So you're not really super concerned with TTLs and things like that.
The IPs could change so you still want to use DNS, it's just that it doesn't happen very frequently.
You can migrate clients and services independently.
This is just an abstraction, right?
So you can just do this at your own pace.
You can do it for some VPCs, you can move services and clients independently, whatever makes sense to you.
External connectivity.
I'm gonna kind of gonna pick up my pace a little bit.
So Ingress another super popular topic.
This one's great so a lot of customers, super common pattern is, I got a lot of VPCs, I got a lot of applications living in their own VPCs, how do I do central Ingress?
Because dealing with internet gateways on every single VPC is kind of a nightmare.
How do I do this?
And so customers have moved to this model of having a centralized VPC and handling Ingress there.
Same thing works here.
The only difference though is that VPC Lattice does require the connection to come from a VPC that's been enabled for VPC Lattice.
Okay, so we have to have something there and that's what I'm showing here.
We've got an Elastic Load Balancer, something like an ALB or NLB, both work just fine, and then I've got some sort of EC2 or something running a proxy.
This can be your favorite proxy of choice.
It can be a static proxy that just has an auto-scaling group, just turning the request around; you have to have something there today.
Okay, so client on the outside connects to the ELB, it runs to the proxy that's just making a connection and then all of that network connectivity is solved for you.
You can do your authentication, authorization and literally cut off all Ingress points to the actual application VPCs.
Okay, another really popular example, can I put a firewall in front of it?
Absolutely, like this is just typical VPC architecture.
You could put your firewall in there; you could use AWS Network Firewall, or you could use your favorite firewall vendor of choice with a Gateway Load Balancer, and it just works out of the box.
AWS Shield, AWS WAF, same thing.
Just put that right in front of your ELB and now you have that on Ingress as well.
This is fully supported today.
What about multi-region?
What about Ingress where I might wanna dial connections from one region to the other, or I might be doing it for availability or performance reasons.
I can do the same thing with Global Accelerator.
Global Accelerator is integrated with ELB.
I can do that same exact architecture, cookie cutter, replicate the service and service network in multiple regions, and then use global load balancer...
Sorry, Global Accelerator to shift that traffic around as you see fit.
If you're interested in this topic, and you're gonna see a couple of these as we kind of wind down.
If you're interested in this topic, definitely check out the blog by Adam and Pablo, colleagues of mine that just did a fantastic job on writing about this architecture pattern.
They've got cloud formation templates in there that'll actually spin this up for you with ECS Fargate and NLB.
And the ECS Fargate task is just running a very, very simple and basic Nginx proxy.
So out the door you can kind of do this.
This same design is also pretty popular for multi-region connectivity.
So if you've got connectivity from a VPC in one region to the other, it doesn't have to be a user outside on the internet.
So definitely check that out.
Another pattern is the serverless model.
And so this was actually from a customer during our preview that was trying to solve this problem.
They were just like, "Hey, I'd like to do Ingress."
They did this on their own without saying anything and I was like, "Hey, how did you do that?"
And they're like, "Oh, I just whipped together an API gateway 'cause I didn't wanna have a VPC on the front with an internet gateway, and I had a Lambda proxy."
And that Lambda proxy is just a Lambda function, a private Lambda function that has an ENI in a centralized VPC.
Even the Ingress VPC doesn't have an internet gateway on it.
Like super, super powerful.
And the cool part here is that Lambda proxy can be whatever you want.
So we have a couple of examples in a blog I'm gonna show a QR code for in just a second.
But you can do header manipulation here, you could do whatever you want here.
Like it's very, very powerful to be able to do something like this.
And it's completely serverless Ingress.
This is the blog that talks about it.
Definitely recommend checking it out.
It's relatively new, but it's got a lot of traction, so it's definitely worth the read on this one.
Again, when these are posted, you'll be able to get them if you're not snapping the pictures on time.
Okay, that is the end of my kind of architecture patterns.
I do wanna leave you with a couple more follow up items really to kinda show you like how do we get ahold of us, how do you keep your education going, how do you get started?
All that kind of stuff just to kind of wrap things up.
So we've developed a whole bunch of workshops in Workshop Studio.
This is a fully guided workshop.
So you can go in, you can play around with the labs.
We've got ones for ECS, we've got ones for EKS, we've got ones for Lambda.
I think there's a bunch of different ones in there that'll actually walk you through how to do it.
So you can get your hands in there, kind of feel it and see for yourself.
Sometimes it's useful to kind of do that kind of stuff.
So Workshop Studio link is in that QR code.
I think there's four of them in there today.
We're adding more over time.
VPC Lattice blogs, there's a ton of them out there.
I highlighted a few of them.
These are the ones that I think that we've kind of talked about today that might be good further reading that go a little bit more in depth.
It's always that balance.
This is a 300 level talk and I'm like, I'm gonna go over some folks' heads and I'm gonna go under some other folks' heads.
But the one that we didn't talk about, so there's the IPv6 adoption one that shows you how to do the migration.
You could pair that with the Cables2Cloud podcast.
You've got the one on how to build serverless VPC Lattice connections on the top, and then the bottom one's a really interesting one: it's how you actually integrate VPC Lattice with your VMware environments.
So this one is a very purpose-built and kind of interesting solution that if you're dealing with VMware workloads, it's a great way to kinda get started.
I also wanna highlight a couple of our more popular videos we've done online.
I've put together a YouTube playlist.
I'll be adding things to this over time.
That's the one on the right.
There's a whole bunch of things that I've just found.
So not all of them are from Amazon too.
There's a bunch of other podcasters and people that have just put together Lattice demos that I thought were really interesting.
And so I put them in there and I'll keep adding stuff over there over time.
The one on the left is the routing loop on Twitch.
And if you don't watch that already, even if you're a developer, you should go watch it.
Really funny moderators on there, really good selection of content that they have on there.
But this one specifically is a really powerful one.
There's a blog we just released a couple days ago.
It's not the QR code for that, but it's a way to actually use tags to remove the humans a little bit and automate all the service associations and service network to VPC associations.
So that's definitely a video you might wanna see where we're showing a demo of how you can do that, and now there's a blog out there that shares the code of how to do it as well.
And then last but not least, the Gateway API controller is now GA.
So this is for Kubernetes workloads.
It's an open source controller.
You can use this to automatically do all your VPC Lattice stuff without actually learning VPC Lattice.
You can just use the native Kubernetes APIs to go and do all this stuff for you.
So definitely join the community, we'd love to see any issues that you find.
Issues being like, if you'd like to see new features, new functionality, you can go ahead and create a PR and we will definitely be watching that.
And if you're interested in getting an overview of the Gateway API controller, there are two blogs dedicated to just the API controller and Kubernetes integration by itself.
Okay, and lastly there is one session I would recommend if you're hanging around, if you're one of the hardcore crew, definitely go to Alex and Matt's talk.
The wizard talk in breakout session format is always on Friday, and it's the advanced VPC design and capabilities session.
Alex and Matt did a fantastic job the last couple years and they always do a good job.
VPC Lattice will be highlighted in that as well as all of the other typical VPC networking services.
And so definitely check that out if you're still around; that's NET306.
So thank you, and please remember to let us know how you liked the session, even if it's just to say hi.