What is IS-IS?

By The Art of Network Engineering

Summary

Topics Covered

Why ISIS is the superior routing protocol for modern networks
BGP is the internet's 'trash can' protocol
ISIS's Layer 2 origins and its advantage over OSPF
The decline of deep networking knowledge and its consequences
ISIS: A simpler, more reliable 'electric engine' for networks

Full Transcript

This is the Art of Network Engineering, where technology meets the human side of it. Whether you're scaling networks, solving problems, or shaping your career, we've got the insights, stories, and tips to keep you ahead in the ever-evolving world of networking. Welcome to the Art of Network Engineering podcast.

My name is Andy Lapteff, and in this episode, I am joined by luminaries, kings of their field, if you will. We're going to start with Mr. Michael Bushong. Hi, Mike.

>> Mike, you're missing your crown. I'm looking.

>> [laughter]

>> No, I'm sitting on it.

>> The wheels have already fallen off. Hi, Mike. Have you said hi yet? [laughter]

>> Hey, Andy. Thrilled to be here.

>> Thank you for stepping in. We had somebody dip out and Mike jumped in. So, thank you. Russ White, sir, how are you? It's great to see you.

>> I am a year older. [laughter]

>> Russ, I looked up the last time you were on the show and it was almost three years ago. It'll be three years in January. So, that's been way too long. And I listened to the episode and I learned more stuff. So, you're one of those people that I learn something from every time I talk to them.

>> Anytime you want me on, you know how to find me. I'm always around doing something.

>> Listen, pal. I'm still waiting for my Hedge invite, buddy. [laughter]

>> Podcast, that's doable, too. [laughter] You know, that's fine.

>> No,

>> I've been trying to get Mike on the Hedge. It's always a pain to get him. You know, he's traveling so much, it's hard to get him scheduled. I love your show. I love listening to you communicate about networking. It's always a fantastic time.

So, this episode is going to be me kind of picking a fight with you about ISIS. And I say that lovingly. Kevin Myers, a friend of ours, we did a BGP episode and it blew up and people were in love with it. And I thought, "Wow, people really want to talk for an hour about a protocol. That's interesting to me." But it was a good conversation.

And we were beating up on BGP. I think he quoted you as having said BGP is the trash can of the internet, or the trash can of the data center, whatever it was. Right? Is that the quote?

>> Yes, it's the [laughter] internet, right? We just throw BGP on everything.

>> And so somehow in that conversation during the BGP episode, ISIS came up. And I'll be honest with you, in all my years, I mean, I have certifications from multiple vendors, I've managed the things and studied the things, and I've never come across ISIS. So the framing here is it's a design discussion, I think. And really, why do we need ISIS, right? I've never used it and I've managed Fortune 50 global data centers, so it must not be [laughter]. Right? I'm being provocative, right? Like, we don't need this thing. Change my mind. But I'd like to talk about routing protocols. Mike and I were talking earlier and we were thinking about some different scenarios or analogies. Like, we don't think about what the pipes in our house are made of, right? I really don't care. I just want the water to run. And really, if we think about networking infrastructure, it's just plumbing, right? Highways are another analogy. But I don't know if I care what the pipes are made of. I don't know if I care what routing protocol you're running. They all kind of do the same thing. I mean, I've run EIGRP, OSPF, BGP, right? EVPN-VXLAN. To me, they all kind of do the same thing. Now, I'm provoking you because I know you're Mr. Design and you're going to tear me apart, but...

>> No, it's okay.

>> I don't know where we want to start. Do you want to start with why you would pick one routing protocol over another? Like, who cares which one you use? But I really want to jump into ISIS because I know nothing about it.

>> Okay, so I'll begin here. First of all, BGP. Tony Li got really mad at me one time when I said this [laughter] because I said it in his hearing. BGP to me is less of a pure routing protocol and [clears throat] more of a policy distribution system that happens to do loop-free paths sometimes. I mean, 99% of the time it does loop-free. [laughter]

I don't know if 99% is the correct amount of time, but BGP is slow, extremely slow, and extremely policy rich. If you ever throw, I don't know, 120,000 routes at BGP in a dense topology and try to pull the routes out, you will find that, unless you do the correct design work, either it will not converge ever or it converges very slowly, like minutes.

>> But that's by design, correct? Because we don't want flaps on the internet to change all the things every minute.

>> Well, it's because it's not a distance vector. It's a path vector protocol. And it's just, as Radia would say, routing by rumor. It takes a long time because of the way it works, because it depends on TCP, and because it has a lot of overhead in dealing with all those 13, or whatever they are now, steps. We keep adding to the steps in the best path process.
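[Editor's sketch] The "path vector" and "best path process" ideas mentioned here can be illustrated with a few lines of Python. This is a deliberately simplified subset and not any vendor's real decision process: it only checks the AS path for loops and then compares local preference, AS-path length, MED, and IGP metric. Real implementations have more steps and only compare MED under specific conditions. The `Path` class, its field names, and the AS numbers are all assumptions made up for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """One candidate path to a prefix (hypothetical, simplified model)."""
    next_hop: str
    local_pref: int
    as_path: tuple   # AS numbers the route has crossed, nearest first
    med: int
    igp_metric: int  # cost to reach next_hop inside our own network


def loop_free(path, my_as):
    # Path-vector loop prevention: reject anything that already crossed us.
    return my_as not in path.as_path


def best_path(paths, my_as):
    """Pick a winner using a simplified subset of the usual tie-breakers."""
    candidates = [p for p in paths if loop_free(p, my_as)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda p: (
            -p.local_pref,    # higher local preference wins
            len(p.as_path),   # then the shorter AS path
            p.med,            # then the lower MED (simplified)
            p.igp_metric,     # then the closer exit
            p.next_hop,       # final deterministic tie-break
        ),
    )


paths = [
    Path("10.0.0.1", 100, (65001, 65002), 0, 10),
    Path("10.0.0.2", 200, (65003, 65004, 65005), 0, 10),
    Path("10.0.0.3", 300, (65000, 65009), 0, 5),  # contains our own AS
]
winner = best_path(paths, my_as=65000)
print(winner.next_hop)
```

Note that the path with the longest AS path wins here because policy (local preference) is evaluated before path length, which is the "policy first, shortest path second" behavior being described.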

>> You say relies on TCP like that's a bad thing.

>> Well, it means it's a bit slower because you're actually one step up, right? I won't even say that it's more reliable, because it's really not more reliable than flooding in OSPF or EIGRP or ISIS, but it is more multihop reliable. ISIS doesn't do multihop, and you would really struggle to do multihop with ISIS.

>> How did it get in the data center? Because it was supposed to be for the internet. It wasn't supposed to be an IGP, right?

>> Right. Because Petr Lapukhov said, "I need something that'll scale to 120,000 or 200,000 routes, and I need something that'll do traffic engineering." And at the time he came up with these requirements, number one, nobody believed that ISIS or OSPF could do 120,000 or 200,000 routes. There's always been this thing that these IGPs can't scale. Now, you can't do 1.2 or 2 million routes, but you can do 120,000 routes in ISIS or OSPF.

>> Following what he... I mean, at that point, though, a bunch of people

standardized on it. So, I mean, it's been proven that it works. And so the question is: if the network works, if the data center works, if it scales, if it ain't broke, does anybody care at this point? I mean, I know that you care, but does the average network architect?

>> I think Mike's asking why I care.

>> Because I've asked you before. But to me it's not just ISIS, right? I mean, there's the whole RIFT discussion. I mean, there's like a...

>> And the commercial challenge with all of that stuff for people who build product is that the set of people who care is infinitesimally small compared to the number of people who don't. Which raises a question: if the average, or even, let's say, above-average person doesn't see much difference, or enough difference, to go and upend their architecture, should we spend time on it at all?

>> So, a couple of things. The first is that most of the time when I sit down and spend time with people who are doing

large-scale fabrics or mid-scale fabrics, if I explain to them my reasoning, most of them will go, "Oh, I agree, we probably should run an IGP separate from BGP." But then the second answer you'll get immediately is, "But nobody in my NOC knows how to do anything but BGP. So no matter how much better I might think the design might be, I don't really care, because my NOC can't support it. They can't support anything but BGP." Now, I have honestly very little sympathy for that answer, but that's me.

>> Hold on. Is there anything more complex than BGP when it's doing everything that it does?

>> So, I have two problems with using BGP for all of it. The first is we are twisting... this is why I say BGP is the garbage can of the internet. BGP was designed so that you had to manually configure neighbors, because I'm talking to a customer, I'm talking to another network that I don't know anything about, and it's very policy driven. So, I want to be very intentional.

I mean, even the way that BGP is set up by default, which wasn't always this way, by the way, which is why there are a lot of implementations that don't do this, but the RFC changed it so that you have to have policy configured to advertise something. Everything is meant to be intentional. It's meant to be slow. There you go, Andy. It's not supposed to converge fast. It's supposed to converge slow. It's supposed to be very intentional. And then we throw it in a data center and we go, "Oh, but it needs to converge faster." So, you know what we're going to do?

>> BFD.

>> We're going to change everything about the way it works. We're going to configure it so it looks like fancy RIP. That's literally what we do. We're going to make it so the neighbors come up and you don't need to manually configure any peering sessions. Now, at some point, you're like, "This is a completely different protocol. Why am I building a completely new protocol on the same packet formats just because I can call it BGP?"

>> Is it because the people are familiar with it? Like, to Mike's point earlier, who cares? And you said the NOC knows BGP. I came up in a NOC, and if you told us, who all knew BGP, that now we all have to learn ISIS and make it reliable...

>> That's a lift, right, for people who are working 32 tickets a shift.

>> Yeah, it is. The beauty of splitting it, of having an IGP and an EGP, which is exactly the way we used to design transit networks, is that I separate my infrastructure from my workload. And today in the data center using BGP, my infrastructure and my workload are blended together in a single table. And people play games. "Oh, I can make the infrastructure eBGP and the workload iBGP," or the other way around. Various people do different things with this. But to me, having them in separate protocols makes a lot more sense because...

>> The question is, is that to constrain failure domains, like you said earlier?

>> Yeah. Yeah.
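[Editor's sketch] The separation described above, where infrastructure routes live in the IGP and workload routes live in BGP, can be pictured as two tables with a recursive lookup between them. The sketch below is only a model of that idea and assumes a lot: the table contents, interface names, and the `resolve` helper are invented for illustration, and real routers do this in the RIB/FIB rather than in dictionaries.

```python
import ipaddress


def longest_match(table, address):
    """Return the most specific prefix in `table` that covers `address`."""
    addr = ipaddress.ip_address(address)
    matches = [p for p in table if addr in ipaddress.ip_network(p)]
    if not matches:
        return None
    return max(matches, key=lambda p: ipaddress.ip_network(p).prefixlen)


def resolve(dest, overlay, underlay):
    """Resolve a workload destination to an outgoing interface.

    overlay:  workload prefix -> next-hop loopback (what BGP would carry)
    underlay: loopback prefix -> outgoing interface (what the IGP would carry)
    """
    workload_prefix = longest_match(overlay, dest)
    if workload_prefix is None:
        return None
    next_hop = overlay[workload_prefix]
    infra_prefix = longest_match(underlay, next_hop)
    if infra_prefix is None:
        return None  # next hop unreachable: the workload route is unusable
    return underlay[infra_prefix]


# Hypothetical tables for a single leaf switch.
underlay = {"10.0.0.1/32": "eth1", "10.0.0.2/32": "eth2"}
overlay = {"192.168.10.0/24": "10.0.0.1", "192.168.20.0/24": "10.0.0.2"}

print(resolve("192.168.20.7", overlay, underlay))
```

The point of the design is visible in the failure case: if the underlay loses a loopback, every workload route behind it becomes unresolvable without a single overlay update, and the two tables can be reasoned about (and secured) separately.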

>> Yep. Exactly. That's one reason. Another is not just failure domains, but also security domains. My attack surface is completely different. And one of the reasons I like ISIS in this role is because ISIS is not IP based. So you cannot send multihop ISIS packets.

>> Let's take your segue, because when I looked up ISIS earlier with my new teacher, ChatGPT, the first thing I saw that exploded my brain was that it's a layer 2 protocol, and I refuse to understand or like that fact. Not that anybody cares what I like. How the hell is a layer 3 routing protocol operating at layer 2? Please explain, and why.

>> So originally the thinking was exactly the opposite. One of the big concerns about using DANE, which you probably don't know what DANE is, but...

>> Essentially, when RPKI came out, one of the things I was working on when I was at Verisign was: why are we building this other database? We have this database called DNS. It's already distributed. It's fast enough. I can add a new DNS record, just like I have a CAA or a CERT or an HTTPS or whatever. And I can just add a new one that is the X.509 certificate for origin authentication and stick it right in DNS. So now the router can just query DNS, and it's going to get a cached response back with the correct origin stuff, and it's just going to work. And people were like, "No, you can't do that, because you can't have routing depend on DNS when DNS depends on routing."

>> It took 13 minutes for me to get lost. So I did pretty well. That's good for me [laughter] following Russ. So, what does this have to do with building it on layer 2? We went to RPKI and DNS and Crazy Town.

>> Okay. So, when ISIS was invented by Radia and Mike Shand and...

>> Radia helped invent it?

>> Yeah. Radia was on the team. Yeah. With Mike Shand.

>> I should have known the person responsible for spanning tree would invent something as terrible as ISIS.

>> Oh, well, it's...

>> I love her. I'm teasing. We're going to have her on.

>> It's even funnier than that, because she invented routing before she... I shouldn't say she invented. She worked on routing before she did spanning tree.

>> Yeah. Yeah.

>> Spanning tree, she didn't want to do it. She does not like spanning tree.

>> I know she doesn't like Ethernet. She doesn't like spanning tree. It's my favorite part about her. The thing she invented she calls garbage, which I really think is...

>> She likes Ethernet.

>> But she still... like, part of the reason she did TRILL, she and I did TRILL many years ago, tried to do TRILL, was to get back to the original concept of routing even at layer 2. Don't do this spanning tree stuff. She's not ever been a huge fan of that.

>> So you were there. Let's leverage that. Why was the decision to create another interior gateway protocol made? I guess OSPF wasn't doing something they needed it to do.

>> Nope, it's the other way around. So ISIS was designed first.

>> Even really before spanning tree to some degree. And the reason it was layer 2 was because, again... first of all, ISIS is not an IP protocol, right? It was designed for CLNP and CLNS. It was not designed for IP at all.

>> You [clears throat] know what that stuff is? That was just before IP, right? It was transport stuff before IP?

>> It's parallel. It's parallel.

>> It's at the same time. OSI stuff, all these ISO protocols. At that time, okay, there was Banyan VINES, which was a layer 2 protocol that ran on top of something called VIP, which was its own version of IP, and there was Novell NetWare, which ran on top of something called IPX, which was an analog to IP, but it was essentially a layer 2. And so the thinking was you don't want a layer 3 protocol dependent on layer 3. If you're going to carry layer 3 connectivity, it needs to be in a layer 2 protocol.

>> Mm.

>> And so that was, I mean, that's part of the reason.

>> That wasn't the first IGP, right? Again, why did they create ISIS? What was it for?

>> So the first IGP was what was called the Hello protocol, very similar to RIP but without the hop count limit, and it crashed. It crashed bad. So they had a flag day and they replaced it with a link-state protocol. Not ISIS, but a link-state protocol. And that developed eventually into the OSI, the ISO protocols, and everything else that resulted in ISIS.

>> So then why did OSPF win?

>> So, first of all, I don't know that OSPF has won, but that's another...

>> Everybody knows it. You and Radia are the only two people, and three other people at hyperscalers, that know [laughter] ISIS.

>> Yeah. Well, I don't know. I mean, most of the large-scale networks that I've ever touched, other than perhaps AT&T... I think AT&T is probably the lone standout on OSPF, but Sprint has always been...

>> But the question's the same. So ISIS, you create this separation, right? Separation of concerns, and then that gives you some measure of resiliency, right? So that's a positive.

>> Mhm.

>> Presumably, it probably didn't matter at the beginning, but at some point you'll get different scaling characteristics as a result of that. So, okay, but then OSPF comes along. So why create OSPF if ISIS is the answer? And just so you know where the story arc is going: if you create OSPF because ISIS isn't getting there, and then you use BGP because OSPF wasn't getting there,

>> how do you end up back at ISIS?

This episode of the Art of Network Engineering is sponsored by Meter. Meter delivers network infrastructure for the enterprise because every organization deserves seamless connectivity. Whether you have a large team of network engineers or an IT team of one, Meter makes it simple to get online and stay online. Meter provides a full-stack integrated platform that combines hardware, software, deployment, and support so enterprises can ensure their networks have the performance, security, and reliability they need without the inefficiencies of juggling multiple vendors. It's enterprise networking reimagined, giving IT teams the ability to spend less time managing complexity and more time driving business strategy. Go to meter.com/aw1 to book a demo now. That's me ter.com ne. Now back to the show.

>> Okay, so OSPF was created because way back in the day, most processors were little 8-bit processors and they didn't like TLVs at all.

>> Can you tell me what a TLV is, please?

>> Yeah, a type-length-vector or a type-length-value, depending on how you want to say it. So essentially, on the wire, I can tell you what I'm going to tell you as part of the packet. If you think about it as a grammar or dictionary, I can inline the dictionary or the grammar. I can say, I'm going to tell you this word, and I'm going to give you the context of how I want you to understand this word, inline.

>> Is it a header bit that notifies you what's coming?

>> Yes.

>> Frame. Yeah,

>> correct. It's a header that tells you.

Right. Right. So think of Spanish or Greek. Greek in particular is a big one for this, and the conjugations in those languages, right? Like in Greek, you know if it's a female or a male verb. Why would you ever care about such a thing? Because you want to know what word in the sentence the verb applies to, or the adjective or the adverb applies to. And you do that with gender differentiation and other ways of doing it. You don't do it by word order. In English, we do everything in word order. Where is my subject? It's before my verb. Where's my object? It's after my verb. Greek doesn't work that way. Hebrew doesn't work that way. A lot of languages don't work that way. The ending on the word tells you this is the subject. It could be the last word in the sentence, could be the first, doesn't matter where the position is. So, this is the way a TLV works. I don't care where it's positioned in the packet. I'm telling you, by conjugating it effectively, this is your metric. Your metric is going to be six bits wide. And it doesn't matter where I put the metric in the packet, because everyone knows TLV number one, whatever it is, 133, 132, that's going to be the metric. So this is great from a flexibility perspective, and maybe we'll have time to talk about OSPF versus ISIS in their flexibility as protocols. But the guys who designed OSPF had two goals. Number one was strip the TLVs, because they're bits on the wire. I don't want the bits on the wire. I want the protocol... I mean, look, when I was in the Air Force, we did an entire white paper taking a sniffer and measuring the efficiency of sending an X-megabyte file, 10 or 20 megabytes, because at that time we weren't doing gig files, over VINES versus IPX with NetWare versus IP versus OSI. We actually did a white paper where we measured which one put more data on the wire, because we were running one and two and three meg per second networks and we were running inverse multiplexers on T1s and...

>> So chatty protocols weren't great.

>> Yeah, exactly.

>> Back when you're running a fractional T1 at 64K, that mattered, where now we're running multi-gig, who cares, right?

>> So at the time ISIS was considered a much chattier protocol. We want to make a less chatty protocol. So that was one reason.

>> For bandwidth constraints, and also, like you said, the CPUs that were in... handle all the...

>> That's correct. Yeah.
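[Editor's sketch] The TLV idea described above, where each field announces its own type and length so that its position in the packet does not matter, can be shown with a tiny encoder and parser. This is a minimal sketch assuming the one-byte type and one-byte length layout that IS-IS TLVs use; the function names are invented, and the type codes (137 for a hostname, 132 for an IPv4 interface address) are used here only for illustration.

```python
def encode_tlv(tlv_type, value):
    """Build one TLV: 1-byte type, 1-byte length, then the value bytes."""
    if not 0 <= tlv_type <= 255:
        raise ValueError("type must fit in one byte")
    if len(value) > 255:
        raise ValueError("value too long for a one-byte length")
    return bytes([tlv_type, len(value)]) + value


def parse_tlvs(data):
    """Walk a byte string and return a list of (type, value) pairs."""
    tlvs = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, length = data[offset], data[offset + 1]
        end = offset + 2 + length
        if end > len(data):
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, data[offset + 2:end]))
        offset = end
    return tlvs


hostname = encode_tlv(137, b"leaf1")
address = encode_tlv(132, bytes([10, 0, 0, 1]))
future = encode_tlv(250, b"\x01\x02")  # a type this parser has no meaning for

# The same information in two different orders parses to the same content.
first = dict(parse_tlvs(hostname + future + address))
second = dict(parse_tlvs(address + hostname + future))
print(first == second)
```

Because every element carries its own length, a receiver can step over a type it does not understand and keep parsing, which is what makes adding a new TLV cheaper than redefining fixed packet fields. The cost is the extra two bytes per element, the "bits on the wire" objection raised in the conversation.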

>> So is your argument that that's all marginal now, and so then the constraints...

>> Well, way marginal at this point. And what's happened over time is, if I want to support IPv6 in OSPF, I have to build a totally new protocol. The whole protocol is built around the v4 address space. All of my fields and all of my packets are fixed length and set to 32 bits or whatever.

>> So with BGP you could add an address family, and with ISIS you could do a TLV, but with OSPF you have to refactor the whole...

>> Yes. Exactly. So what's happened in the long run is OSPF has become a very complex protocol. I mean, you have however many different packet types and everything else, and now they've added TLVs to it. Whereas with ISIS, the biggest complexity as far as packet format goes is, "Oh, we messed up and put in six-bit metrics. Oh, we need bigger metrics than that. Okay, so we'll create a new TLV set to do these new metrics."

>> So I'm a network operator and I want to add a new TLV type, let's say to support v6 or something cool that's new, RoCEv2, whatever it is. Is that an upgrade of the code? How do you upgrade a TLV? Does it come with the next version, or is it a command?

>> Version of the code. I mean, you would have to...

>> You have to upgrade.

>> Yeah, you'd have to create a [clears throat] new TLV type, get it standardized, or run it as an experimental or an opaque or whatever you want to do.
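[Editor's sketch] The six-bit metric problem mentioned above is easy to see with a little arithmetic. The sketch below compares the original six-bit ("narrow") link metric with the 24-bit link metric carried in the later extended reachability TLVs ("wide" metrics). The helper names and the reference-bandwidth costing scheme are assumptions for illustration only.

```python
NARROW_BITS = 6   # original IS-IS link metric width
WIDE_BITS = 24    # link metric width in the extended reachability TLVs


def max_metric(bits):
    """Largest value an unsigned field of `bits` bits can hold."""
    return (1 << bits) - 1


def fits(metric, bits):
    return 0 <= metric <= max_metric(bits)


def cost_from_bandwidth(link_mbps, reference_mbps=100_000):
    """Hypothetical costing scheme: a 100G link costs 1, slower links cost more."""
    return max(1, reference_mbps // link_mbps)


for name, mbps in [("100G", 100_000), ("10G", 10_000), ("1G", 1_000)]:
    cost = cost_from_bandwidth(mbps)
    print(name, cost, fits(cost, NARROW_BITS), fits(cost, WIDE_BITS))

# A wide metric occupies three bytes on the wire.
print(cost_from_bandwidth(1_000).to_bytes(3, "big").hex())
```

With only 63 usable values, a narrow metric cannot even distinguish a 1G link from a 100G link under this kind of scheme, which is why the fix was to define new TLVs with wider fields rather than to change the existing ones.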

>> Who drives that? Is that the IETF? Like, where would the new TLV... Yeah. Okay. They get together. They say...

>> That's in the LSR working group. Correct.

>> If the protocols, let's say, evolve to overcome limitations and the load that those limitations bring, and then you end up removing the limitations after you've already moved on to whatever the new protocol is, is there still justification to then make another change? And let me give you kind of a weird example. Cloud comes out. Every board in the world gives their CEO a cloud directive: you must go to cloud. How are you getting to cloud? Companies all over the world perform the great lift and shift. They spend a bunch of money. They refactor their applications or rewrite them entirely. Now their app, instead of running on prem, runs in the cloud. There's no difference in what the app does. Maybe it's marginally less expensive in the short term, and then cloud proves out to be not quite the economic benefit that everybody thought it would have. Do they move their apps back? They don't. They just leave them where they are, because you're like, "I've already incurred the cost once. The benefit is marginal. I'm not going to go back."

>> Yeah. Guess it would depend on how marginal the benefit is, right?

>> So, do you think that the benefits are large enough? I mean, because you are like an ISIS zealot, right? I mean, you are [laughter]...

>> I wouldn't say I'm... Yeah. Okay. Go ahead.

>> Well, okay, maybe not a zealot, right? You're like...

>> I don't...

>> I mean, you write on your sneakers "I heart ISIS" [laughter], and you were practicing your last name, like, "I'm going to be Mr. ISIS." You have a love affair with ISIS. So do you think that the benefit is there? And then I'm going to set you up. This is a trap, so just be aware you're about to step into a trap,

>> because I've got a follow-up question once you answer this one.

>> Okay, that's fine. So ISIS was almost dead, honestly. Almost no one was using

it. It was very, very few people. But several things happened, generally around the age of the explosion of the dot-com bubble and the growth of transit networks, Sprint and SprintLink in particular, Peter Lothberg and his work in particular. First of all, the original implementation of OSPF in Cisco IOS, old classic Cisco IOS, was marginal. It was okay. It was fine. I don't want to put anybody down or anything, but it was not the most optimal implementation as far as shortest path first, flooding, all the components of it. They were there. There was literally a thing on the Cisco online website, that Don Slice and I had to go and spend six months proving was incorrect to get it taken off, which was: you should never have an OSPF flooding domain or an area larger than 40 routers. We had so many problems in TAC, in global escalation, with flooding domains larger than 40 routers in our OSPF implementation. They literally put it in the documentation: no flooding domains over 40 routers, no areas, like, everything beyond it is going to be redistributed, blah blah blah, all this other stuff.

>> Is that a scale constraint of OSPF, just full stop?

>> It's a scale constraint of that particular implementation on that particular hardware.

>> It's not the protocol itself. Right? So I'm being very careful here. The distinction between implementation and... and the reason they had to remove it was because they took an implementation detail, and because they had so much weight in the industry in terms of what was accepted and not, a post designed to reduce TAC calls turns into a pseudo-standard.

>> That's exactly what happened. That's right. And so during this time, Sprint and all these other large providers, these transit providers, they were throwing 1,000, 2,000, 5,000 routers in their networks. Okay, you design an OSPF network with a thousand routers with a 40-router limit in your flooding domain. Good luck with that, because that ain't going to happen. You just hit practical reality. So they just tried ISIS. Well, it just happens to have turned out that Tony Li wrote the original ISIS implementation, and there were some very good coders working on it, Henk Smit and some other people. So the ISIS implementation was beyond excellent. It was such a good implementation.

>> Does the ISIS documentation lead with that? Like, this is better for scale?

>> No, not at all.

>> Was that an accident, Tony such-and-such? Like, how did they know to do that? They just tried it?

>> They just tried it.

>> Let me jump in with a question. So, do you like ISIS because it's a better protocol or do you like it because it was implemented better?

>> I like it because I think it's a simpler protocol.

>> Simpler. Okay. So, simpler is doing a lot of work for you there.

>> Yes.

>> Unpack simpler.

>> It's rare networking, right?

>> Yeah. So, hang on. So, they basically figured out we can do five to six thousand nodes in a single ISIS flooding domain. You don't even need multiple flooding domains in ISIS necessarily. It depends on your route count and your processors and...

>> It's like an area, right?

>> Yeah, it's an area. It's the same.

>> Can you connect ISIS areas?

>> Yeah, yeah. It's exactly like OSPF. You have an ABR, blah blah blah. There are some critical differences, by the way, where again I prefer ISIS's implementation. So ISIS is a simpler protocol in that I have TLVs. I don't have all these types. I'm not going, "Oh, it's a type five, and it hits an ABR, and the type five generates an E bit and a type four with a type whatever it is with this..."

>> Types, and the LSP I've created.

>> I don't know that stuff, right?

>> But aren't TLVs just a different route type?

>> No.

>> Isn't it the same thing?

>> No.

>> No. Everything is also backwards in ISIS from OSPF in some ways. So for instance, in OSPF, when I hit an ABR, my default is I create a type two, which essentially says everything that I get in my type one, two... Yeah, it's not a type two, it's type three. See, this is why it gets... It's a summary LSA. So I create this summary LSA, and I say everything that's a network LSA or a link LSA or a router LSA gets summarized. And the way it gets summarized is I take the metrics and I attach them to the type three, the summary, as if all those things were connected to me, right? That's what I do.

>> Which seems like a good idea, because you're summarizing routes, which reduces the routing table, right? I mean, this is all good.
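[Editor's sketch] The ABR behavior described above, advertising each prefix "as if all those things were connected to me", can be modeled in a few lines. This is a sketch under stated assumptions, not OSPF itself: the function names and the cost numbers are made up, and a real summary (type 3) LSA carries more than a prefix and a metric. It also shows that, without a configured range, the summary LSA hides the area's topology but still carries one entry per prefix.

```python
def originate_summaries(abr_intra_area_costs):
    """ABR side: one summary entry per prefix, carrying the ABR's own cost.

    The topology inside the area (which routers and links sit between the
    ABR and the prefix) is dropped; only the prefix and a metric survive.
    """
    return dict(abr_intra_area_costs)


def install_summaries(cost_to_abr, summaries):
    """Receiver side: total cost = my cost to the ABR + the advertised metric."""
    return {
        prefix: cost_to_abr + advertised
        for prefix, advertised in summaries.items()
    }


# Hypothetical costs the ABR computed inside area 1 from router/network LSAs.
area1_costs_at_abr = {"10.1.1.0/24": 10, "10.1.2.0/24": 30}

summaries = originate_summaries(area1_costs_at_abr)
backbone_router_routes = install_summaries(cost_to_abr=5, summaries=summaries)
print(backbone_router_routes)
```

The receiving router simply adds its own cost to reach the ABR, which is distance-vector behavior layered on top of a link-state protocol. The number of prefixes only shrinks if an aggregate range is configured on the ABR.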

is all good >> right and so I do it in both directions area zero into my outlying area outlining area into area zero And if I want to only have a quad zero

in my outlying area, I've got to configure all that. I got to make it a totally stubby, I got to think through, am I going to redistribute? Does it need to be a not so totally stubby? Does it

need to be a totally stubby? Does it

need like there's all these different things I got to think about. In ISIS,

you put the same intermediate system in both flooding domains with net statements and it's automatically a totally stubby area. In fact, it's a totally not so stubby area.

It only sends a default from the level two domain into the level one domain and then you summarize just like you do an OPF from the level two one domain into

the level. So your outlying area into

the level. So your outlying area into your area zero.
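The default behavior described here can be sketched as a toy model. This is only an illustration in Python, not how any implementation is written: the class and field names are hypothetical, and the default route is shown as a literal prefix for simplicity, whereas in the protocol a level 1 router installs a default toward the nearest level 1/level 2 router that sets the attached bit.

```python
from dataclasses import dataclass, field


@dataclass
class Level1Level2Router:
    """Toy model of an L1/L2 intermediate system (all names are illustrative)."""

    level1_prefixes: set[str] = field(default_factory=set)
    level2_prefixes: set[str] = field(default_factory=set)
    # Level 2 prefixes the operator has deliberately leaked down (empty by default).
    leaked_into_level1: set[str] = field(default_factory=set)

    def advertised_into_level2(self) -> set[str]:
        # Default: level 1 reachability is carried up into level 2.
        return set(self.level1_prefixes)

    def advertised_into_level1(self) -> set[str]:
        # Default: level 1 only gets a default route, plus explicit leaks.
        return {"0.0.0.0/0"} | (self.leaked_into_level1 & self.level2_prefixes)


router = Level1Level2Router(
    level1_prefixes={"10.1.0.0/24"},
    level2_prefixes={"10.2.0.0/24", "10.3.0.0/24"},
)
print(sorted(router.advertised_into_level1()))  # default only, nothing leaked yet
router.leaked_into_level1.add("10.2.0.0/24")    # leaking is an intentional act
print(sorted(router.advertised_into_level1()))
```

The point of the sketch is the asymmetry: nothing has to be configured to get the default-only behavior, and only the leak is an explicit step.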

>> So, like, everything is backwards. I have to intentionally leak the routes.

>> It's embedded in the logic of the protocol instead of having to turn on the nerd knobs, like

>> OSPF. I just do it and boom, I'm in a totally not-so-stubby area. Like, the most complex thing I can configure in OSPF is the default in IS-IS.

>> So the configuration is simpler. It's

less complex.

>> It's not even just simpler. The protocol

itself is simpler in that way to me.

>> I was thinking... yeah,

>> It's going to do more on its own. So then, Russ, if you had your way, right? You teach classes, you build out curriculum, you educate people on how to do architecture. You're very committed to having people understand the why, because the recipe model of building networks leads to misunderstanding and complete architectural abdication, and then you end up with things where, you know, Amazon West goes down

>> Yep.

>> and an entire company is unreachable because they basically said, "We don't own the architecture. That's Amazon."

>> Yeah.

>> If you were building... let's say you were educating the world today. So let's say that magically we make networking cool again and

>> Alexis wins

>> and then there's 50,000 network engineers that enter the workplace over the next couple of years, and you're going to be training them all. What protocol? You're like, "This is the architecture you should go with." What do you tell them?

>> I would always do an IS-IS underlay with a BGP overlay, personally. I'm not going to say that you shouldn't ever use OSPF. I mean, that's okay if you want to use it. It's just not my preferred way of going about it. It's like using BGP alone for all of your data center fabric. You can do it. I know how to design it. I've done it many times. Is it my ideal? No, it's not really my ideal, because I know where the holes are.

>> But is your position that if people were starting from scratch, learning everything from scratch, they would find IS-IS simpler to deploy and then, almost by definition, simpler to learn?

>> Yes. So when I used to teach OSPF in Cisco TAC, and then I would teach IS-IS, because they insisted I do OSPF first...

>> Okay. So here here's your trap then. You

ready for the trap?

>> Oh no. So hang on.

>> The trap, Russ, >> it's coming. I warned you ahead of time that I was walking you into a trap.

>> So, I would always teach OSPF first, and then when we got to the end of it, people would say, "Aren't you going to teach us IS-IS?" And I would say, "Just forget half of everything you know about OSPF and you already know IS-IS."

>> Who are these weirdos who ask for IS-IS? [laughter]

>> Well, I guess that was always my

>> So, Russ, you're on the witness stand, and so you are obligated by podcast law to speak the truth, the whole truth, and nothing but the truth. If you are indeed in a long-term relationship with IS-IS, [laughter] what's going on with your side piece, RIFT? [laughter]

>> Okay, so I used to say this all the time. RIP is not a routing protocol. It is what you put on your gravestone.

>> No, RIFT. RIFT, not RIP.

>> Oh, RIFT. Oh, I thought you said RIP.

>> No, no, no.

>> Yeah. And actually RIP's a perfectly fine protocol in very small networks, but I never deploy it at scale. But RIFT, okay, so RIFT is interesting. I like Tony P a lot, and I'm actually on the RIFT drafts, I suppose, still. I don't know. Maybe my name is still there. I haven't looked.

>> Humble brag.

>> Huh?

>> Humble [laughter] brag.

>> No, I say this just like all of them. Stop it.

>> Yeah. I just say this to say that I don't hate RIFT. It's not like I'm down on RIFT. I helped design it.

>> I'm looking it up in real time. I have no idea what it is. Routing in Fat Trees.

>> But if IS-IS is objectively better and simpler, why look outside the marriage?

>> So, RIFT is essentially an attempt to make it where you can fire and forget at extremely large scale, because Tony P and I have maybe a small disagreement over the scale to which you can push IS-IS with modifications. Because if you go look up distoptflood, which, Mike, you know all about, I think you can push IS-IS to 5,000 routers with distoptflood in a dense environment and have a lot of routes in it, and it's fine. The biggest problem with doing that is that your top-of-rack switches, which are your cheap switches in most designs, have very small forwarding table sizes. So if I have to throw 120,000 routes or 200,000 routes at this thing, those poor edge switches are going to run out of gas really fast. And not only that, they share their table size with their filtering tables. So Tony's perspective is that, in order to save that memory and make those small boxes viable in a large fabric, you build a protocol that sends nothing but a default down from the spine towards the top of rack. So you reduce the table size. He basically took IS-IS plus distoptflood, and then he said, "Is there a way for me to only send a default to each one of the top of racks?" And he figured out that he could do basically a distance vector from the spine to the leaf and a link state from the leaf to the spine, and make an incredibly scalable fire-and-forget routing protocol. And it gives you all sorts of interesting things.

So distoptflood originally, many years ago, had the ability to calculate which stage I'm in. I could actually figure out, from the router's perspective, "Oh, I am a top of rack. Oh, I am in the spine. Oh, I am in the fabric. I am up at the plane layer." It's not actually that easy. It's mathematically impossible. Ivan Pepelnjak read my draft and he went, "What you're doing doesn't work." And I'm like, "Uh-oh, that's a problem." So I spent some time with Ivan figuring out how to make it work. Thank you.

>> Russ, this is all RIFT, correct?

>> No, no, no. This was before RIFT. This was distoptflood.

>> So, that's where I was getting at. You've mentioned that a few times. I don't know what distoptflood is. Is that a

>> It's a flooding optimization draft for IS-IS and OSPF.

>> Dist-opt. Is that what you're saying?

>> Distributed optimum flooding.

>> Thank you.

>> Yeah. And so he took all of that work and he added it to his distance vector plus link state and came up with RIFT. RIFT has a lot of really cool properties. My concern with RIFT, I suppose, is I'm not really sure we need it all the time. I'm much more of a "just use what's already there." Like you said, Mike, right? It's there. Everybody's implemented IS-IS.
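The split described for RIFT (full reachability flowing north, only a default flowing south) can be illustrated with a small sketch. This is a toy two-tier model under loud assumptions, not the RIFT protocol itself: the function name and the input shape are hypothetical, and real fabrics have more stages and more state.

```python
def rift_style_tables(
    leaf_prefixes: dict[str, list[str]],
) -> tuple[dict[str, set[str]], set[str]]:
    """Return (per-leaf tables, spine table) for a toy leaf/spine fabric."""
    # Northbound behaves like link state: the spine learns every leaf's prefixes.
    spine_table = {p for prefixes in leaf_prefixes.values() for p in prefixes}
    # Southbound behaves like distance vector: each leaf keeps its own
    # prefixes and receives nothing but a default from the spine.
    leaf_tables = {
        leaf: set(prefixes) | {"0.0.0.0/0"}
        for leaf, prefixes in leaf_prefixes.items()
    }
    return leaf_tables, spine_table


leaf_tables, spine_table = rift_style_tables(
    {"leaf1": ["10.0.1.0/24"], "leaf2": ["10.0.2.0/24", "10.0.3.0/24"]}
)
print(len(spine_table), {leaf: len(table) for leaf, table in leaf_tables.items()})
```

The leaf table size depends only on what is locally attached, which is the property that keeps small top-of-rack forwarding tables viable.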

>> Well, it's there for scale, I think. Right? Once you get over that unspoken 5,000-router IS-IS limit.

>> Okay. Well, if you're over the 5,000-router limit, you have other problems.

>> But that's your whole bread and butter, right? Don't you specialize in hyperscale with a bajillion routers? I thought that was kind of

>> Yeah, but if you have a single fabric that's large... and I'm saying that in a single flooding domain,

>> it shouldn't get

>> Yeah, a single flooding domain shouldn't get that big, honestly.

>> When you talked about how big those route tables in the top-of-rack switches were, I got a little confused, like, why are they that big? And then I thought, I assume maybe that's just hyperscale world.

>> Yeah. Right. If you start thinking about a 2,500-router five-stage butterfly, and you start thinking about 120,000 edge ports, all of a sudden I can have 120,000 routes. Now, theoretically, I shouldn't in IS-IS, because what I should really carry in my underlay is my loopbacks, just so BGP can build the BGP sessions and I can build tunnel tails and heads. I should never, ever carry workload routes. My edge routes should not exist. Any of my workload ports should not exist in my underlay.

>> So, you don't think we need RIFT because your network should never get bigger than 5,000 IS-IS routers.

>> I wouldn't say we don't need it. I would say it fits a narrower slice

>> very small

>> number of environments that are going to operate at that scale. [laughter]

>> And Tony, by the way, will disagree with me. He'll say it's the opposite, and that's cool. That's fine.

>> Yeah, but who's right?

>> This is, you know, always an ongoing discussion. [laughter]

>> He wouldn't say that it's widely applicable and all. He would say that the protocol can work at sub-scale, but he wouldn't say that everyone's operating at that same scale and that the same scale requirements have proliferated broadly.

>> Yeah, he really targeted RIFT at the 2,500-to-3,000-router fabric.

>> If you're down at 100 routers or even up to about a thousand routers, I think he would say just use IS-IS and be done with it. The only other thing that was kind of cool about RIFT is the ability to deploy EVPN, kind of a lightweight cheating version of EVPN, and you don't even need BGP. There is a way to do that with RIFT.

>> Blasphemy. We always need BGP. Yeah, I mean, it's cool. How many people in the world, I mean ballpark, obviously not specific, know enough to express a strong opinion about the protocol choices in the underlay? Because you have this discussion, right? But if you go to even very large networking players, the number of people that can have that conversation and actually have an opinion... I mean, a lot of people can recap, you know, here's how a thing works, because they've read about it. But to know it well enough to say, "And here's why I agree or disagree, and I'm willing to debate on it." How many people can do that?

>> It's not very many. And I find that unfortunate. I find that a failure in our network engineering training world, myself.

>> Do you think it gets better or worse over time?

>> I think it's gotten worse over time.

>> Like, from here, at some point, do protocols become sexy again?

>> No, I don't think so. I think we are moving out of that realm. See, the thing I like about running IS-IS as an underlay is you actually don't have to know much other than just basic troubleshooting on it.

It's literally a fire-and-forget protocol. You turn it on, you tell it to run on everything except your workload ports, which is probably... that and figuring out the NET addresses are the two most complex things about running IS-IS. And you just let it go, and it just does stuff, and all of a sudden you have IP reachability, v6 and v4, for all your loopback addresses in all your routing tables on the whole fabric. It just works. It's not very difficult to do. It's actually really simple. I'm building... I probably shouldn't talk about this very much, but I'm building a test topology I've been trying to play with. And the highest configuration in this little, I'll say more than a thousand-router, test topology I'm trying to build is eight lines of configuration and one line of configuration for every interface. I mean, seriously, there's nothing. My BGP configuration is 20, 30, 40 lines of configuration on every box. Eight lines, IS-IS. It's literally like "router isis lab net," or whatever you want to call it. You have to give it a name, a NET address, and, oh, that's it. And then on each interface you say "run IS-IS." Okay, I'm done. And all of a sudden you have a routing table. It's really pretty simple.
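How little configuration that is can be seen by generating it. The sketch below renders an FRRouting-style IS-IS stanza from a name, a loopback address, and a list of fabric interfaces. Everything specific is an assumption for illustration: the instance name `lab`, the area `49.0001`, the interface names, and the convention of deriving the NET's system ID from the loopback's zero-padded octets. It covers IPv4 only, and the command syntax should be checked against your own NOS documentation.

```python
import ipaddress


def net_from_loopback(loopback: str, area: str = "49.0001") -> str:
    """Derive a NET from an IPv4 loopback (a common convention, assumed here)."""
    octets = str(ipaddress.IPv4Address(loopback)).split(".")
    digits = "".join(f"{int(octet):03d}" for octet in octets)  # 12 digits
    system_id = ".".join(digits[i:i + 4] for i in range(0, 12, 4))
    return f"{area}.{system_id}.00"


def render_isis_config(name: str, loopback: str, fabric_interfaces: list[str]) -> str:
    """Render a minimal FRRouting-style IS-IS configuration as text."""
    lines = [
        f"router isis {name}",
        f" net {net_from_loopback(loopback)}",
        "!",
        "interface lo",
        f" ip router isis {name}",
        " isis passive",  # advertise the loopback without forming adjacencies on it
    ]
    for ifname in fabric_interfaces:
        # One line per fabric interface; workload ports are simply left out.
        lines += ["!", f"interface {ifname}", f" ip router isis {name}"]
    return "\n".join(lines)


print(render_isis_config("lab", "10.0.0.1", ["eth1", "eth2"]))
```

With the assumed convention, the loopback 10.0.0.1 becomes the NET 49.0001.0100.0000.0001.00, so the two "complex" pieces (the NET and the interface list) both fall out of data the fabric already has.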

>> You know what's funny, Russ? I'm thinking of how much pride I used to have in my complex BGP configs, with my prefix lists and my complicated route maps and, you know, all kinds of other stuff we're doing, and BFD to make it faster. "We'd better put UDLD on." It was so complicated, and I was so proud of it, because look at what we've done.

>> Oh yeah. Oh yeah. There's a lot of that in network engineering.

>> Yeah, it's cultural, I think. And where I'm going with this is, my peers and I have never come across IS-IS in any of the vendor trainings that we all consume to grow our careers. I don't understand that gap. I don't understand why I've spent a bajillion hours on EIGRP, BGP, and OSPF. And hearing what I'm hearing from you about IS-IS, I don't know why our educational system is doing that disservice, if everything you've said is true, which I believe it to be.

>> I think people are scared of it to some degree. They've heard, "Oh, the NET address is so long. Oh, it's so complex." It's actually a very simple protocol. In fact, the training is to the point that, like... who is it? Somebody and I wrote, I should remember all these things, right? [laughter] a book on IS-IS, which Mike Shand, one of the people who invented it, wrote the intro for. And that is the book that most vendors buy when they want to train somebody on IS-IS. It's on the protocol, and it's like 15 years old now. Nobody ever writes anything about it. It's completely

>> I think it's the story you told about, like, the Cisco OSPF breakpoint with 40 routers. To me that's an interesting symptom of the actual root cause: the number of people who have enough confidence in almost any of the really deep networking topics has declined over the years. And as it declines, you end up with anecdotes and stories and fears that travel further and are more durable than truth. And then it will get worse, because I think we are in a bit of an attention-span economy. I know Russ has talked about this a lot in other forums. If people are no longer willing to invest the time required to learn the thing,

>> Yeah.

>> then they'll never develop the confidence or conviction to do anything different than what is done. If that's the case, right, do the operational concerns outweigh the architectural benefits? And even the operational benefits, by the way.

>> To me that's an interesting question.

And Andy, when you and I were talking earlier... I'll share the same analogy I shared with you earlier. You know, my wife and I, and I only say "we" because I don't want her to feel bad. It was her, though. We bought a new car, I don't know, five years ago or whatever it was. We got a Subaru Ascent, and, you know, I'm not a car guy. I don't know anything about cars. We got this car. It was nice to drive or whatever. Turns out it was the first year of this new powertrain, new transmission, new whatever. And then, you know, does that matter? I don't know much about cars. To me that's plumbing, right? As long as it works. I want the outcome of getting from point A to point B. So I didn't care about the details. And then you know what happened? The car broke down when my wife was driving with our two kids,

>> you know, toddlers at the time, and three dogs, and it broke down on the freeway, and I wasn't there.

>> And then good luck getting a tow and then getting an Uber to pick you up on the freeway to pick up your two kids and your three dogs. That's impossible. So they were stranded for hours until they found an Enterprise rental person who was willing to come pick them up in a car large enough to handle everything. But that was a crazy ordeal, trying to juggle with the tow driver or whatever. That's crazy stressful. And so we're in the process now of evaluating a new car, and you want to know what questions I'm asking? "Tell me about the transmission." It turns out the plumbing does matter.

>> You just have to know that it matters, and once it breaks, it can be catastrophic. And so for the people who don't peek under the hood, the people who don't take the time to understand... I think there's some value in abstracting out the detail. I'm not suggesting that everyone in the company should go down to that level of detail, but I do think there's an argument to be made. And here's my thrilling conclusion for my analogy. The reason electric cars are interesting, partially, yeah, there's a climate impact or whatever, but it just has fewer parts.

>> Yeah.

>> And if you don't care what's under the hood, then you miss some of the actual value that comes out of that.

>> Yeah.

>> You know, IS-IS to me is the equivalent of an electric engine, right? It has fewer parts.

>> It has fewer parts. That's exactly right. And there's a couple of things there. First of all, we don't seem to think about this: most of the major failures we've seen in large networks recently have not been protocol failures. They've been interaction failures.

>> Correct.

>> Interactions between protocols and something like DNS. DNS in particular is very problematic. And again, like BGP, we throw trash at DNS. "Oh, DNS is distributed data. We can do anything we want to with that." Yeah, till it breaks. Then all of a sudden it don't work no more, and now it's a big problem, and it takes a long time to reconverge. And

the other thing I'll tell you, you talk about the decline of knowledge. Look, in my world, when I was coming up in the IETF... I've spent 20 years in the IETF, and I don't know if I'll be doing a lot more of it, and in the NOGs and at Cisco Live and everything. And you're going to laugh, Andy and Mike, but I am not the world's best expert at any of these protocols. I'm actually pretty stinky compared to a lot of people that I know. And it saddens me that not only is that true, but that now most of those people have retired and they are gone. Somebody emailed me the other day, like a week ago, and said, "I have this problem. I'm implementing this RFC, and it's IS-IS, and it's encryption for IS-IS on the wire, and you were a co-author on the draft, and I'm struggling with this little bit of it." So I read it and I'm like, "Yeah, that wasn't my part of the draft." [laughter] So I thought about it, and I asked somebody I know who's implemented it, and then I thought, the person I asked is semi-retired. Who am I ever going to ask once they're retired? Who do you go to next? And it's true for every protocol in the world. I mean, BGP is the only one where I can think of people who are under 25 or under 30 or under 40 who are as steeped in it as anything I would know, or know more than I do.

>> Do you think we'll go back to big closed systems like big blackbox things?

>> Oh, I hope not. Oh, I hope not. I know that's probably where we're trending. That's where cloud trends, to a large degree, but I really hope not, because I don't think it's good for the global internet. I don't think it's good for network engineering. I don't think it's good for companies, personally. I really hope not. I hope the open standards and open source vision can weather whatever this storm is we're seeing and come back to life or be serious. I don't know how to make it, but that's kind of my belief. There's a lot of tribal knowledge that dies off as people retire and go off into the sunset. I think it's incumbent upon those of us left here.

>> You know, my little hope is that conversations like this might poke at some curiosity in folks and have them dig in. That's what happens to me. So, as we're wrapping here, I'm looking up what network operating systems support IS-IS that I can try to lab this with. Because what I'd like to do, Russ, because I have access to you, and why wouldn't I? I'd like to go lab this, fire it up, because I've never done that, and see what happens, and maybe we could talk about it.

>> I have some labs in Containerlab that I can send you, some small labs that run IS-IS. I actually don't have any OSPF labs in Containerlab. They're all IS-IS and BGP. Again, not because I dislike OSPF, but I just find IS-IS easier to configure.

>> But this past year, I've leaned into Linux and containerized NOSes, SR Linux being one of them. I see that it supports IS-IS, and that'll probably be, based on where I work and what my interests are right now, what I'll do.

>> Yeah. So most of my labs are in FRRouting for a very specific reason, because I give them out for training.

>> Yeah.

>> And I built a lot of them before I started working for Nokia. So, you know, it's kind of a thing. I will probably rebuild some of them in SR Linux. The other thing, and I don't know, Mike might not like me saying this, but one thing about FRRouting is that, unlike almost any other protocol stack, I can shut off the protocols I don't care about. So it becomes very, very lean for a lab environment and building large topologies in a lab. It just is the way it works. Most of my labs, I only run OSPF or I only run IS-IS and BGP. I don't run gRPC. I don't run YANG models. All of that is shut off. And not because I don't like them. It's fine. RIP is there. EIGRP is. I just turn them off. And I turn off PIM. I mean, I turn off everything. And why? Because I can get the image down so small that I can push a lot of routers very quickly. It improves my lab speed, which has nothing to do with the quality of SRL versus FRRouting versus blah blah blah. I don't really care. I'm just trying to build labs, right?

>> When you turn them off, how do you know they're not still chatting? And the reason I'm mentioning this so late in this episode: my friend Lexi Cooper discovered, with her oscilloscope, that on most vendor NOSes, when you turn off auto-negotiation, it's still sending autoneg link pulses. And

>> I thought, wow, you can turn off a protocol and it can completely ignore you and keep doing its thing. So, I'm being half facetious here. Like,

>> Depends on the architecture of the knobs. Yeah,

>> most commercial network operating systems do not have the heritage of FRRouting,

>> which is a good thing. Honestly, it's not a bad thing. It's a good thing, because most of them are built as complete systems. The protocols interact with each other. They have all sorts of APIs. FRRouting is literally a bunch of routing daemons that were written by different people and thrown together under a single management plane just to make something, right? So that heritage has some good pieces. The good piece is I can actually go into the daemons file, turn off RIP, and if you go into the router and you say "router rip," it says, "What are you talking about? I don't even know what you're talking about. Router RIP? What is RIP?" Those commands don't even exist. So

>> You're also an FRRouting zealot. IS-IS and FRRouting.

>> No, not necessarily. [laughter]

>> I like FRRouting because I like it for labbing. It's lightweight. It's easy. It's open source. And I am a maintainer, so I do have to be a little

>> Well, now that I can ping you on Teams, sir, and if I have any IS-IS questions, if you're open to that, you may be

>> sure

>> hearing from me. I'm sure that's what you want.

>> I'm sure PMing me tomorrow going, "Buddy, you better learn some SR Linux."

[laughter]

>> Russ, are there any closing thoughts you have around this? I mean, I would like to learn IS-IS, and I'm going to lab it, because that's what these conversations are about for me. A little bit of curiosity, learn something from somebody smart, and then go touch it and see what it's like. Do you have any educational material around IS-IS? Have you written books or done classes on it?

>> Yeah, I have a book, and I have a couple of training courses I've done.

>> Can you plug the book? If someone just looks up "IS-IS Russ White," I guess it'll just come up?

>> Okay.

>> It's probably one of the very few books, but you know what? You might not be able to get it in physical format any longer.

>> I know a guy.

>> It's really out of print at this point, but you can get it in digital format. Safari Books or

>> Cool,

>> whatever. Yeah. So, no, not much else. I mean, thanks for having me on. Probably need to do this more often. And Mike, I promise I'll learn SRL better and do some labs with it. [laughter]

>> Thanks for coming on, guys. For all things Art of Network Engineering, you can check out our Linktree, linktr.ee/artofneteng. We have new merch up. We have a community, a Discord server called It's All About the Journey. Thousands of folks in there studying, learning, helping each other out. Hop in if you don't have a community. Check it out. It's a great place to be. As always, thanks so much for joining us, and we'll catch you next time on the Art of Network Engineering podcast.

Hey folks, if you like what you heard today, please subscribe to our podcast in your favorite podcatcher. You can find us on socials at artofneteng. And you can visit linktr.ee/artofneteng for links to all of our content, including the AONE merch store and our virtual community on Discord called It's All About the Journey. You can see our pretty faces on our YouTube channel named The Art of Network Engineering. That's youtube.com/artofneteng.

Thanks for listening.
