Routing without tears: Bridging without danger

Similar documents
RBridges: Transparent Routing

Myths and Mysteries in the Network Protocol World. Radia Perlman

Tutorial on Bridges, Routers, Switches, Oh My! Radia Perlman

RBRIDGES LAYER 2 FORWARDING BASED ON LINK STATE ROUTING

Tutorial on Network Layers 2 and 3. Radia Perlman Intel Labs

Tutorial on Network Layers 2 and 3. Radia Perlman Intel Labs

Computer Network Protocols: Myths, Missteps, and Mysteries. Dr. Radia Perlman, Intel Fellow

Network Myths and Mysteries. Radia Perlman Intel Labs

RBRIDGES/TRILL. Donald Eastlake 3 rd Stellar Switches AND IS-IS.

Transparent Bridging and VLAN

Network Security Fundamentals. Network Security Fundamentals. Roadmap. Security Training Course. Module 2 Network Fundamentals

CompSci 356: Computer Network Architectures. Lecture 8: Spanning Tree Algorithm and Basic Internetworking Ch & 3.2. Xiaowei Yang

Hands-On Network Security: Practical Tools & Methods

RBRIDGES AND THE TRILL PROTOCOL

DD2490 p Layer 2 networking. Olof Hagsand KTH CSC

CSCI-1680 Link Layer Wrap-Up Rodrigo Fonseca

THE OSI MODEL. Application Presentation Session Transport Network Data-Link Physical. OSI Model. Chapter 1 Review.

Top-Down Network Design, Ch. 7: Selecting Switching and Routing Protocols. Top-Down Network Design. Selecting Switching and Routing Protocols

Computer Network Architectures and Multimedia. Guy Leduc. Chapter 2 MPLS networks. Chapter 2: MPLS

Communication Networks

1 GSW Bridging and Switching

CSCI-1680 Link Layer Wrap-Up Rodrigo Fonseca

CSCI-1680 Link Layer Wrap-Up Rodrigo Fonseca

MPLS MULTI PROTOCOL LABEL SWITCHING OVERVIEW OF MPLS, A TECHNOLOGY THAT COMBINES LAYER 3 ROUTING WITH LAYER 2 SWITCHING FOR OPTIMIZED NETWORK USAGE

Internetworking/Internetteknik, Examination 2G1305 Date: August 18 th 2004 at 9:00 13:00 SOLUTIONS

CSCI Computer Networks

Da t e: August 2 0 th a t 9: :00 SOLUTIONS

CSCI-1680 Link Layer Wrap-Up Rodrigo Fonseca

Request for Comments: S. Gabe Nortel (Northern Telecom) Ltd. May Nortel s Virtual Network Switching (VNS) Overview

Principles behind data link layer services

Chapter 3 Part 2 Switching and Bridging. Networking CS 3470, Section 1

Switching and Forwarding Reading: Chapter 3 1/30/14 1

Top-Down Network Design

What is Multicasting? Multicasting Fundamentals. Unicast Transmission. Agenda. L70 - Multicasting Fundamentals. L70 - Multicasting Fundamentals

Administrivia CSC458 Lecture 4 Bridging LANs and IP. Last Time. This Time -- Switching (a.k.a. Bridging)

Principles behind data link layer services:

Principles behind data link layer services:

Table of Contents. (Rapid) Spanning Tree Protocol. An even worse bridge loop. A simple bridge loop. Bridge loops Two bridges Three bridges (R)STP

Multicast Communications. Slide Set were original prepared by Dr. Tatsuya Susa

Some portions courtesy Srini Seshan or David Wetherall

LARGE SCALE IP ROUTING LECTURE BY SEBASTIAN GRAF

CSE 461: Bridging LANs. Last Topic

Inter-networking. Problem. 3&4-Internetworking.key - September 20, LAN s are great but. We want to connect them together. ...

Summary of MAC protocols

Internetwork Expert s CCNP Bootcamp. Hierarchical Campus Network Design Overview

Advanced Network Training Multicast

Table of Contents. (Rapid) Spanning Tree Protocol. A simple bridge loop. An even worse bridge loop. Bridge loops Two bridges Three bridges (R)STP

ECE 435 Network Engineering Lecture 12

Examination IP routning inom enkla datornät, DD2490 IP routing in simple networks, DD2490 KTH/CSC. Date: 20 May :00 19:00 SOLUTIONS

1 Connectionless Routing

Date: June 4 th a t 1 4:00 1 7:00

Growth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs.

MANET Architecture and address auto-configuration issue

USING BRIDGES, ROUTERS AND GATEWAYS IN DATA ACQUISITION NETWORKS

Medium Access Protocols

CMPE 150 Winter 2009

CS Networks and Distributed Systems. Lecture 5: Bridging. Revised 1/14/13

Lecture (03) Internet Protocol tcp/ip> OSI>

Computer Networks CS 552

Data Communication Prof. A. Pal Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur Lecture 34 TCP/ IP I

Link layer: introduction

Putting it all together

Module 7 Implementing Multicast

CSCI-1680 Network Layer:

Introduction to MPLS APNIC

Good day. Today we will be talking about Local Internetworking What is Internetworking? Internetworking is the connection of different networks.

Mobile Communications. Ad-hoc and Mesh Networks

Data Link Layer. Our goals: understand principles behind data link layer services: instantiation and implementation of various link layer technologies

Unicast Routing. TCP/IP class

Introduction to MPLS. What is MPLS? 1/23/17. APNIC Technical Workshop January 23 to 25, NZNOG2017, Tauranga, New Zealand. [201609] Revision:

To contain/reduce broadcast traffic, we need to reduce the size of the network (i.e., LAN).

Implementing Spanning Tree Protocol

Computer Networks Prof. S. Ghosh Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture 28 IP Version 4

ECE 435 Network Engineering Lecture 21

L2 Addressing and data plane. Benjamin Baron

SWITCHING, FORWARDING, AND ROUTING

CS 43: Computer Networks Switches and LANs. Kevin Webb Swarthmore College December 5, 2017

Outline. Routing. Introduction to Wide Area Routing. Classification of Routing Algorithms. Introduction. Broadcasting and Multicasting

EEC-684/584 Computer Networks

EPL606. Internetworking. Part 2a. 1Network Layer

Campus Networking Workshop. Layer 2 engineering Spanning Tree and VLANs

CS519: Computer Networks. Lecture 2: Feb 2, 2004 IP (Internet Protocol)

Introduction to OSPF

2D1490 p Bridging, spanning tree and related issues. Olof Hagsand KTHNOC/NADA

Image courtesy Cisco Systems, Inc. Illustration of a Cisco Catalyst switch

Campus Networking Workshop CIS 399. Core Network Design

Internetworking Part 2

PUCPR. Internet Protocol. Edgard Jamhour E N G L I S H S E M E S T E R

CSE 123A Computer Networks

Last time. Network layer. Introduction. Virtual circuit vs. datagram details. IP: the Internet Protocol. forwarding vs. routing

Connection Oriented Networking MPLS and ATM

Midterm Review. Topics. Review: Network. Review: Network. Layers & Protocols. Review: Network WAN. Midterm Review EECS 122 EECS 122

Midterm Review EECS 122. University of California Berkeley

MultiProtocol Label Switching - MPLS ( RFC 3031 )

Chapter 2 - Part 1. The TCP/IP Protocol: The Language of the Internet

Internetworking Concepts Overview. 2000, Cisco Systems, Inc. 2-1

EEC-484/584 Computer Networks

CompSci 356: Computer Network Architectures. Lecture 7: Switching technologies Chapter 3.1. Xiaowei Yang

Link State Routing. Link State Packets. Link State Protocol. Link State Protocols Basic ideas Problems and pitfalls

Outline. Circuit Switching. Circuit Switching : Introduction to Telecommunication Networks Lectures 13: Virtual Things

Transcription:

Routing without tears: Bridging without danger Radia Perlman Sun Microsystems Laboratories Radia.Perlman@sun.com 1

Before we get to RBridges Let s sort out bridges, routers, switches... 2

What are bridges, really? Myth: bridges/switches simpler devices, designed before routers OSI Layers 1: physical 3

Why this whole layer 2/3 thing? Myth: bridges/switches simpler devices, designed before routers OSI Layers 1: physical 2: data link (nbr-nbr, e.g., Ethernet) 4

Why this whole layer 2/3 thing? Myth: bridges/switches simpler devices, designed before routers OSI Layers 1: physical 2: data link (nbr-nbr, e.g., Ethernet) 3: network (create entire path, e.g., IP) 5

Why this whole layer 2/3 thing? Myth: bridges/switches simpler devices, designed before routers OSI Layers 1: physical 2: data link (nbr-nbr, e.g., Ethernet) 3: network (create entire path, e.g., IP) 4 end-to-end (e.g., TCP, UDP) 6

Why this whole layer 2/3 thing? Myth: bridges/switches simpler devices, designed before routers OSI Layers 1: physical 2: data link (nbr-nbr, e.g., Ethernet) 3: network (create entire path, e.g., IP) 4 end-to-end (e.g., TCP, UDP) 5 and above: boring 7

Definitions Repeater: layer 1 relay 8

Definitions Repeater: layer 1 relay Bridge: layer 2 relay 9

Definitions Repeater: layer 1 relay Bridge: layer 2 relay Router: layer 3 relay 10

Definitions Repeater: layer 1 relay Bridge: layer 2 relay Router: layer 3 relay OK: What is layer 2 vs layer 3? 11

Definitions Repeater: layer 1 relay Bridge: layer 2 relay Router: layer 3 relay OK: What is layer 2 vs layer 3? The right definition: layer 2 is neighborneighbor. Relays should only be in layer 3! 12

Definitions Repeater: layer 1 relay Bridge: layer 2 relay Router: layer 3 relay OK: What is layer 2 vs layer 3? True definition of a layer n protocol: Anything designed by a committee whose charter is to design a layer n protocol 13

Layer 3 (e.g., IPv4, IPv6, DECnet, Appletalk, IPX, etc.) Put source, destination, hop count on packet Then along came the EtherNET rethink routing algorithm a bit, but it s a link not a NET! The world got confused. Built on layer 2 I tried to argue: But you might want to talk from one Ethernet to another! Thought Ethernet was a competitor to layer 3 14

Layer 3 packet source dest hops data Layer 3 header 15

Ethernet packet source dest data Ethernet header No hop count 16

Layer 3 packet source dest hops data Layer 3 header Addresses have topological significance 17

Ethernet packet source dest data Ethernet header Addresses are flat (no topological significance) 18

It s easy to confuse Ethernet with network Both are multiaccess clouds Why can t Ethernet replace IP? Flat addresses No hop count Missing additional protocols (such as neighbor discovery) Perhaps missing features (such as fragmentation, error messages, congestion feedback) 19

So, we had layer 3, and Ethernet People built protocol stacks leaving out layer 3 There were lots of layer 3 protocols (IP, IPX, Appletalk, CLNP), and few multiprotocol routers 20

Problem Statement Need something that will sit between two Ethernets, and let a station on one Ethernet talk to another A C 21

Basic idea Listen promiscuously Learn location of source address based on source address in packet and port from which packet received Forward based on learned location of destination 22

What s different between this and a repeater? no collisions with learning, can use more aggregate bandwidth than on any one link no artifacts of LAN technology (# of stations in ring, distance of CSMA/CD) 23

But loops are a disaster No hop count Exponential proliferation S B1 B2 B3 24

But loops are a disaster No hop count Exponential proliferation S B1 B2 B3 25

But loops are a disaster No hop count Exponential proliferation S B1 B2 B3 26

But loops are a disaster No hop count Exponential proliferation S B1 B2 B3 27

But loops are a disaster No hop count Exponential proliferation S B1 B2 B3 28

What to do about loops? Just say don t do that Or, spanning tree algorithm Bridges gossip amongst themselves Compute loop-free subset Forward data on the spanning tree Other links are backups 29

Algorhyme I think that I shall never see A graph more lovely than a tree. A tree whose crucial property Is loop-free connectivity. A tree which must be sure to span So packets can reach every LAN. First the Root must be selected By ID it is elected. Least cost paths from Root are traced In the tree these paths are placed. A mesh is made by folks like me. Then bridges find a spanning tree. Radia Perlman 30

A 2,2,11 X 11 2,1,6 7 6 3 2,3,3 9 2,1,7 2,0,2 2,2,4 10 2 5 4 2,0,2 2,2,4 2,1,14 14 2,1,5 31

Notice you don t get optimal pairwise paths 32

A A talks to X X 11 7 6 3 9 10 2 5 4 14 33

Bother with spanning tree? Maybe just tell customers don t do loops First bridge sold... 34

First Bridge Sold A C 35

Bridges are cool, but Routes are not optimal (spanning tree) STA cuts off redundant paths If A and B are on opposite side of path, they have to take long detour path Temporary loops really dangerous no hop count in header proliferation of copies during loops Traffic concentration on selected links 36

Bridge meltdowns They do occur (a Boston hospital) Lack of receipt of spanning tree msgs tells bridge to turn on link So if bridge can t keep up with wire speed In contrast with routers: lost messages will cause link to be brought down Note: original Digital bridge spec said bridges had to be wire speed Also, some additions to bridging involve configuration, which if wrong meltdowns 37

Why are there still bridges? Why not just use routers? Bridges plug-and-play Endnode addresses can be per-campus IP routes to links, not endnodes So IP addresses are per-link Need to configure routers Need to change endnode address if change links 38

What if you routed to endnodes, not to links? Suppose you could have a whole campus, all with one prefix That s what DECnet/CLNP called level 1 routing Used a special protocol called ES-IS Endnodes periodically announce to routers Routers periodically announce to endnodes 39

True level 1 routing CLNP addresses had two parts area (14 bytes ) node (6 bytes) An area was a whole multi-link campus Two levels of routing level 1: routes to exact node ID within area level 2: longest matching prefix of area 40

IP-style One prefix per link Hierarchy CLNP-style One prefix per campus 22* 2835* 28* 292* 25* 2* 2* 41

CLNP level 1 routing Autoconfiguration Rtrs discover area prefix Tell endnodes, which plug in their MAC to form their layer 3 address Rtrs tell each other (using link state routing protocol), within area, location of all endnodes in area 42

Level 1 routing with IP IP has never had true level 1 routing Each link has a prefix Multilink node has two addresses Move to new link requires new address Bridging is used with IP to sort of do level 1 routing But not as good: spanning tree paths rather than optimal pt-to-pt paths, meltdowns 43

One prefix per campus vs per link Advantages Zero configuration of routers within campus Move nodes within campus without changing address Multiple points of attachment: same address Don t partition address space Disadvantages Bigger routing tables of level 1 routers 44

Bridging vs CLNP-style level 1 Routing Better routes: optimal pt-to-pt routes, can use all links, can path split, do traffic engineering Stable protocol (lost messages bring link down, not up) Forwarding with safe hdr (hop count, and specify next hop) But CLNP depended on ES-IS: We ll have to do our best without it 45

Link State Routing meet nbrs Construct Link State Packet (LSP) who you are list of (nbr, cost) pairs Broadcast LSPs to all rtrs ( a miracle occurs ) Store latest LSP from each rtr Compute Routes (breadth first, i.e., shortest path first well known and efficient algorithm) 46

IS-IS A specific link state protocol Similar to OSPF, but more suitable for RBridges because No need to configure IP addresses (which OSPF depends on) Easy to add new fields with TLV encoding 47

What we d like Best features of bridges Transparency to endnodes Plug-and-play switches Best features of routers Optimal paths Safe header for forwarding Interoperate with existing routers, endnodes, and bridges 48

RBridges Compatible with today s bridges and routers Like routers, terminate bridges spanning tree Like bridges, glue LANs together to create one IP subnet (or for other protocols, a broadcast domain) Like routers, optimal paths, fast convergence, no meltdowns Like bridges, plug-and-play 49

RBridging layer 2 Link state protocol among Rbridges (so know how to route to other Rbridges) Like bridges, learn location of endnodes from receiving data traffic But since traffic on optimal paths, need to distinguish originating traffic from transit So encapsulate packet 50

Layer 2 routing Think of it just like routers At each hop, add a layer 2 header to get to the next hop RBridge The destination is the egress RBridge First RBridge Looks up destination endnode, finds egress RB Adds shim header, forwards to egress RB Egree RBridge decapsulates 51

Rbridging R4 R7 c R1 R3 R5 R6 a R2 52

Encapsulation Header S=Xmitting Rbridge D=Rcving Rbridge pt= transit hop count In/out RBridge original pkt (including L2 hdr) Outer header: new p-t: otherwise, ordinary Ethernet hdr Shim: hop count, and ingress or egress RBridge If unicast, it is egress RBridge If multicast or unknown destination, it is ingress RBridge 53

Hdrs inside hdrs R1 S R2 R3 D 54

Hdrs inside hdrs R1 S R2 R3 D S: D = S = 55

Hdrs inside hdrs R1 S R2 R3 D R1: D = TTL D = S = R3 p-t=new S = shim 56

Hdrs inside hdrs R1 S R2 R3 D R2: D = TTL-1 D = S = R3 p-t=new S = shim 57

Hdrs inside hdrs R1 S R2 R3 D R3: D = S = 58

Flooded traffic Some traffic needs to be sent to all links Unknown destinations Multicast traffic Could use a single spanning tree Spanning tree computed from link state database (not separate spanning tree protocol) But we decided on per-ingress trees 59

Why per-ingress trees? Yeah, it s more computation (but not more protocol messages link state database gives all the info you need) Advantages IP multicasts, and VLAN delivery not to all links, so optimized delivery Packets won t get out of order when switching from unknown to known (take same path) 60

VLANs VLANs allow partitioning of a bridged campus A packet marked with a VLAN tag must only be delivered to links in that VLAN With bridges, VLAN membership of links is configured into the bridges Usually bridges add and remove the VLAN tag 61

Calculating a spanning tree No need for separate spanning tree protocol Link state protocol gives enough info For per-rbridge: calculate a spanning tree rooted at that RBridge 62

Need a broadcast domain per VLAN That s the definition of a VLAN A packet for VLAN A must only be delivered to links in VLAN A Could be filtered at egress RBridges Makes forwarding info for intermediate RBridges trivial, but packet delivery more expensive Or could be filtered if no downstream receivers for VLAN A 63

Delivery path for VLAN A if one spanning tree, and pinks are VLAN A 3 2,2,11 2,3,3 9 11 2,1,7 2,1,6 7 6 2,0,2 2,2,4 4 10 2,0,2 2 5 2,2,4 2,1,14 14 2,1,5 64

Delivery path for VLAN A if per VLAN spanning tree, and pinks are VLAN A 3 9 11 4 10 7 14 2 6 5 65

Likewise for IP-derived multicasts If receivers are on just a few links, then it would be worth calculating per-rbridge spanning tree IP multicast has IGMP to let RBridges know where receivers are 66

Filtering info With spanning tree, have a set of ports For each port, mark what VLANs, and IP multicasts are downstream on that branch Select the spanning tree (based on ingress RBridge from shim header, or VLAN tag), then apply filtering rules 67

Shim Header Information needed Hop count Ingress RBridge Egress RBridge But never need BOTH ingress and egress RBridge, so can overload the field 68

MPLS-like header Some people argued that some hardware already adds 4-byte shim header (for MPLS) MPLS contains TTL, and label But label is 20 bits How to fit 6-byte address into 20 bits? 69

Nicknames Piggyback nickname acquisition protocol onto link state protocol Stick I want this nickname into LSP If someone else claims your nickname, whoever has lower system ID keeps the nickname; the other one chooses a different one 70

Endnode Learning On shared link, only one Rbridge (DR) can learn and decapsulate onto link otherwise, a naked packet will look like the source is on that link have election to choose which Rbridge When DR sees naked pkt from S, announces S in its link state info to other Rbridges 71

Pkt Forwarding: Ingress RBridge If D known: look up egress RBridge R2, encapsulate, and forward towards R2 Else, send to destination=flood, meaning send on spanning tree Which tree: ingress RBridge each DR decapsulates 72

IP optimization: proxy ARP For IP, learn (layer 3, layer 2) from ARP (ND) replies Pass around (layer 3, layer 2) pairs in LSP info Local RBridge can proxy ARP (i.e., answer ARP reply) if target (layer 3, layer 2) known 73

Possible IP optimization: tighter aliveness check Can check aliveness of attached IP endnodes by sending ARP query Can assume endnode alive, until you forward traffic to it, or until someone else claims that endnode 74

VLANs VLAN A endnodes only need to be learned by RBridges attached to VLAN A All RBridges must be able to forward to any other RBridge Egress RBridge in the encapsulation header 75

Endnode Information Only need to know location of VLAN A endnodes if you are attached to VLAN A So, different instances of IS-IS One for basic RBridge-RBridge connectivity One per VLAN, to share endnode membership for that VLAN 76

Possible variant Don t explicitly pass around endnode information Instead, egress RBridge learns (ingress RBridge, source) from data packet 77

Decided not to do that because Sometimes layer 2 is definitive about enrollment (so better than caches and timeouts) Won t need to flood as often Don t want to starve between two bales of hay 78

Conclusions Routing without tears Zero configuration, optimal paths Bridging without danger No meltdowns Can gradually replace bridges with RBridges 79

Algorhyme v2 I hope that we shall one day see A graph more lovely than a tree. A graph to boost efficiency While still configuration-free. A network where RBridges can Route packets to their target LAN. The paths they find, to our elation, Are least cost paths to destination. With packet hop counts we now see, The network need not be loop-free. RBridges work effectively. Without a common spanning tree. Ray Perlner 80