Multiprotocol Label Switching (MPLS)
Petr Grygárek
Technology in Brief
- Inserts an underlying label-based forwarding layer under traditional network-layer routing
  - Label forwarding + label swapping, similar to ATM/FR
- Forwarding tables (switching paths) may be constructed by various mechanisms, providing enormous flexibility
  - Switching tables constructed using IP routing protocol(s) or some other mechanism
- Completely decouples data-plane forwarding from path determination (control plane)
  - Packet forwarding does not depend only on routing protocols that search for the shortest path for a particular L3 routed protocol based on a particular IGP metric
- Integrates advantages of the traditional packet-switching and circuit-switching worlds
MPLS Advantages & Applications
- Improves the price/performance of network-layer routing
  - MPLS switching algorithm may be simpler and faster than traditional IP routing (longest match)
  - Processor-intensive packet analysis and classification happens only once, at the ingress edge
  - But today MPLS should no longer be considered primarily a method to make routers faster
- Integrates various traditional applications on a single service-provider platform
  - Internet, L3 VPN, L2 VPN, L2 virtual P2P lines, voice (-> QoS, fast reconvergence), wide range of traffic-engineering and node/link protection options
- Improves the scalability of the network layer
  - Eliminates huge IP routing tables by establishing a forwarding hierarchy
- Provides greater flexibility in the delivery of (new) routing services
  - New routing services may be added without changing the forwarding paradigm
  - Multiple VRF-based VPNs (with address overlap), traffic engineering, integration of IP routing with VC-based networks (like ATM)
Frame Mode and Cell Mode
- Frame mode
  - Frame switching, used today in service providers' and other core networks
  - Encapsulates IP or any other payload (even L2 frames)
- Cell mode
  - Used to integrate connectionless packet forwarding with connection-oriented networks (ATM)
  - Mostly historical, not used anymore today
MPLS Position in OSI RM
- MPLS operates between the link and network layers
  - Deals with L3 routing/addressing
  - Uses L2 labels for fast switching
- Additional shim header placed between the L2 and L3 headers
  - Its presence indicated in the L2 header: Ethernet EtherType (0x8847 unicast, 0x8848 multicast), PPP Protocol field, Frame Relay NLPID
- Inherent labels of some L2 technologies: ATM VPI/VCI, Frame Relay DLCI, optical-switching lambdas, ...
Label-Based Packet Forwarding
- Packet marked with labels at the ingress MPLS router (label imposition)
  - Allows various rules to be applied when imposing labels: destination network prefix, QoS, policy routing (traffic engineering), VPNs, ...
  - Labels imply both routes (IP destination prefixes) and service attributes (QoS, TE, VPN, ...)
  - Multiple labels can be imposed (label stack); allows special applications (hierarchical MPLS forwarding)
- Packet quickly forwarded according to labels through the MPLS core
  - Uses only label swapping, no IP routing
  - IP routing information may be used only to build forwarding tables, not for actual (potentially slow) IP routing
- Label removed at the egress router and packet forwarded using a standard L3 IP routing table lookup
  - In reality, the penultimate hop removes the label to avoid a double lookup on the egress device
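The core forwarding step above can be sketched in a few lines. This is a hypothetical illustration, not router code: the LFIB entries, label values, and neighbor names are all made up.

```python
# Illustrative core-LSR label swapping: the LFIB maps an incoming label to
# an (outgoing label, next hop) pair, so forwarding never touches the IP
# header. Values below are invented for the example.
LFIB = {
    17: (25, "LSR-B"),   # swap 17 -> 25, send toward LSR-B
    18: (3,  "LSR-C"),   # implicit-null: pop instead of swap (PHP)
}

IMPLICIT_NULL = 3  # reserved label value advertised by the egress LSR

def forward(label_stack, lfib):
    """Swap (or pop) the top label; return (new_stack, next_hop)."""
    top = label_stack[0]
    out_label, next_hop = lfib[top]
    if out_label == IMPLICIT_NULL:
        # penultimate-hop popping: expose the rest of the stack (or the IP packet)
        return label_stack[1:], next_hop
    return [out_label] + label_stack[1:], next_hop
```

Note how the same lookup also models penultimate-hop popping: an implicit-null out-label means "pop" rather than "swap".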
Components of the MPLS Architecture
- Forwarding component (data plane)
  - Brute-force forwarding using the Label Forwarding Information Base (LFIB)
- Control component (control plane)
  - Control-plane implementation for MPLS-based IP routing: creates and updates label bindings <IP_prefix, label> in the LFIB
  - MPLS node has to participate in a routing protocol (IGP or static routing) and/or some other signalling mechanism, including ATM switches in MPLS cell mode
  - Label assignments are distributed to other MPLS peers using some sort of label distribution protocol (LDP)
- Control and forwarding functions are separated
MPLS Devices: Label Switch Router (LSR)
- Any router/switch participating in label assignment and distribution that supports label-based packet/cell switching
- LSR classification
  - Core LSR (P, Provider)
  - Edge LSR (PE, Provider Edge)
  - (Often the same kind of device, just configured differently)
- Frame-mode LSR: MPLS-capable router with Ethernet interfaces
- Cell-mode LSR: ATM switch with added functionality (control software)
Functions of an Edge LSR
- Any LSR on the MPLS domain edge, i.e. with non-MPLS neighboring devices
- Performs label imposition and disposition
- Packets classified and labels imposed
  - Classification based on routing and policy requirements: traffic engineering, policy routing, QoS-based routing
- Information in L3 (and higher) headers inspected only once, at the edge of the MPLS domain
Forwarding Equivalence Class (FEC)
- Packets classified into FECs at the MPLS domain edge LSR according to unicast routing destination, QoS class, VPN, multicast group, traffic-engineered traffic class, ...
- A FEC is a class of packets to be MPLS-switched the same way
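For the most common case (classification by unicast routing destination), a FEC is simply the set of packets sharing a longest-prefix match. A minimal sketch, with invented prefixes and FEC names:

```python
import ipaddress

# Hypothetical ingress classification table: destination prefix -> FEC.
FEC_TABLE = {
    ipaddress.ip_network("10.0.0.0/8"): "FEC-A",
    ipaddress.ip_network("10.1.0.0/16"): "FEC-B",
}

def classify(dst):
    """Map a destination address to a FEC by longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    matches = [n for n in FEC_TABLE if addr in n]
    if not matches:
        return None                                  # not MPLS-switched
    best = max(matches, key=lambda n: n.prefixlen)   # longest match wins
    return FEC_TABLE[best]
```

This expensive lookup happens only once, at the ingress edge; inside the core, packets of the same FEC are distinguished only by their label.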
Label Switched Path (LSP)
- Sequence of LSRs between ingress and egress (edge) LSRs + sequence of assigned labels (local significance)
- Unidirectional (!): the reverse path can take a completely different route
- One LSP for every forwarding equivalence class
- May diverge from the IGP shortest path: path established by traffic engineering using explicit routing and label-switched-path tunnels
Upstream and Downstream Neighbors
- Defined from the perspective of some particular LSR
- Related to a particular destination (and FEC)
- The routing protocol's next-hop address determines the downstream neighbor
- The upstream neighbor is closer to the data source, whereas the downstream neighbor is closer to the destination network
Label and Label Stack
- Label format (and length) depends on the L2 technology
- Labels have local (per-link) significance; each LSR creates its own label mappings
  - Although not a rule, the same label is often propagated on different links for the same prefix
- Multiple labels may be imposed, forming the label stack
  - Bottom of the stack indicated by the S bit
  - Label stacking allows special MPLS applications (VPNs etc.)
- Packet switching is always based on the label on the top of the stack
MPLS Header
- Placed between the L2 and L3 headers
- MPLS header presence indicated in EtherType / PPP Protocol ID / Frame Relay NLPID
- 4 octets (32 bits):
  - 20 bits: label value
  - 3 bits: Exp (experimental), used for QoS today
  - 1 bit: S bit, indicates bottom of stack
  - 8 bits: MPLS TTL (Time to Live)
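The 32-bit shim layout above can be exercised directly. A small sketch that packs and unpacks one label stack entry (field widths as on the slide):

```python
import struct

def pack_shim(label, exp, s, ttl):
    """Encode one 32-bit MPLS shim entry: 20b label, 3b EXP, 1b S, 8b TTL."""
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)   # network byte order

def unpack_shim(data):
    """Decode a 4-byte shim entry back into (label, exp, s, ttl)."""
    (word,) = struct.unpack("!I", data)
    return ((word >> 12) & 0xFFFFF,  # 20-bit label
            (word >> 9) & 0x7,       # 3-bit EXP
            (word >> 8) & 0x1,       # 1-bit bottom-of-stack
            word & 0xFF)             # 8-bit TTL
```

A label stack is just a concatenation of such entries, with S=1 only in the last one.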
MPLS Operation: Basic IP Routing
- Standard IP routing protocol used in the MPLS routing domain (OSPF, IS-IS, ...)
- <IP prefix, label> mapping created by the egress router, i.e. the router at the MPLS domain edge used as the exit point for that IP prefix
- Label distribution protocols used to distribute label bindings for IP prefixes between adjacent neighbors; a label has local significance
- Ingress LSR receives IP packets, performs classification, assigns a label and forwards the labeled packet into the MPLS core
- Core LSRs switch labeled packets based on the label value
- Egress router removes the label before forwarding the packet out of the MPLS domain and performs a normal L3 routing table lookup
MPLS and IP Routing Interaction in an LSR
[Diagram: control plane (IP routing process with IP routing table exchanging routing information; MPLS signalling protocol exchanging label bindings) feeds the data plane, where unlabeled packets use the IP routing table and labeled packets use the label forwarding table.]
Interaction of Neighboring MPLS LSRs
[Diagram: two LSRs, each with an IP routing process/table, an MPLS signalling protocol and a label forwarding table; routing information and label bindings are exchanged between control planes, labeled packets flow between the label forwarding tables.]
Operation of an Edge LSR
[Diagram: edge LSR resolving recursive routes; incoming unlabeled packets hit the IP forwarding table, incoming labeled packets undergo label disposition and an L3 lookup; the IP routing process and MPLS signalling protocol exchange routing information and label bindings with neighbors; outgoing packets leave labeled or unlabeled.]
Penultimate Hop Behavior
- The label at the top of the label stack is removed not by the egress router at the MPLS domain edge (as could be expected), but by its upstream neighbor (the penultimate hop)
- On the egress router, the packet could not be label-switched anyway
  - The egress router has to perform an L3 lookup to find a more specific route; commonly, the egress router advertises a single label for a summary route
  - A label-based lookup plus disposition of the label imposed by the egress router's upstream neighbor would introduce unnecessary overhead
- For that reason, the upstream neighbor of the egress router always pops the label and sends the packet to the egress router unlabeled
- The egress LSR requests popping of the label through the label distribution protocol: it advertises the implicit-null label for the particular FEC
Label Bindings Distribution
Label Distribution Protocol Functionality
- Used to advertise <IP_prefix, label> bindings
  - Still not available for IPv6 on most platforms
- Used to create the Label Information Base (LIB) and Label Forwarding Information Base (LFIB)
  - LIB maintains all prefixes advertised by MPLS neighbors
  - FIB (HW copy of the routing table) may contain the label to be imposed for a particular destination network
  - LFIB maintains only labels advertised by next hops for individual prefixes, i.e. those actually used for label switching; the next hop is determined by the traditional IGP
  - LFIB is used for actual label switching; LIB maintains labels which may become useful if IGP routes change
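The LIB/LFIB relationship can be sketched as a filtering step: the LIB keeps every neighbor's binding for a prefix, but only the binding learned from the IGP next hop is installed into the LFIB. All labels, prefixes, and neighbor names below are invented for illustration.

```python
# Hypothetical LIB: for each prefix, the labels advertised by all neighbors
# (liberal retention keeps every one of them).
LIB = {
    "10.1.0.0/16": {"LSR-B": 25, "LSR-C": 31},
}
NEXT_HOP = {"10.1.0.0/16": "LSR-B"}   # chosen by the IGP

def build_lfib(local_bindings, lib, next_hop):
    """Install only the next hop's out-label for each locally bound in-label."""
    lfib = {}
    for prefix, in_label in local_bindings.items():
        nh = next_hop[prefix]
        lfib[in_label] = (lib[prefix][nh], nh)   # out-label from next hop only
    return lfib
```

If the IGP reroutes 10.1.0.0/16 via LSR-C, rebuilding the LFIB from the LIB needs no new signalling, which is exactly the latency advantage of liberal retention mentioned on the next slide.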
Label Retention Modes
- Liberal mode (mostly used in frame mode)
  - LSR retains labels for a FEC from all neighbors
  - Requires more memory and label space
  - Improves latency after IP routing paths change
- Conservative mode
  - Only labels from the next hop for an IP prefix are retained; the next hop is determined by the IP routing protocol
  - Saves memory and label space
Label Distribution Modes
- Independent LSP control
  - LSR binds labels to FECs and advertises them whether or not it has itself received a label from its next hop for that FEC
  - Most common in MPLS frame mode
- Ordered LSP control
  - LSR only binds and advertises a label for a FEC if it is the egress LSR for that FEC, or it has received a label binding from the next-hop LSR
Label Allocation
- Per device / per interface
- For all or just for specified prefixes
- Label range may be explicitly specified
Protocols for Label Distribution
- Label Distribution Protocol (LDP): IETF standard, TCP port 646
- RSVP-TE: used for MPLS traffic engineering
- BGP: implements MPLS VPNs (peer model)
- PIM: enables MPLS-based multicast
- Tag Distribution Protocol (TDP): Cisco-proprietary, obsolete LDP predecessor, TCP port 711
- Label bindings are exchanged between neighboring routers
  - In special cases also between non-neighboring routers: targeted LDP session, e.g. for MPLS-based pseudowires
Label Distribution Protocol (LDP): Message Types
- Discovery messages (hellos), UDP/646
  - Used to discover and continually check for the presence of LDP peers
  - Once a neighbor is discovered, an LDP session is established over TCP/646
- Messages to establish, maintain and terminate sessions
- Label mapping advertisement messages: create, modify, delete
- Error notification messages
- LDP Neighbor ID: the corresponding address must be reachable from the LDP peer
Frame-Mode Label Distribution (LDP)
- Unsolicited downstream
  - Labels distributed automatically to upstream neighbors
  - Downstream LSR advertises labels for particular FECs to the upstream neighbor
- Independent control of label assignment
  - Label assigned as soon as a new IP prefix appears in the IP routing table (may be limited by an ACL); mapping stored into the LIB
  - LSR may send (switch) labeled packets to the next hop even if the next hop itself does not have a label to switch that FEC further
- Liberal retention mode
  - All received label mappings are retained
MPLS Applications
- Decoupling the forwarding decision from the IP header allows for better flexibility and new applications
Some Popular MPLS Applications
- BGP-free core
- 6PE/6VPE
- Carrier Supporting Carrier
- MPLS traffic engineering
- MPLS VPN
- Integration of IP with ATM or other connection-oriented networks
BGP-Free Core
- Design of a transit AS without BGP running on transit (internal) routers
- BGP sessions between PE routers only: full mesh or using route reflector(s)
- P routers know only routes to networks in the core, including PE loopback interfaces
- LDP creates LSPs to the individual networks in the core (including PEs' loopbacks)
- PEs' loopbacks are used as next hops of BGP routes passed between PE routers
6PE (1)
- Interconnection of IPv6 islands over a non-IPv6-aware MPLS core
- PE routers have to support both IPv6 and IPv4, but P routers do not need to be upgraded (can be MPLS + IPv4 only)
- Outer label identifies the destination PE router (IPv4 BGP next hop), inner label identifies the particular IPv6 route
  - Inner label serves as an 'index' into the egress PE's IPv6 routing table
- IPv6 prefixes plus associated (inner) labels are passed between PE routers through MP-BGP (over TCP/IPv4)
- Inner label needed because of PHP, even though the egress PE needs to do an IPv6 route table lookup anyway: the penultimate hop cannot handle the now-exposed IPv6 header
- Technical implementation: the inner label is not unique per route, but one of 16 reserved labels is chosen
  - A single reserved value is not enough because of load balancing
6PE (2)
- BGP Next Hop attribute is the IPv4-mapped IPv6 address of the egress 6PE router
- Only LDP for IPv4 is required; LDP for IPv6 not implemented yet
- Does not support multicast traffic
- Only a Proposed Standard, RFC 4798 (Cisco, 2007), but implemented by multiple vendors
- See http://www.netmode.ntua.gr/presentations/6pe%20-%20ipv6%20ov for further details
6VPE
- VRF-aware 6PE
- Allows building MPLS IPv6 VPNs on an IPv4-only MPLS core
- See http://sites.google.com/site/amitsciscozone/home/important-tips/mpls-wiki/6vpe-ipv6-over-mpls-vpn for a configuration example (Cisco)
Carrier Supporting Carrier (1)
- Hierarchical application of the label-switching concept
- An MPLS super-carrier provides connectivity between regions (POPs) of other, MPLS-based customer carriers
- Concept of MPLS VPN in super-carrier networks: CSC-P, CSC-PE, CSC-CE
- Customer carriers' regions may also implement MPLS VPN or be pure IP networks
- Enables global MPLS/VPN
Carrier Supporting Carrier (2)
- Utilizes a label stack with multiple labels: the sub-carrier's labels are untouched during transport over the super-carrier
- Customer carriers do not exchange their customers' routes with the super-carrier, just the loopback interfaces of PE routers
MPLS Traffic Engineering
MPLS TE Goals
- Minimize network congestion, improve network performance
- Spread flows over multiple paths, i.e. diverge them from the shortest path calculated by the IGP
- More efficient network resource usage
MPLS TE Principle
- Originating LSR (headend) sets up a TE LSP to the terminating LSR (tailend) through an explicitly specified path
  - Defined by a sequence of intermediate LSRs, either a strict or loose explicit route
  - Dynamic (IGP-based) path is also an option
- LSP is calculated automatically using constraint-based routing, or manually using some sort of central management tool in large networks
MPLS-TE Mechanisms
- Link information distribution
- Path computation (constrained SPF)
- LSP signalling
  - RSVP-TE accomplishes label assignment during MPLS tunnel creation
  - Signalling needed even if path calculation is performed manually
- Selection of the traffic that will take the TE LSP
  - By QoS class or another policy-routing criterion
  - Static routes, policy routing, autoroute, forwarding adjacency, ...
Link Information Distribution
- Utilizes extensions of OSPF or IS-IS to distribute links' current states and attributes
  - OSPF LSA type 10 (opaque)
  - Maximum bandwidth, reservable bandwidth, available bandwidth, flags (aka attributes or colors), TE metric
- Constraint-based routing
  - Takes links' current states and attributes into account when calculating routes
  - Constraint-based SPF calculation excludes links that do not comply with the required LSP parameters: bandwidth, affinity bits (link "colors"), ...
  - Uses the TE metric instead of the IGP metric if defined on individual links
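The "exclude, then compute" idea of constrained SPF can be sketched compactly: prune links that cannot satisfy the requested bandwidth, then run a plain shortest-path computation on what remains. Topology, metrics, and bandwidth figures below are invented.

```python
import heapq

# Hypothetical TE topology: (node, node, te_metric, available_bandwidth).
LINKS = [
    ("A", "B", 10, 100), ("B", "D", 10, 100),
    ("A", "C", 5, 50),   ("C", "D", 5, 50),
]

def cspf(links, src, dst, bw):
    """Constrained SPF sketch: prune by bandwidth, then Dijkstra on TE metric."""
    graph = {}
    for u, v, metric, avail in links:
        if avail >= bw:                       # constraint check comes first
            graph.setdefault(u, []).append((v, metric))
            graph.setdefault(v, []).append((u, metric))
    dist, heap = {src: 0}, [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            return d                          # total TE metric of best path
        if d > dist.get(node, float("inf")):
            continue
        for nbr, m in graph.get(node, []):
            if d + m < dist.get(nbr, float("inf")):
                dist[nbr] = d + m
                heapq.heappush(heap, (d + m, nbr))
    return None                               # no path satisfies the constraint
```

With a 40-unit reservation the cheap A-C-D path (metric 10) is used; asking for 80 units prunes the 50-unit links and forces A-B-D (metric 20).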
RSVP Signalling
- Resource Reservation Protocol (RFC 2205) was originally developed in connection with IntServ, but should be understood as a completely independent signalling protocol
- Reserves resources for unidirectional (unicast/multicast) L4 flows; soft state
- May be used with MPLS TE to signal a DiffServ QoS PHB along the path
RSVP Messages
- Message header (message type): Path, Resv, PathTeardown, ResvTeardown, PathErr, ResvErr, ResvConfirm
- Variable number of objects of various classes: TLVs, including sub-objects
- Support for message authentication and integrity checking
Basic RSVP Operation
- PATH message travels from the sender to the receiver(s)
  - From the TE tunnel headend to the tailend in our case
  - Allows intermediate nodes to build soft-state information regarding the particular session
  - Includes flow characteristics (flowspec)
- RESV message travels from the receiver interested in the resource reservation back towards the sender
  - From the TE tunnel tailend back to the headend
  - Actually causes the reservation of intermediate nodes' resources
  - Provides labels to upstream routers
- Soft state has to be renewed periodically
LSP Preemption
- Support for creation of LSPs of different priorities, with a preemption option
  - Setup and holding priority; the setup priority of a new LSP is compared with the holding priority of existing LSPs
  - 0 (best) to 7 (worst)
- Preemption modes
  - Hard: just tears the preempted LSP down
  - Soft: signals the pending preemption to the headend of the existing LSP (PathTear/ResvTear) to give it an opportunity to reroute traffic
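The priority comparison above reduces to a single rule, easy to get backwards because lower numbers are better. A one-line sketch:

```python
def can_preempt(new_setup_priority, existing_holding_priority):
    """A new LSP preempts an established one only if its setup priority is
    numerically lower (i.e. better) than the existing LSP's holding
    priority. Range is 0 (best) to 7 (worst)."""
    return new_setup_priority < existing_holding_priority
```

Equal priorities do not preempt: a setup priority of 4 cannot displace an LSP holding at 4.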
LSP Path Calculation in a Multiarea Environment
- Splitting the network into multiple areas limits state-information flooding
- Headend specifies the path to route LSP setup requests using a list of ABRs (loose routing)
- Each ABR calculates and reserves a path across its connected area and requests the next ABR on the path to take care of the next section
- In practice, service providers prefer a flat core network (OSPF area 0 / L2-only IS-IS)
Dynamic Routing & TE Tunnels
- Autoroute
  - All destinations located behind the TE tunnel endpoint are directed to the TE tunnel interface (unidirectional)
  - The tunnel's metric normally corresponds to the IGP metric between headend and tailend (shortest path, regardless of the actual tunnel path)
  - Logic local to the tunnel headend router
- Forwarding adjacency
  - The headend-tailend link (TE tunnel) is propagated into the OSPF/IS-IS database
  - Needs to be configured on both the headend and the tailend
MPLS Fast Reroute
- In case of a node or link failure, a backup LSP may be activated automatically (in tens of milliseconds)
  - 50 ms failover is the goal (compare to SDH)
- The Fast Reroute option must be requested during LSP setup
- Global or local restoration
- Similar functionality exists in IP-only environments (IP Fast Reroute)
Fast Reroute: Global Restoration
- A new LSP is set up by the headend
  - LSP failure is signalled to the headend by a PathErr RSVP message
  - The headend has the most complete routing-constraint information to establish a new LSP
- The backup tunnel can be pre-signalled or signalled when the primary tunnel goes down
  - The latter option incurs tunnel-break detection and signalling delays
Fast Reroute: Local Restoration
- Detour LSP around the failed link/node
  - The LSR that detected the failure (the Point of Local Repair) starts to use the alternative LSP
  - Detour LSPs are manually preconfigured, or precalculated dynamically by the Point of Local Repair and pre-signalled
- The detour rejoins the original LSP at the Merge Point
  - i.e. at the next hop for link protection, at the next-next hop for node protection
- Facility backup (commonly used): double labeling is used on the detour path
  - The outer label is dropped before the packet enters the Merge Point
  - Packets arrive at the Merge Point with the same label as if they had come along the original LSP (just from a different interface)
  - The different input interface is not an issue, as labels are allocated per platform, not per interface
- One-to-one backup does not use label stacking: each LSP has its own backup path
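The facility-backup double labeling can be shown as one stack operation at the Point of Local Repair. This is a sketch with invented label values: the PLR swaps the top label to the one the Merge Point expects, then pushes the bypass tunnel's label on top.

```python
def plr_reroute(stack, merge_point_label, bypass_label):
    """Sketch of facility backup at the Point of Local Repair.

    Replace the top (failed-LSP) label with the label the Merge Point
    expects, then push the bypass tunnel label. After the outer label
    is popped before the Merge Point, the packet looks exactly as if
    it had arrived along the original LSP."""
    return [bypass_label, merge_point_label] + stack[1:]
```

This is why per-platform label allocation matters here: the Merge Point accepts its expected label regardless of which interface the detoured packet arrives on.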
MPLS QoS
MPLS and DiffServ
- An LSR uses the same mechanisms as a traditional router to implement different Per-Hop Behaviors (PHBs)
- 2 types of LSPs (may coexist on a single network):
  - EXP-inferred LSPs (mostly used): can transport multiple traffic classes simultaneously
    - EXP bits in the shim header hold the DSCP-derived value
    - Mapping between EXP and PHB signalled during LSP setup (extension of LDP and RSVP, new TLV defined)
  - Label-inferred LSPs: can transport just one traffic class
    - Fixed mapping of <DSCP, EXP> to PHB standardized
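Since EXP has only 3 bits against DSCP's 6, a common (though not mandated) convention for EXP-inferred LSPs is to carry just the DSCP class-selector bits. A sketch of that assumed mapping:

```python
def dscp_to_exp(dscp):
    """Common convention: take the top 3 bits of the 6-bit DSCP as EXP.
    Real deployments configure/signal the EXP<->PHB mapping explicitly."""
    return (dscp >> 3) & 0x7

def exp_to_dscp(exp):
    """Inverse mapping; recovers only the class-selector part of the DSCP."""
    return exp << 3
```

For example EF (DSCP 46) maps to EXP 5, and the reverse mapping yields CS5 (DSCP 40), illustrating that the translation is lossy, which is exactly the motivation for the pipe/uniform tunneling models on the next slide.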
DiffServ Tunneling over MPLS
- There are two markings on the packet (EXP, DSCP); different models handle the interaction between the multiple markings
- Pipe model
  - Transfers the IP DSCP marking untouched
  - Useful for interconnecting two DiffServ domains over MPLS
- Uniform model
  - Uniform customer and provider QoS models; makes the LSP an extension of the DiffServ domain
MPLS VPNs
VPN Implementation Options
Solutions to implement potentially overlapping address spaces of independent customers:
- Overlay model
  - Infrastructure provides tunnels between CPE routers
  - FR/ATM virtual circuits, IP tunnels (GRE, IPsec, ...)
- Peer-to-peer model
  - Provider edge router exchanges routing information with the customer edge router
  - Customer routes carried in the service provider's IGP
  - Need to solve VPN separation and overlapping customer addressing, traditionally by complicated filtering
  - Optimal routing between customer sites through the shared infrastructure: data do not need to follow tunnel paths
MPLS VPN Basic Principles
- MPLS helps to separate traffic from different VPNs without the overlay model's tunneling techniques
- Routes from different VPNs are kept separate; multiple routing tables implemented at edge routers (one for each VPN)
- Uses the MPLS label stack: the outer label identifies the egress edge router, the inner label identifies the VPN, or a single route in a particular VPN
- To allow propagation of IP prefixes from all VPNs into the core, the potentially overlapping addresses of separate VPNs are made unique with a Route Distinguisher (different for every VPN)
- These VPN-IPv4 (VPNv4) addresses are propagated between PE routers using extended BGP (Multiprotocol BGP, MP-BGP)
  - New address family: VPNv4 address = RD + IPv4 address
- MP-BGP also distributes the (inner) labels identifying the particular route in the target VRF on the egress edge router (using BGP attributes)
- MP-BGP runs only between PEs; Ps are not involved at all
MPLS VPN Advantages
- Integrates the advantages of the overlay and peer-to-peer models
  - Overlay model advantages: security and customer isolation
  - Peer-to-peer model advantages: routing optimality
- Simplicity of adding new CPEs
MPLS VPN Implementation
- VPN defined as a set of sites sharing the same routing information
- A site may belong to multiple VPNs
- Multiple sites (from different VPNs) may be connected to the same PE router
- PE routers maintain only routes for connected VPNs plus the backbone routes needed to reach other PEs
  - Increases scalability, decreases the performance requirements on a PE router
- A PE router uses IP on customer-facing interface(s) and MPLS on backbone interfaces
- The backbone (P routers) uses only label switching
  - IGP routing protocol used only to establish optimal label switched paths between PEs
- Utilizes the MPLS label stack
  - Inner label identifies the VPN/VRF (or a particular route in the destination VRF)
  - Outer label identifies the egress LSR
Routing Information Exchange
- P-P and P-PE routers
  - Using the IGP
  - Needed to determine paths between PEs over the MPLS backbone
- PE-PE routers (non-adjacent)
  - Using MP-iBGP sessions
  - Needed to exchange routing information between the routing tables (VRFs) of a particular VPN
Routing Information in PE Routers
- PE routers maintain multiple separate routing tables
- Global routing table
  - Filled with backbone routes (from the IGP); allows reaching other PE routers
- VRF (VPN Routing & Forwarding)
  - Separate routing table for each individual VPN
  - Every router interface is assigned to a single VRF
  - A VRF instance can be seen as a virtual router
VPN Routing and Forwarding
[Diagram: CE routers of VPN A and VPN B connect to a PE router holding VRF A and VRF B (VRF = virtual router, one per VPN); the PE connects through P routers in the MPLS domain to the remote CE sites.]
VRF Usage
[Diagram: a packet from a VPN A CE enters VRF A on the ingress PE, crosses the P routers, and exits through VRF A on the egress PE toward the remote VPN A CEs; VPN B traffic is handled by the separate VRF B instances.]
MPLS VPN Example
[Diagram: Customer A and Customer B sites at OSTRAVA and TACHOV connect via PE routers I-PE and J-PE through P router G-P in the MPLS core (core links 1.0.0.0/24 and 2.0.0.0/24); customer networks 10.0.0.1/24, 10.0.1.1/24 and 10.0.2.1/24 include an address overlap between the two customers (both use 10.0.0.1/24).]
VPN Route Distinguishing and Exchange Between PEs
[Diagram: same topology as the previous slide; I-PE holds VRF CustomerA-I (RD 100:1, RT 100:10) and VRF CustomerB-I (RD 100:2, RT 100:20), J-PE holds VRF CustomerA-J (RD 100:1, RT 100:10) and VRF CustomerB-J (RD 100:2, RT 100:20); an MP-BGP session runs between the PE loopbacks (lo0 3.0.0.1/32 and 3.0.0.2/32) over the IGP-routed (OSPF, IS-IS, ...) MPLS core.]
PE-to to-pe VPN Route Propagation PE router exports information from VRF to MP-BGP prefix uniqueness ensured using Route Distinguisher (64bit ID) VPN-V4 prefix = RD + IPv4 prefix Route exported with source VRF ID (route target) Multiprotocol (MP) ) ibgp i session between PE routers over MPLS backbone (P routers) Full mesh (route reflectors often used) Propagates VPNv4 routes BGP attributes identify site-of-origin and route target(s) Opposite PE router imports information from MP-BGP into VRF routes imported into particular VRFs according to BGP Route Target attribute values 64
MPLS VPN BGP Attributes
- Site of Origin (SOO)
  - Identifies the site where the route originated; avoids loops
- Route Target
  - In effect identifies the source VRF
  - Each VRF may be configured with which RT(s) it imports
Customer Route Advertisement from a PE Router (MP-BGP)
- PE router assigns the RT and RD based on the source VRF and SOO
- PE router assigns the VPN (MPLS) label
  - Identifies the particular VPN route (in the VPN site's routing table, i.e. in the VRF)
  - Used as the second label in the label stack; the top-of-stack label identifies the egress PE router
- The route's next hop is rewritten to the advertising PE router's loopback interface
- The MP-iBGP update is sent to the other PE routers
CE-to-PE Routing Information Exchange
- A CE router always exchanges routes with the VRF assigned to the interface connecting that CE router
- Static routing or directly connected networks, external BGP, IGP (RIPv2, OSPF)
- Multiple instances of the routing process (one per VRF) run on the PE router, or separate routing contexts within a single routing process
Overlapping VPNs
- A site (VRF) may belong to multiple VPNs, provided there is no address overlap
- Useful for shared server farms, extranets, Internet VRFs etc.
- Multiple RT imports configured for the particular VRF
- Typical usage both in SP networks and in DC cores
Overlapping VPNs Example
[Diagram: same topology as the earlier MPLS VPN example, but with per-site route targets: I-PE holds VRF CustomerA-I (RD 100:1, RT 100:11) and VRF CustomerB-I (RD 100:2, RT 100:21); J-PE holds VRF CustomerA-J (RD 100:1, RT 100:12) and VRF CustomerB-J (RD 100:2, RT 100:22).]