LARGE SCALE IP ROUTING LECTURE BY SEBASTIAN GRAF MODULE 07 - MPLS BASED LAYER 2 SERVICES 1 by Xantaro
MPLS BASED LAYER 2 VPNS
USING MPLS FOR POINT-TO-POINT LAYER 2 SERVICES

Why are Layer-2 VPNs needed?
- Layer-2 VPNs based on ATM or Frame Relay circuits have been around for a long time
- They are still very popular today for datacenter interconnects and wholesale solutions
- With ongoing network convergence, service providers can run a single infrastructure for both Layer-3 and Layer-2 services
- The customer has more control
  - no routing involved with the service provider
- Frame-based forwarding for Frame Relay, Ethernet, ATM, ...
- A Layer-2 circuit provides an end-to-end Layer-2 connection between two locations over an MPLS core

Standardization
- Layer-2 VPN (L2VPN) services are developed by the L2VPN Working Group of the IETF
  - the reference model is defined in RFC 4664 (Framework for Layer 2 Virtual Private Networks)
- Flavors of L2VPN include
  - Virtual Private LAN Services (VPLS): Layer-2 point-to-multipoint service
  - Virtual Private Wire Services (VPWS): Layer-2 point-to-point service
- The L2VPN WG is in charge of the service definition (control plane); transport technologies across a core network are defined by the Pseudowire Emulation Edge-to-Edge (PWE3) Working Group (data plane)
  - the pseudowire architecture is defined in RFC 3985
  - PWE3 develops standards for the encapsulation and service emulation of pseudowires (RFC 4448/4618/4619/4717/...)

Building Block: Pseudowire (PW)
- Pseudowires are defined by the IETF Pseudowire Emulation Edge-to-Edge (PWE3) working group
- A pseudowire (PW) is a mechanism that emulates the essential attributes of a native service while transporting it over a packet-switched network (PSN)
- Unlike LSPs, pseudowires are bidirectional
- Pseudowires support edge-to-edge emulation (network interworking), but no service interworking
- Pseudowire types defined in RFC 4446 include
  - Frame Relay DLCI (0x0001)
  - ATM AAL5 mode (0x0002)
  - ATM cell mode (0x0003)
  - Ethernet tagged mode (0x0004)
  - Ethernet port mode (0x0005)

Pseudowire Reference Model
- A pseudowire is a point-to-point connection between two attachment circuits (AC) connected to PE routers
- Native data units (bits, cells, packets) presented to the PW are encapsulated in a PW-PDU and carried across an MPLS tunnel
- These services are transparent, so frames are not changed in transit
  - in the case of Ethernet, PE routers do not perform any lookup or learning based on the MAC address

Pseudowire Setup and Maintenance
- The PWE3 architecture does not specify how pseudowires are set up and maintained; this is defined in other standards
- These standards are commonly referred to by their primary authors
  - Kireeti Kompella (Juniper Networks)
  - Luca Martini (Cisco Systems)
- There are two ways to do this across an MPLS core
  - Kompella draft (RFC 6624)
    - BGP signaling in the control plane
    - Martini encapsulation in the data plane
  - Martini draft (RFC 4447)
    - LDP signaling in the control plane
    - Martini encapsulation in the data plane

Terminology
- Customer Edge (CE) device: a router or a switch located at the customer site that provides access to the provider network
- Provider Edge (PE) router: connects to the customer device and maintains VPN-related information
- Provider (P) router: forwards traffic transparently (not VPN-aware)
[Figure: reference topology with customer sites RED (CE-R1, CE-R2) and BLUE (CE-B1, CE-B2), PE routers PE1, PE2, PE3 and P routers P1, P2, P3]

VPN Forwarding Table (VFT)
- Each VPN forwarding table is populated with information provisioned for the local CE device on each PE (one VFT per site)
  - local site ID (given by administrator)
  - remote site ID (given by administrator)
  - logical interfaces provisioned to the local CE device (given by administrator)
  - label base used to associate received traffic with one of the logical interfaces (automatically assigned by the PE router)
- The VFT for each VPN site is distributed to the remote PEs
[Figure: reference topology with one VFT per attached site on PE1, PE2 and PE3, exchanged over MP-iBGP sessions]

VFT and CE Device
- A VPN Forwarding Table (VFT) is provisioned at each PE router for each local CE device
- Each CE device within a VPN is assigned a CE ID that uniquely identifies it
- A CE device can have multiple logical interfaces (attachment circuits) associated with the VFT
  - in the case of Ethernet, VLANs may be used to separate them
- The PE generates a contiguous set of labels called a label block
  - the first label within the label block is called the label base
  - the number of labels within the label block is called the label range
- Why use label blocks and not a single label?
  - a single label would be sufficient for a point-to-point service, but a label block can signal multiple services with one BGP routing update
  - label blocks can later be reused for point-to-multipoint services, which need multiple labels

Updating VFTs
- The PE router first filters the received VFT information by route target
- Assume PE1 is connected to CE k, and remote PE2 advertised a label block with label base LB_m, label offset LO_m and label range LR_m for CE ID m
  1. Check that the encapsulation types match
  2. Check if k = m. If so, stop and issue an error
  3. Among the label blocks advertised by PE2, find the block with LO_m <= k < LO_m + LR_m
  4. Among all label blocks of CE k, find the block with LO_k <= m < LO_k + LR_k
  5. Find an appropriate tunnel label X to PE2
  6. Install the mapping of the local circuit to the VPN label
     - sending label: LB_m + k - LO_m
     - receiving label: LB_k + m - LO_k

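The label selection steps on this slide can be sketched in a few lines of Python. This is only an illustration: the class and function names, and the label values in the usage example, are assumptions made for this sketch and do not come from any router implementation. The encapsulation check and the tunnel label lookup are left out.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LabelBlock:
    """One advertised label block (illustrative structure, not a vendor API)."""
    label_base: int    # LB: first label in the block
    label_offset: int  # LO: first remote CE ID covered by the block
    label_range: int   # LR: number of labels in the block

    def covers(self, ce_id: int) -> bool:
        # LO <= id < LO + LR
        return self.label_offset <= ce_id < self.label_offset + self.label_range

    def label_for(self, ce_id: int) -> int:
        # LB + id - LO
        return self.label_base + ce_id - self.label_offset


def find_block(blocks: Iterable[LabelBlock], ce_id: int) -> Optional[LabelBlock]:
    """Return the first block that covers ce_id, or None."""
    return next((b for b in blocks if b.covers(ce_id)), None)


def compute_labels(k, blocks_of_k, m, blocks_of_m):
    """Return (sending_label, receiving_label) for local CE k and remote CE m."""
    if k == m:
        raise ValueError("CE ID collision: k == m")
    remote_block = find_block(blocks_of_m, k)  # block advertised by the remote PE
    local_block = find_block(blocks_of_k, m)   # block advertised by the local PE
    if remote_block is None or local_block is None:
        raise ValueError("no label block covers the peer CE ID")
    return remote_block.label_for(k), local_block.label_for(m)


# Hypothetical example: local CE 1, remote CE 2, one block of 8 labels each.
sending, receiving = compute_labels(
    1, [LabelBlock(label_base=1000, label_offset=1, label_range=8)],
    2, [LabelBlock(label_base=2000, label_offset=1, label_range=8)],
)
print(sending, receiving)
```

With these made-up values the sending label is taken from the remote block (2000 + 1 - 1) and the receiving label from the local block (1000 + 2 - 1).
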
Circuit Status Vector
- With a real cable, devices can detect whether the link to the connected system is up
- With L2VPNs, there is no longer a direct connection, but ideally a pseudowire should be able to simulate that behaviour
- A circuit is a bidirectional connection composed of two simplex connections. Each simplex connection has three segments
  - local attachment circuit
  - MPLS tunnel
  - remote attachment circuit
- Circuit monitoring requires the PE routers to monitor the status of both simplex connections
  - a PE router knows the status of its local attachment circuit and of the tunnel towards the remote PE
- The circuit status vector contains a single bit for each label in a label block
  - setting a bit to 1 indicates that either the local circuit or the LSP to the remote router is down
  - the receiving PE router is then able to report the failure to the attached CE device

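As a rough illustration of "one bit per label in a label block", the following Python sketch packs a list of up/down flags into a bit vector and reads a single bit back. The function names are invented for this example, and the most-significant-bit-first ordering is an assumption of the sketch, not something stated on the slide.

```python
def encode_status_vector(down_flags):
    """Pack one bit per label (1 = circuit or LSP down) into bytes.

    Bit order is MSB first within each byte (an assumption of this sketch).
    """
    vector = bytearray((len(down_flags) + 7) // 8)
    for index, down in enumerate(down_flags):
        if down:
            vector[index // 8] |= 0x80 >> (index % 8)
    return bytes(vector)


def is_down(vector, index):
    """Return True if the bit for the label at position `index` is set."""
    return bool(vector[index // 8] & (0x80 >> (index % 8)))


# Label block with 8 labels; only the circuit for the third label is down.
flags = [False, False, True, False, False, False, False, False]
vector = encode_status_vector(flags)
print(vector.hex(), is_down(vector, 2), is_down(vector, 0))
```

A receiving PE would look up the bit that belongs to its own label within the block and, if it is set, signal the failure towards its CE device.
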
Data Forwarding (1/5)
- CE-R2 sends an Ethernet frame to PE3
  - the CE device can be either a switch or a router
  - the frame can be either tagged or untagged
  - the PE router is agnostic to the Layer-3 protocol used
[Figure: reference topology; native Ethernet frame on the link from CE-R2 to PE3]

Data Forwarding (2/5)
- The PE router associates the incoming interface with a VFT and looks up the remote PE router and VC-ID
- The L2VPN-specific label (300 in this example) is pushed onto the packet depending on the VFT (learned via MP-BGP from PE1)
- The outer tunnel label (100 in this example) is pushed onto the packet depending on the egress PE router (learned via LDP or RSVP from P3)
[Figure: reference topology; packet leaves PE3 towards P3 with label stack 100 / 300 / Ethernet]

Data Forwarding (3/5)
- The packet is forwarded along the LSP
- P routers are not VPN-aware, so service label 300 is ignored
- Transit LSRs only perform label operations based on the top label
  - in this case P1 is the next hop on the shortest path from P3 to PE1
  - label 200 was learned via LDP or RSVP from P1
[Figure: reference topology; packet travels from P3 to P1 with label stack 200 / 300 / Ethernet]

Data Forwarding (4/5)
- Penultimate hop popping (PHP) is performed by the penultimate LSR
  - PE1 signals this by advertising label 3 (implicit null) for its loopback IP
  - this avoids a double lookup on PE1
[Figure: reference topology; packet travels from P1 to PE1 with label stack 300 / Ethernet]

Data Forwarding (5/5)
- The egress PE router does a label lookup to find the corresponding attachment circuit
  - it finds label 300, which it advertised to PE3 for this specific Layer-2 VPN
- The egress PE pops the label
- The native frame is forwarded out of the logical interface corresponding to the VFT
- From the perspective of the CE routers, this connection looks like a direct link between them (hence the name pseudowire)
[Figure: reference topology; native Ethernet frame on the link from PE1 to CE-R1]

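The five forwarding steps can be replayed as a small Python simulation of the label stack, using the label values from the slides (service label 300, tunnel labels 100 and 200, label 3 for penultimate hop popping). The function names and the attachment circuit name `ac-red` are hypothetical and exist only in this sketch.

```python
IMPLICIT_NULL = 3  # reserved label value; advertising it requests penultimate hop popping


def ingress_push(frame, service_label, tunnel_label):
    """Ingress PE: push the service label, then the tunnel label on top."""
    return {"labels": [tunnel_label, service_label], "payload": frame}


def transit_forward(packet, advertised_label):
    """Transit LSR: act on the top label only; the service label is not inspected."""
    below_top = packet["labels"][1:]
    if advertised_label == IMPLICIT_NULL:
        labels = below_top                       # penultimate hop: pop the top label
    else:
        labels = [advertised_label] + below_top  # ordinary hop: swap the top label
    return {"labels": labels, "payload": packet["payload"]}


def egress_deliver(packet, service_label_to_ac):
    """Egress PE: exactly one label is left; map it to the attachment circuit."""
    (service_label,) = packet["labels"]
    return service_label_to_ac[service_label], packet["payload"]


frame = b"native ethernet frame"
at_p3 = ingress_push(frame, service_label=300, tunnel_label=100)   # PE3 pushes 300, then 100
at_p1 = transit_forward(at_p3, advertised_label=200)               # P3 swaps 100 -> 200
at_pe1 = transit_forward(at_p1, advertised_label=IMPLICIT_NULL)    # P1 pops (PHP)
ac, delivered = egress_deliver(at_pe1, {300: "ac-red"})            # PE1 maps 300 to the circuit
print(at_p3["labels"], at_p1["labels"], at_pe1["labels"], ac)
```

The payload is never touched along the way, which mirrors the transparency of the pseudowire: only the label stack changes from hop to hop.
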
Configuration Prerequisites
- Checklist: which steps have to be configured before L2VPNs can be deployed?
- Basically the same as with MPLS-based Layer-3 VPNs:
  - choose and configure an IGP (e.g. OSPF or IS-IS)
  - configure MP-iBGP peering among PE routers (if BGP signaling is used)
    - enable the appropriate address family for Layer-2 VPNs
  - enable MPLS and the desired MPLS signaling protocol (LDP / RSVP) on PE and P routers
- PE routers perform all VPN-related configuration
- P routers are unaware of VPN services

VIRTUAL PRIVATE LAN SERVICES
USING MPLS FOR POINT-TO-MULTIPOINT LAYER 2 SERVICES

VPLS Overview
- Virtual Private LAN Services (VPLS) defines an architecture that allows an MPLS network to offer Layer-2 multipoint Ethernet services
- The service provider network emulates an IEEE Ethernet bridged network
  - virtual bridges are linked with MPLS pseudowires
- The data plane is the same as for point-to-point L2VPNs
  - basically, a VPLS instance is a collection of point-to-point L2VPNs with MAC learning enabled
- There are two standards for the VPLS control plane
  - one uses BGP for auto-discovery and signaling of PWs (RFC 4761)
  - the other uses LDP for signaling of PWs (RFC 4762)

Virtual Private LAN Services
- Private Ethernet network constructed over a shared infrastructure, which may span several metro areas
- Point-to-multipoint Ethernet connectivity in which the service provider network looks like an Ethernet broadcast domain
- From the perspective of the customer, all CE routers are connected to the same Layer-2 switch

VPLS Data Plane Operation
- BGP-signaled and LDP-signaled VPLS have the same data plane operation
- It is quite similar to the operation of ordinary Layer-2 switches
  - flooding, learning, aging, broadcast and multicast are mostly the same as for bridges
- One major difference is that PE-to-PE flooding uses ingress replication or point-to-multipoint LSPs (the latter are out of the scope of this presentation)
- Both assume a full mesh of pseudowires among the PEs
- This leads to the split-horizon forwarding rule
  - frames received from a PE via a pseudowire must not be forwarded to another pseudowire

VPLS Unicast Forwarding
- From the customer's point of view, a VPLS instance is a virtual switch, which includes MAC address learning
- If an Ethernet frame with an unknown destination MAC address is received on a local port, the frame is flooded to all other local ports and to all VPLS PE routers via the established pseudowires
- If an Ethernet frame with an unknown destination MAC address is received on a pseudowire, the frame is flooded to the local ports only (because the PEs are fully meshed)
- Receiving a frame with a given source MAC address enables the PE router to learn the location of this MAC address (either a local port or a pseudowire)

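The flooding behaviour, including the split-horizon rule, fits into one small Python function. The port and pseudowire names are placeholders chosen for this sketch; a real PE keeps this state per VPLS instance.

```python
def flood_targets(in_port, local_ports, pseudowires):
    """Return the ports to which an unknown-unicast or broadcast frame is flooded."""
    if in_port in pseudowires:
        # Split horizon: a frame that arrived on a pseudowire goes to local ports only.
        return sorted(local_ports)
    # Frame from a local port: all other local ports plus every pseudowire.
    return sorted(p for p in local_ports if p != in_port) + sorted(pseudowires)


local_ports = {"xe-0/0/0.0", "xe-0/0/1.0"}
pseudowires = {"pw-to-PE2", "pw-to-PE3"}

print(flood_targets("xe-0/0/0.0", local_ports, pseudowires))
print(flood_targets("pw-to-PE2", local_ports, pseudowires))
```

Because every PE has a pseudowire to every other PE, the ingress PE already reaches all remote PEs directly, so re-flooding into another pseudowire would only create duplicates or loops.
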
MAC Learning
- When a customer device sends its first frame, the source MAC address is unknown to the PE routers
  - receipt of the frame triggers the MAC learning process
  - the PFE (Packet Forwarding Engine) notifies the RE (Routing Engine) about the newly learned source MAC address
- The new source MAC address is programmed into the VPLS forwarding table and associated with the interface on which the frame was received
  - if the destination MAC address of a later frame matches this entry, the frame is forwarded to this interface
- If the frame is received by a remote PE router, the same MAC learning process takes place
  - instead of pointing to the local PE-CE interface, the MAC address points to the remote PE

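A minimal sketch of such a per-instance MAC table in Python follows. The class name, the port names and the use of `None` as the "unknown, flood" result are assumptions of this example; it ignores aging and the PFE/RE split.

```python
FLOOD = None  # lookup result meaning "destination unknown, flood the frame"


class VplsMacTable:
    """Toy MAC table for one VPLS instance on one PE (illustrative only)."""

    def __init__(self):
        self._entries = {}  # MAC address -> local interface or pseudowire

    def learn(self, src_mac, in_port):
        """Record (or refresh) where a source MAC address was seen."""
        self._entries[src_mac.lower()] = in_port

    def lookup(self, dst_mac):
        """Return the port for a destination MAC, or FLOOD if it is unknown."""
        return self._entries.get(dst_mac.lower(), FLOOD)


table = VplsMacTable()
table.learn("00:20:20:20:20:20", "pw-to-PE1")   # learned from a frame received over a pseudowire
table.learn("00:03:03:03:03:03", "xe-0/0/1.0")  # learned from a frame on a local PE-CE interface

print(table.lookup("00:20:20:20:20:20"))
print(table.lookup("00:01:01:01:01:01"))
```

The same `learn` call serves both cases on the slide: only the value stored differs, a local interface for locally attached stations and a pseudowire (that is, the remote PE) for everything learned across the core.
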
VPLS BGP Control Plane
- BGP for VPN auto-discovery and signaling
- Common framework for multiple VPN services; a single MP-BGP session may carry routes for:
  - IP-VPNs (2547bis)
  - L2 MPLS VPNs (draft-kompella-ppvpn-l2vpn)
  - IPv6 VPN (draft-ietf-ppvpn-bgp-ipv6-vpn)
  - VPLS (draft-kompella-ppvpn-vpls)
- BGP scalability, redundancy and operational simplicity
  - route reflectors, route refresh, etc.
- Supports multi-AS / multi-provider operations
- The forwarding plane is the same as for L2VPNs (defined by PWE3)

VPN Forwarding Table
- One VFT is configured per PE per VPLS instance using
  - Route Distinguisher, e.g. 65000:100
    - may be common for a VPLS instance across the network
  - Layer-2 encapsulation, e.g. Ethernet
    - must be the same on all PE routers for a given VPLS instance
  - VPLS Edge ID (VE ID), e.g. 3
    - each site must have a unique identifier
  - Route Target, e.g. target:65100:123
    - if all sites should be able to talk to each other, all PE routers use the same value here
    - hub-and-spoke VPLS instances can be created with modified route targets
- Each PE automatically allocates a VPN label block; its labels are used as demultiplexers
- VFT information is distributed using MP-BGP
- The VPLS Forwarding Table (VFT) on each PE holds all the information learned from remote PE routers

Label Allocation
- When a VPLS instance is created, unique label ranges called label blocks are allocated on a per-box basis
- The label range, together with other information, is advertised to all remote PE routers using MP-BGP
  - each advertised prefix is 96 bits long
  - the label range (or VE block size) determines the number of remote CE sites supported for this instance
- Example: 10.10.10.1:123:20:17/96
  - 10.10.10.1:123 is the route distinguisher
  - 20 is the local site ID or VPLS Edge (VE) ID
  - 17 is the VE block offset (VBO)

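To make the structure of such a prefix concrete, here is a small Python sketch that splits the textual form shown on the slide into its fields. The function name is invented for this example, and it parses only the display string, not the BGP wire format. The 96 bits are accounted for as a 64-bit route distinguisher plus 16 bits each for VE ID and VE block offset.

```python
RD_BITS, VE_ID_BITS, VBO_BITS = 64, 16, 16  # 64 + 16 + 16 = 96 bits


def parse_vpls_prefix(text):
    """Split '<rd>:<ve-id>:<vbo>/<length>' into its components."""
    prefix, length = text.split("/")
    # The RD itself contains a colon, so split the last two fields off from the right.
    rd, ve_id, vbo = prefix.rsplit(":", 2)
    return {"rd": rd, "ve_id": int(ve_id), "vbo": int(vbo), "length": int(length)}


fields = parse_vpls_prefix("10.10.10.1:123:20:17/96")
print(fields)
print(RD_BITS + VE_ID_BITS + VBO_BITS)
```

Label base and block size are not part of the 96-bit prefix shown here; on the router they appear as separate attributes of the route (for example `Label-base` and `range` in the detailed route output).
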
Label Blocks
- A label block is defined by a label base (LB) and a VE block size (VBS)
  - this results in a contiguous set of labels (LB, LB+1, ..., LB+VBS-1)
  - the label block is advertised via MP-BGP to remote PE routers
- To avoid allocating large blocks at once, a VE block offset (VBO) is introduced; each block then serves the set of VE IDs (VBO, VBO+1, ..., VBO+VBS-1)
- All PEs are configured with unique site IDs
- Labels are calculated from the local site ID X and the remote parameters:
  - if VBO <= X < VBO+VBS for a received block, the outgoing label is LB+X-VBO
  - if the remote VE ID V is part of a locally advertised set, the incoming label is LB+V-VBO (with LB and VBO of the local block)
  - if no corresponding label block is found, a new announcement must be sent

VPLS Label Block Structure
- Example with two label blocks
  - the first label block uses label 64 for site 1, 65 for site 2, ..., 71 for site 8
  - the second label block uses label 72 for site 9, 73 for site 10, ..., 79 for site 16
- Neither the label bases nor the offsets of successive blocks have to be contiguous

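The two example blocks can be checked with a short Python function that applies the rule "label = LB + VE ID - VBO for the block that covers the VE ID". The function name and the tuple layout are choices made for this sketch. The offsets 1 and 9 and the block size 8 are inferred from the site numbers on the slide.

```python
def pick_label(blocks, ve_id):
    """blocks: iterable of (label_base, ve_block_offset, ve_block_size).

    Returns the label for ve_id, or None if no block covers it
    (in which case a further block would have to be announced).
    """
    for label_base, vbo, vbs in blocks:
        if vbo <= ve_id < vbo + vbs:
            return label_base + ve_id - vbo
    return None


blocks = [
    (64, 1, 8),  # first block: sites 1..8 use labels 64..71
    (72, 9, 8),  # second block: sites 9..16 use labels 72..79
]

for site in (1, 8, 9, 16, 17):
    print(site, pick_label(blocks, site))
```

Site 17 falls outside both blocks, which corresponds to the case where a new announcement is needed.
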
Sample VPLS Topology
- We will use the following topology as an example for label allocation and packet forwarding
- We will take a closer look from the perspective of PE1
  - which label should PE1 use when it wants to send frames to the end station behind PE3?
  - which label does PE1 expect when PE3 sends traffic to the end station behind PE1?
[Figure: PE1 (site ID 20, station 00:20:20:20:20:20), PE2 (site ID 1, station 00:01:01:01:01:01) and PE3 (site ID 3, station 00:03:03:03:03:03)]

Understanding Label Calculation: Outgoing Label
- The following excerpt shows the routes that PE1 received from PE3 for this specific VPLS instance

    xuser@pe1> show route receive-protocol bgp 10.10.10.3 detail

    VPLS.l2vpn.0: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
    * 10.10.10.3:1:3:1/96 (1 entry, 1 announced)
         Import Accepted
         Route Distinguisher: 10.10.10.3:1
         Label-base: 262145, range: 8
         Nexthop: 10.10.10.3
         Localpref: 100
         AS path: I
         Communities:
    * 10.10.10.3:1:3:17/96 (1 entry, 1 announced)
         Import Accepted
         Route Distinguisher: 10.10.10.3:1
         Label-base: 262153, range: 8
         Nexthop: 10.10.10.3
         Localpref: 100
         AS path: I
         Communities:

Label Calculation
- PE1 uses local site ID 20
- PE1 receives two VPLS prefixes from PE3
  - the 1st advertisement covers site IDs 1..8
  - the 2nd advertisement covers site IDs 17..24
- Because the local VE ID is 20, PE1 uses the 2nd one, with label base 262153
- Outgoing label = 262153 + 20 - 17 = 262156

Understanding Label Calculation: Incoming Label
- The following excerpt shows the routes that PE1 advertised to PE3 for this specific VPLS instance

    xuser@pe1> show route advertising-protocol bgp 10.10.10.3 detail

    VPLS.l2vpn.0: 4 destinations, 4 routes (4 active, 0 holddown, 0 hidden)
    * 10.10.10.1:1:20:1/96 (1 entry, 1 announced)
         BGP group internal type Internal
         Route Distinguisher: 10.10.10.1:1
         Label-base: 262560, range: 8
         Nexthop: Self
         Flags: Nexthop Change
         Localpref: 100
         AS path: [65000] I
         Communities:
    * 10.10.10.1:1:20:17/96 (1 entry, 1 announced)
         BGP group internal type Internal
         Route Distinguisher: 10.10.10.1:1
         Label-base: 262590, range: 8
         Nexthop: Self
         Flags: Nexthop Change
         Localpref: 100
         AS path: [65000] I
         Communities:

Label Calculation
- PE3 uses site ID 3 (the remote site ID from the perspective of PE1)
- PE1 advertises two VPLS prefixes to PE3
  - the 1st advertisement covers site IDs 1..8
  - the 2nd advertisement covers site IDs 17..24
- Because the remote VE ID is 3, PE1 uses the 1st label block, with label base 262560
- Incoming label = 262560 + 3 - 1 = 262562

VPLS MAC Learning Step 1
- Consider the following network with 3 PE routers
  - we assume that all sites are part of the same VPLS instance (same route targets on the PEs)
  - no traffic has passed the MPLS core yet
  - each site has a local system attached
  - for simplicity, the site ID can be recognized in the MAC addresses
  - PE1 and PE3 use the label values shown on the previous slides
[Figure: PE1 (site ID 20, station 00:20:20:20:20:20), PE2 (site ID 1, station 00:01:01:01:01:01) and PE3 (site ID 3, station 00:03:03:03:03:03)]

VPLS MAC Learning Step 1
- The station at site 20, connected to PE1, broadcasts a frame (e.g. an ARP request)
- PE1 learns the source MAC address and remembers the interface on which the frame was received (xe-0/0/0.0 in this case)
- PE1 knows that it needs to forward the frame to PE2 and PE3 (flooding)
  - the frame can be duplicated by PE1, which is known as ingress replication
  - alternatively, point-to-multipoint LSPs may be used (out of scope of this presentation)
[Figure: sample topology; broadcast frame (S: 00:20:20:20:20:20, D: ff:ff:ff:ff:ff:ff) arrives at PE1; PE1 MAC table: 00:20:20:20:20:20 -> xe-0/0/0.0; the remote MAC table is still empty]

VPLS MAC Learning Step 2
- We assume that PE1 uses ingress replication
- A frame is sent towards PE3 using outer label 100345 and inner label 262156
  - label 100345 was learned via LDP or RSVP from the next hop of PE1 towards PE3
  - label 262156 was learned via BGP from PE3
    - label base = 262153 (the label to be used by the first site identified by the offset)
    - offset = 17 => site 17 should use label 262153 to send frames to PE3
    - label range = 8 => this label block is valid for sites 17 to 24
- A similar operation is done to send a copy from PE1 to PE2, which is not shown here
[Figure: sample topology; PE1 sends label stack 100345 / 262156 / Layer 2 payload towards PE3; PE1 MAC table: 00:20:20:20:20:20 -> xe-0/0/0.0; the remote MAC table is still empty]

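Ingress replication can be pictured as "one copy per remote PE, each with its own label stack". The Python sketch below uses the two label values from this slide for PE3. The values for PE2 are hypothetical placeholders, since the slide does not show them, and the function name is invented for this example.

```python
def ingress_replicate(frame, remote_pes):
    """Create one labelled copy of `frame` per remote PE.

    remote_pes maps a PE name to (tunnel_label, service_label);
    the tunnel label ends up on top of the stack.
    """
    copies = []
    for pe, (tunnel_label, service_label) in remote_pes.items():
        copies.append({"to": pe, "labels": [tunnel_label, service_label], "payload": frame})
    return copies


remote_pes = {
    "PE3": (100345, 262156),  # outer and inner label as given on the slide
    "PE2": (100222, 262300),  # hypothetical values, not taken from the slides
}

for copy in ingress_replicate(b"broadcast frame", remote_pes):
    print(copy["to"], copy["labels"])
```

The payload is identical in every copy; only the label stack differs, because each remote PE advertised its own service label and is reached through its own tunnel.
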
VPLS MAC Learning Step 3
- Upon receiving the MPLS frame, PE3 inspects the service label and finds a value of 262156
  - PE3 knows from the label value that this frame was sent by PE1 into this specific VPLS instance from site ID 20
  - PE3 therefore learns that MAC address 00:20:20:20:20:20 is located behind PE1
- PE2 also learns this MAC address, as the frame was replicated by PE1 to both PE2 and PE3
  - PE2 likewise learns from the label value that this frame was sent by PE1
  - for this copy, PE1 uses the service label advertised by PE2
[Figure: sample topology; broadcast frame (S: 00:20:20:20:20:20, D: ff:ff:ff:ff:ff:ff) delivered by PE3; PE1 MAC table: 00:20:20:20:20:20 -> xe-0/0/0.0; PE3 MAC table: 00:20:20:20:20:20 -> PE1]

VPLS MAC Learning Step 4
- Next we assume that the station at site 3 replies with a unicast frame
- As a first step, PE3 puts the source MAC address of the frame into its local VPLS table, pointing to the local interface (xe-0/0/1.0 in this case)
- PE3 consults its VPLS table for this instance
  - it knows that MAC address 00:20:20:20:20:20 is behind site 20, which is connected to PE1
- PE3 forwards this frame only to PE1, not to PE2, because it knows where this MAC address is located
[Figure: sample topology; unicast frame (S: 00:03:03:03:03:03, D: 00:20:20:20:20:20) arrives at PE3; PE1 MAC table: 00:20:20:20:20:20 -> xe-0/0/0.0; PE3 MAC table: 00:20:20:20:20:20 -> PE1, 00:03:03:03:03:03 -> xe-0/0/1.0]

VPLS MAC Learning Step 5
- PE1 inspects the label value 262562 and finds that it assigned this label for packets sent from site 3 to its local site
  - hence, PE1 knows that this source MAC address is behind PE3
- PE2 does not see this frame and hence does not learn MAC address 00:03:03:03:03:03
  - if PE2 does not see further packets from station 00:20:20:20:20:20, it will flush this MAC address from its table (aging principle)
  - the timeout depends on the platform but is often set to 5 minutes by default
[Figure: sample topology; PE3 sends label stack 100987 / 262562 / Layer 2 payload towards PE1; PE1 MAC table: 00:20:20:20:20:20 -> xe-0/0/0.0, 00:03:03:03:03:03 -> PE3; PE3 MAC table: 00:20:20:20:20:20 -> PE1, 00:03:03:03:03:03 -> xe-0/0/1.0]

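The aging principle can be sketched in Python as a table that stores a last-seen timestamp per MAC address and drops entries that have not been refreshed within the timeout. The class name and the explicit `now` parameter are choices made for this example. The default of 300 seconds mirrors the "often 5 minutes" remark above and is not a guaranteed platform default.

```python
class AgingMacTable:
    """Toy MAC table with aging (illustrative only; times are plain seconds)."""

    def __init__(self, timeout=300.0):
        self.timeout = timeout
        self._entries = {}  # MAC address -> (port, time the MAC was last seen)

    def learn(self, mac, port, now):
        """Learn a MAC address or refresh its timestamp."""
        self._entries[mac] = (port, now)

    def expire(self, now):
        """Remove and return all MAC addresses not seen for `timeout` seconds."""
        stale = [mac for mac, (_, seen) in self._entries.items()
                 if now - seen >= self.timeout]
        for mac in stale:
            del self._entries[mac]
        return stale

    def lookup(self, mac):
        entry = self._entries.get(mac)
        return entry[0] if entry else None


pe2 = AgingMacTable()
pe2.learn("00:20:20:20:20:20", "pw-to-PE1", now=0.0)  # learned from the flooded broadcast
print(pe2.expire(now=299.0), pe2.lookup("00:20:20:20:20:20"))
print(pe2.expire(now=300.0), pe2.lookup("00:20:20:20:20:20"))
```

This is the situation of PE2 in the walk-through: it learned the address from the initial broadcast, never sees the unicast conversation that follows, and therefore ages the entry out.
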
VPLS Multihoming
- VPLS multihoming can be used to provide redundancy
- Care must be taken to avoid Layer-2 loops
- The VPLS multihoming implementation is based on BGP signaling, i.e. it is not available with LDP signaling
- Non-active interfaces are blocked

VPLS Baseline Configuration
- All provider routers (PE and P routers)
  - run a link-state routing protocol between all routers (OSPF or IS-IS)
  - create a full mesh of LSPs between the PE routers using either LDP or RSVP
- PE routers only
  - set up BGP peering with family l2vpn for VPLS route exchange
  - optionally use LDP as a signaling protocol
  - create a VPLS routing instance
  - configure the PE-CE link