Ethernet VPN (EVPN) in Data Center

Ethernet VPN (EVPN) in Data Center
Description and Design Considerations
Vasilis Stavropoulos, Sparkle GR

EVPN in Data Center
- The necessity for EVPN (what it is, which problems it solves)
- EVPN with MPLS transport (RFC 7432)
- EVPN with VXLAN transport (draft-ietf-bess-evpn-overlay-07)
- Design considerations
- Configuration examples (Junos)
- L4-L7 services integration

Data Center L2 issues
- Traditionally in data centers (DC), tenant separation is performed at L2 with VLANs.
- This introduces spanning-tree limitations and dangers: data-plane flooding, broadcast storms, partially used uplinks.
- Slow recovery times due to STP convergence.
- Potential scalability problems imposed by the maximum number of VLANs (4096).
- Proprietary vendor solutions (vPC, MC-LAG) are needed to bypass spanning-tree limitations.

EVPN Benefits
- EVPN brings MAC learning into the control plane via another extension (EVPN signaling) of our favorite protocol, BGP.
- It allows tunneling L2 traffic (overlay) through an IP fabric (underlay).
- Faster convergence times.
- Service-provider-level scalability (route reflectors).
- All-active multihoming from hosts to the network without vendor-proprietary solutions.
- Anycast gateway: an identical gateway (IP/MAC) for all hosts/VMs in the fabric, leading to reduced ARP flooding and traffic optimization.
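
As a minimal sketch of what "EVPN signaling via BGP" looks like in Junos, the following iBGP group (addresses and group name are illustrative, not from the slides) enables the EVPN address family between the loopbacks of the leaves:

```
protocols {
    bgp {
        group IBGP-EVPN {
            type internal;
            local-address 10.10.10.1;   /* this leaf's loopback (illustrative) */
            family evpn {
                signaling;              /* enables the EVPN NLRI over this session */
            }
            neighbor 10.10.10.2;        /* remote leaf or route reflector (illustrative) */
        }
    }
}
```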

EVPN Terminology
- EVI: EVPN instance, the instance that spans all PEs participating in the specific EVPN.
- ES: Ethernet Segment, the connection between a host and the PEs; in the case of active/active uplinks, an ES represents the link-aggregation set.
- ESI: Ethernet Segment Identifier, which uniquely identifies the connected host on the PEs; it has a zero value for single-homed hosts and a non-zero, unique value for multi-homed hosts.
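
In Junos the ESI is configured on the aggregated Ethernet interface facing the host. A hedged sketch for the all-active case (the ESI value, interface name, and LACP system-id are illustrative; they must match on all PEs sharing the segment so the host sees a single LAG):

```
interfaces {
    ae0 {
        esi {
            00:11:11:11:11:11:11:11:11:11;  /* illustrative; non-zero => multi-homed ES */
            all-active;                     /* all PEs on this ES forward traffic */
        }
        aggregated-ether-options {
            lacp {
                active;
                system-id 00:00:00:01:01:01; /* same on both PEs: host sees one LACP partner */
            }
        }
    }
}
```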

EVPN Terminology: Route Types
- Route Type 1: Ethernet Auto-Discovery (AD) route. Provides auto-discovery for multi-homed hosts and represents the ESI (also known as the mass-withdraw route).
- Route Type 2: MAC/IP Advertisement route. Allows end hosts' IP and MAC addresses to be advertised within the EVPN network layer reachability information (NLRI), enabling control-plane learning of end systems' MAC addresses.
- Route Type 3: Inclusive Multicast Ethernet Tag route. Sets up a path for broadcast, unknown unicast, and multicast (BUM) traffic from a PE device to the remote PE devices on a per-VLAN, per-ESI basis (ingress-replication method).
- Route Type 4: Ethernet Segment route. Needed in multihomed (active/active) scenarios to elect the Designated Forwarder PE; a Designated Forwarder is elected per ESI to handle BUM traffic.

EVPN Network (MPLS transport)
- IP fabric (Clos) with MPLS enabled.
- iBGP between the leaf routers with the EVPN signaling extension; OSPF as the IGP; LDP or RSVP as the MPLS signaling protocol.
- MAC and MAC/IP advertisement is achieved through MP-BGP (control-plane learning).
- VMs of Host1 and Host2 believe they are on the same broadcast domain, although an IP fabric sits in the middle.
- Anycast GW offers transparent VM mobility between hosts.
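
The OSPF-plus-LDP underlay described above can be sketched in Junos as follows (interface names are illustrative):

```
protocols {
    ospf {                          /* IGP: distributes the loopbacks used as BGP next hops */
        area 0.0.0.0 {
            interface ge-0/0/2.0;   /* fabric-facing link (illustrative) */
            interface lo0.0 {
                passive;            /* advertise the loopback without forming adjacencies on it */
            }
        }
    }
    ldp {                           /* MPLS signaling: labels towards the remote PE loopbacks */
        interface ge-0/0/2.0;
    }
}
```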

EVPN Network (MPLS transport)
- The ESI is the same on Leaf-1 and Leaf-2, which simplifies link aggregation towards two distinct physical switches/routers (no vPC, MC-LAG, etc.).
- Via Route Type 1, Leaf-1 learns that the MACs of Host2's VMs are behind both Leaf-2 and Leaf-3, so it load-balances traffic towards them.
- Route Type 2 carries the individual MAC/IP advertisements through BGP.
- The ESI also yields faster convergence: if the Host2-Leaf3 link goes down, Leaf-3 withdraws its Type 1 route and all related MACs are purged immediately from the other PEs.

EVPN Network: New Extended Communities
- MAC Mobility extended community: carries a sequence number that helps PEs withdraw old MAC/IP routes during VM relocations between hosts.
- Default Gateway extended community: carried by the MAC/IP route to indicate that the route is associated with a default GW. Alternatively, manually configure the same gateway IP/MAC per interface on all PEs.
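
One way to get an identical gateway on every PE in Junos is a virtual gateway address on the IRB interface; a hedged sketch with illustrative addresses, which pairs with the `default-gateway do-not-advertise` knob shown in the configuration examples later:

```
interfaces {
    irb {
        unit 100 {
            family inet {
                address 192.168.10.2/24 {                  /* per-PE unique IRB address */
                    virtual-gateway-address 192.168.10.1;  /* identical anycast GW on all PEs */
                }
            }
        }
    }
}
```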

EVPN Network (MPLS transport): Service Types
EVPN VLAN-based (a different instance per VLAN), 1:1 mapping:
- VLAN 10 --- EVI 10 --- VLAN 10 or translated
- VLAN 20 --- EVI 20 --- VLAN 20 or translated
EVPN VLAN-aware bundle (same instance, different bridge domains):
- VLAN 10 --- EVI 10 (bridge-domain 10) --- VLAN 10 or translated
- VLAN 20 --- EVI 10 (bridge-domain 20) --- VLAN 20 or translated

EVPN Network (VXLAN transport)
- Leaf switches are usually lower-spec devices with missing or limited features; MPLS is not popular in the enterprise world and is not supported by hypervisors.
- So EVPN with VXLAN transport is the most popular choice for the overlay.
- It provides a theoretical upper limit of 16.7M VNIs (VXLAN Network Identifiers, a 24-bit field in the header), compared to 4096 VLANs.

EVPN Network (VXLAN transport)
- VXLAN provides L2 overlay tunneling by encapsulating MAC frames in IP/UDP, creating an independent overlay network over the IP fabric.
- It uses VXLAN Tunnel End Point (VTEP) interfaces, in hypervisors or physical switches, to perform this encapsulation.
- A VTEP is a function with two interfaces: one L2 interface towards the LAN segment (hosts/VMs) and one L3 interface towards the IP fabric.
- VLAN-to-VXLAN mapping happens on the LAN side before encapsulation.
- The initial VXLAN implementation used a flood-and-learn mechanism over multicast for VTEP discovery in the fabric; this is neither scalable nor elegant, since it requires enabling multicast in the DC for this reason alone.
- EVPN solves this by enabling VTEP discovery through control-plane learning (BGP).

VXLAN + VTEP (figures: Cisco Nexus 9000 VXLAN white paper diagrams)
https://www.cisco.com/c/dam/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-729383.doc/_jcr_content/renditions/whitepaper-c11-729383_2.jpg
https://www.cisco.com/c/dam/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-729383.doc/_jcr_content/renditions/white-paperc11-729383_1.jpg

EVPN Network (VXLAN transport)
- IP fabric (Clos) without MPLS.
- VTEP IP discovery through the MP-BGP EVPN control plane.
- MAC frames are encapsulated in UDP/IP before being transported through the IP fabric.
- The de-encapsulation process takes place at the remote VTEP.

EVPN Network (VXLAN transport: VTEP function)

Local VTEP forwarding table:

  MAC                 VXLAN id   Remote VTEP IP
  ab:cd:ef:12:34:56   5001       10.10.10.1
  ac:dd:11:22:33:aa   5010       10.10.10.2
  be:af:12:ac:22:ac   5020       10.10.10.3

VLAN-to-VXLAN mapping (LAN side):

  VLAN   VXLAN
  501    5001
  510    5010
  520    5020

EVPN Network (VLAN vs VXLAN)

MPLS/VLAN-based instance:

root@vmx1> show configuration routing-instances EVPN-100
instance-type virtual-switch;
route-distinguisher 10.10.10.1:100;
vrf-import VL100-vrf-import;
vrf-target target:100:100;
protocols {
    evpn {
        extended-vlan-list 100-101;
        default-gateway do-not-advertise;
    }
}
bridge-domains {
    VL-100 {
        vlan-id 100;
        interface ge-0/0/1.100;
        routing-interface irb.100;
    }
    VL-101 {
        vlan-id 101;
        interface ge-0/0/1.101;
        routing-interface irb.101;
    }
}

VXLAN-based instance:

root@vmx1> show configuration routing-instances EVPN-100
vtep-source-interface lo0.0;
instance-type virtual-switch;
route-distinguisher 10.10.10.1:100;
vrf-target target:100:100;
protocols {
    evpn {
        encapsulation vxlan;
        extended-vni-list 1000-1010;
        default-gateway do-not-advertise;
    }
}
bridge-domains {
    VL-100 {
        vlan-id 100;
        interface ge-0/0/1.100;
        vxlan {
            vni 1000;
            ingress-node-replication;
        }
    }
}

EVPN Network
- Even with VXLAN there are limitations on the merchant-silicon switches usually used as ToR.
- Most of the smart things take place at the spine level, where all the features are available (e.g. VXLAN L3 gateway).
- This leads to more complex scenarios and configurations across different types of equipment.
- Exercise: collapse the leaf architecture into the hypervisor (a virtual router as a leaf) and proceed with VLAN IDs and MPLS transport as an example.

EVPN Network

EVPN packet walkthrough
- VM-B sends an ARP request for the IP of VM-A.
- The packet is flooded to all PEs participating in the EVI using the Type 3 route (ingress replication) learned via BGP.
- The packet reaches VM-A, which replies to the ARP request with its own IP address.
- The reply is unicast and is sent only to the specific remote PE, thanks to MAC learning from MP-BGP (Route Type 2, MAC/IP).
- Different MPLS labels are allocated for RT-2 and RT-3.

EVPN routes

show route advertising-protocol bgp 10.10.10.1 table EVPN-100.evpn.0 detail

Route Type 2 (MAC only):

* 2:10.10.10.2:100::100::00:50:56:95:5d:11/304 (1 entry, 1 announced)
 BGP group IBGP type Internal
     Route Distinguisher: 10.10.10.2:100
     Route Label: 299776
     ESI: 00:00:00:00:00:00:00:00:00:00   ------> single-homed
     Nexthop: Self
     Flags: Nexthop Change
     Localpref: 100
     AS path: [65001] I
     Communities: target:100:100

Route Type 3:

* 3:10.10.10.2:100::100::10.10.10.2/304 (1 entry, 1 announced)
 BGP group IBGP type Internal
     Route Distinguisher: 10.10.10.2:100
     Route Label: 299785
     PMSI: Flags 0x0: Label 299785: Type INGRESS-REPLICATION 10.10.10.2
     Nexthop: Self
     Flags: Nexthop Change
     Localpref: 100
     AS path: [65001] I
     Communities: target:100:100

Route Type 2 (MAC/IP):

* 2:10.10.10.2:100::100::00:50:56:95:5d:11::192.168.10.10/304 (1 entry, 1 announced)
 BGP group IBGP type Internal
     Route Distinguisher: 10.10.10.2:100
     Route Label: 299776
     ESI: 00:00:00:00:00:00:00:00:00:00
     Nexthop: Self
     Flags: Nexthop Change
     Localpref: 100
     AS path: [65001] I
     Communities: target:100:100

Reading a Type 2 (MAC/IP advertisement) route:
  2                  : route type (MAC/IP advertisement)
  10.10.10.1:100     : RD
  00:50:56:95:5d:11  : MAC address of the VM
  192.168.10.11      : IP of the VM
  100                : VLAN ID

EVPN routes

root@vmx1> show route table EVPN-100
2:10.10.10.1:100::100::00:50:56:95:20:bf::192.168.10.11/304
        *[EVPN/170] 00:01:42
           Indirect
2:10.10.10.2:100::100::00:50:56:95:5d:11::192.168.10.10/304
        *[BGP/170] 00:01:40, localpref 100, from 10.10.10.2
           AS path: I, validation-state: unverified
         > to 10.20.30.2 via ge-0/0/2.0

root@vmx2> show route table EVPN-100
2:10.10.10.1:100::100::00:50:56:95:20:bf::192.168.10.11/304
        *[BGP/170] 00:03:28, localpref 100, from 10.10.10.1
           AS path: I, validation-state: unverified
         > to 10.20.30.1 via ge-0/0/2.0
2:10.10.10.2:100::100::00:50:56:95:5d:11::192.168.10.10/304
        *[EVPN/170] 00:03:27
           Indirect

EVPN vLeaf
- Each vPE/leaf has a pretty much identical configuration, which may be templated/automated.
- However, special care is needed to optimize the resources (CPU, memory, network).
- Various techniques optimize compute resources (NUMA awareness, CPU pinning, ..).
- The same applies to the networking part (PCI pass-through, SR-IOV, DPDK, .).

L4-L7 Services integration
- To route traffic outside the IP fabric while maintaining the desired multi-tenancy, we need to implement L3 VRFs.
- These VRFs have different RDs and RTs than the EVPN instances, but contain the same routing interface, which continues to be the default GW per tenant.
- So for each VLAN we have one bridge domain in the EVPN instance and one L3 VRF, which contains e.g. a static or dynamic route towards the outside of the fabric, via another edge device (firewall).

L4-L7 Services integration

L4-L7 Services integration
- A tenant may use its own vFirewall or the provider's.
- The tenant's default GW remains the leaf (VRF).
- The inside interface of the vFW is terminated on a different port, participating only in the L3 VRF.
- This ensures vMotion of the VMs independently of the vFW, plus more flexibility in inter-VLAN forwarding (VRF import/export policies).
- EVPN instance for east/west traffic, VRF instance for routing outside the fabric.

L4-L7 Services integration

L3 VRF:

instance-type vrf;
interface ge-0/0/3.0;   ---- > vFW inside interface
interface irb.100;
route-distinguisher 10.10.10.1:10;
vrf-target target:100:10;
vrf-table-label;
routing-options {
    static {
        route 0.0.0.0/0 next-hop x.x.x.x;
    }
}
protocols {
    ospf {
        area 0.0.0.0 {
            interface ge-0/0/3.0 {
                metric 100;
            }
        }
    }
}

EVPN instance:

instance-type virtual-switch;
route-distinguisher 10.10.10.1:100;
vrf-import VL100-vrf-import;
vrf-target target:100:100;
protocols {
    evpn {
        extended-vlan-list 100-101;
        default-gateway do-not-advertise;
    }
}
bridge-domains {
    VL-100 {
        vlan-id 100;
        interface ge-0/0/1.100;   ------ > LAN side (tenant VMs)
        routing-interface irb.100;
    }
    VL-101 {
        vlan-id 101;
        interface ge-0/0/1.101;
        routing-interface irb.101;
    }
}

L4-L7 Services integration
- One vLeaf per node without a centralized lifecycle manager could be a problem, depending on the scale.
- However, the configuration per vLeaf is similar and can be (more) easily templated and automated.
- EVPN with MPLS transport could work at DC level for small/medium design scenarios: repeated rack configuration using the same VLAN IDs, and easier integration with the rest of the service-provider network, especially for potential Data Center Interconnect (DCI) needs.

Summary / References
- Legacy DC designs with L2 domains (VLANs) using spanning tree have long been considered obsolete, for all the well-known reasons.
- Intermediate solutions with vendor-proprietary protocols (vPC etc.) reduce the STP topology and better utilize uplinks; however, limitations remain, e.g. in routing-protocol usage.
- EVPN brings the control plane into the game of MAC learning, eliminating the need for proprietary solutions and, of course, spanning tree.
- EVPN/MPLS or EVPN/VXLAN? EVPN/VXLAN in the DC and EVPN/MPLS at the core/SP is the trend, while other encapsulation methods are also available.
- https://tools.ietf.org/html/rfc7432 (BGP MPLS-Based Ethernet VPN)
- https://tools.ietf.org/html/draft-ietf-bess-evpn-overlay-07