MP-BGP VXLAN, ACI & Demo
Brian Kvisgaard, System Engineer, CCIE SP #41039
November 2017

Datacenter solutions
Architectures: Programmable Fabric; Classic Ethernet
VXLAN BGP EVPN, standards-based
Cisco DCNM automation
Modern NX-OS with enhanced NX-APIs
Automation ecosystem (Puppet, Chef, Ansible, etc.)
Common NX-API across N2K-N9K
Audience: service providers / DIY fabric; classic datacenters

Data Center Growth
[Diagram: L3 and L2 network layers; ACI / MP-BGP EVPN (VXLAN)]

Why VXLAN Overlay?
Design needs: any workload anywhere (VLANs are limited by L3 boundaries); VM mobility; scale above 4k segments (VLAN limitation); secure multi-tenancy.
VXLAN delivers: any workload anywhere, across Layer 3 boundaries; seamless VM mobility; scale up to 16M segments; traffic and address isolation.
[Diagram: VXLAN overlay connecting five VTEPs]

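The "4k" and "16M" figures above follow from the width of the segment identifier in each header. A minimal sketch of the arithmetic, assuming the 12-bit VLAN ID of IEEE 802.1Q and the 24-bit VNI of RFC 7348 (the constant and function names are illustrative only):

```python
# Segment ID field widths (assumptions taken from the public standards,
# not from the slide itself).
VLAN_ID_BITS = 12    # IEEE 802.1Q VLAN ID
VXLAN_VNI_BITS = 24  # RFC 7348 VXLAN Network Identifier


def id_space(bits: int) -> int:
    """Number of distinct identifiers an unsigned field of `bits` bits can hold."""
    return 2 ** bits


vlan_ids = id_space(VLAN_ID_BITS)    # the "4k" limit
vni_ids = id_space(VXLAN_VNI_BITS)   # the "16M" segments

print(f"VLAN IDs: {vlan_ids}, VXLAN VNIs: {vni_ids}")
```

A few values in each range are reserved in practice, so the usable counts are slightly lower than the raw field sizes.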
Challenges with Traditional VXLAN Deployments
Scale and mobility limitations
Limited scale: flood-and-learn (BUM); inefficient bandwidth utilization; resource intensive; large MAC tables.
Limited workload mobility: centralized gateways; traffic hairpinning; sub-optimal traffic flow.
A barrier to scaling out large data centers and cloud deployments.
[Diagram: VXLAN overlay connecting five VTEPs]

Next-Gen VXLAN Fabric with BGP-EVPN Control Plane
Delivering multi-tenancy and seamless host mobility at cloud scale
Mature solution: shipping since 2015.
Interoperable: standards-based BGP-EVPN VXLAN.
Increased scale: eliminates flooding; conversational learning; policy-based updates.
Optimized mobility: distributed anycast gateway; integrated routing and bridging; vPC and ECMP.
Breaking the traditional VXLAN scale barriers.
[Diagram: BGP-EVPN VXLAN overlay; five VTEPs as BGP peers of two route reflectors]

Flood-&-Learn vs. EVPN Control Plane
Feature | Flood-&-Learn | EVPN Control Plane
Overlay services | L2+L3 | L2+L3
Underlay network | IP network with ECMP | IP network with ECMP
Encapsulation | MAC in UDP | MAC in UDP
Peer discovery | Data-driven flood-&-learn | MP-BGP
Peer authentication | Not available | MP-BGP
Host route learning | Local hosts: data-driven flood-&-learn; remote hosts: data-driven flood-&-learn | Local hosts: data-driven; remote hosts: MP-BGP
Host route distribution | No route distribution | MP-BGP
L2/L3 unicast forwarding | Unicast encap | Unicast encap
BUM traffic forwarding | Multicast replication; unicast/ingress replication | Multicast replication; unicast/ingress replication

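The "MAC in UDP" encapsulation listed above is the same in both modes. As a rough illustration, the sketch below builds and parses the 8-byte VXLAN header as laid out in RFC 7348, which a VTEP places in front of the original Ethernet frame inside a UDP datagram. The function names are hypothetical, and the outer Ethernet, IP, and UDP headers are left out.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid


def build_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VXLAN header: flags(8) reserved(24) VNI(24) reserved(8)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking the I flag."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prefix the original Ethernet frame with a VXLAN header (UDP payload only)."""
    return build_vxlan_header(vni) + inner_frame


payload = encapsulate(b"\x00" * 14, vni=5000)
print(payload[:8].hex(), parse_vni(payload))
```

The control plane (flood-and-learn or MP-BGP EVPN) only decides which remote VTEP address goes into the outer IP header; the header above does not change.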
Design
Legend: V = VTEP; RR = EVPN route reflector; RP = rendezvous point (underlay)
[Diagram: VXLAN/EVPN fabric; two spines, each acting as RR and RP; VTEPs on Leaf, Service Leaf, Border Leaf1, and Border Leaf2; WAN]

Layer 3 Multi-Tenancy
[Diagram: VRF 1, VRF 2, and VRF 3 carried over one VXLAN overlay between five VTEPs]

Distributed Anycast Gateway: Host Mobility
Host 1 is in VLAN 10, VNI 5000. Every leaf hosts the SVIs and peers with the route reflectors (R).
Host H1 moves to leaf switch 3 (L3).
L3 detects Host 1 and advertises H1 with the updated sequence number 1.
L1 sees the more recent update and withdraws its route.
MP-BGP EVPN UPDATE: NLRI: host MAC1, IP1; NVE IP 3; VNI 5000. Extended communities: encapsulation VXLAN; sequence 1.
[Diagram: leaf L1 (IP1) and leaf L3 (IP3) with SVIs, connected to the route reflectors]

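The move sequence above can be sketched as a small route table in which the advertisement with the highest mobility sequence number wins. This is a simplified model, not the NX-OS implementation: the class and function names are hypothetical, the first advertisement is assumed to carry sequence 0, and the tie-break for equal sequence numbers is omitted.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class HostRoute:
    mac: str
    ip: str
    vni: int
    nve: str  # VTEP that advertises the host
    seq: int  # MAC mobility sequence number


class EvpnTable:
    """Best host route per (VNI, MAC); a higher sequence number replaces a lower one."""

    def __init__(self) -> None:
        self.best: Dict[Tuple[int, str], HostRoute] = {}

    def advertise(self, route: HostRoute) -> bool:
        key = (route.vni, route.mac)
        current = self.best.get(key)
        if current is None or route.seq > current.seq:
            self.best[key] = route  # the previous owner withdraws its route
            return True
        return False  # stale update, ignored


def host_moved(table: EvpnTable, mac: str, vni: int, new_nve: str) -> HostRoute:
    """The new leaf detects the host and re-advertises it with seq + 1."""
    old = table.best[(vni, mac)]
    new = HostRoute(old.mac, old.ip, old.vni, new_nve, old.seq + 1)
    table.advertise(new)
    return new


table = EvpnTable()
table.advertise(HostRoute("MAC1", "IP1", 5000, nve="L1", seq=0))
moved = host_moved(table, "MAC1", 5000, new_nve="L3")
print(moved)
```

Because the anycast gateway SVI exists on every leaf, the host keeps the same default gateway after the move; only the advertised NVE changes.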
PBR Support for VXLAN BGP EVPN Fabric
Solution shipping.
Redirect Layer 3 traffic based on the 5-tuple.
Service redirection to load balancers and firewalls.
Next hop can be IPv4/IPv6; hosts in a VRF behind a VTEP.
Hosts: Host A 192.168.10.101; Host B 192.168.20.102.
PBR table:
From | To | Next-Hop
Host A | Host B | Firewall
Host B | Host A | Firewall
[Diagram: VXLAN EVPN fabric, VRF Tenant1; two border nodes; four VTEPs with bare-metal hosts and a firewall]

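The PBR table above can be modeled as an ordered list of 5-tuple match rules, each with a redirect next hop. The sketch below is only an illustration of the lookup logic, not switch configuration: the type names are hypothetical, `None` means "match any", and `"firewall"` stands in for the firewall address, which the slide does not give.

```python
from typing import List, NamedTuple, Optional


class Flow(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int


class PbrRule(NamedTuple):
    next_hop: str
    src_ip: Optional[str] = None    # None matches any value
    dst_ip: Optional[str] = None
    protocol: Optional[str] = None
    src_port: Optional[int] = None
    dst_port: Optional[int] = None

    def matches(self, flow: Flow) -> bool:
        wanted = (self.src_ip, self.dst_ip, self.protocol,
                  self.src_port, self.dst_port)
        return all(w is None or w == actual for w, actual in zip(wanted, flow))


HOST_A = "192.168.10.101"
HOST_B = "192.168.20.102"
FIREWALL = "firewall"  # placeholder: no firewall address appears on the slide

RULES: List[PbrRule] = [
    PbrRule(next_hop=FIREWALL, src_ip=HOST_A, dst_ip=HOST_B),
    PbrRule(next_hop=FIREWALL, src_ip=HOST_B, dst_ip=HOST_A),
]


def pbr_next_hop(rules: List[PbrRule], flow: Flow) -> Optional[str]:
    """Return the redirect next hop of the first matching rule, else None."""
    for rule in rules:
        if rule.matches(flow):
            return rule.next_hop
    return None  # no match: normal routing applies


print(pbr_next_hop(RULES, Flow(HOST_A, HOST_B, "tcp", 40000, 80)))
```

Redirecting both directions to the same firewall, as in the slide's table, keeps the flow symmetric so a stateful device sees both halves of the conversation.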
Centralized Route Leaking
Extranet support. Solution release: 7.0(3)I7(1).
Use cases: shared services; external connectivity.
VRF to VRF, or VRF to default.
Centralized location for leaking routes.
Hosts: Host A 192.168.10.101; Host B 192.168.20.102; Host C 192.168.30.103.
[Diagram: VXLAN EVPN fabric with VRF Tenant1 and VRF Tenant2; four VTEPs with bare-metal hosts; two border nodes; external network]

Tenant Routed Multicast
Release: 7.0(3)I7(1). Hardware: Cisco ASIC EX/FX.
Sources: SRC-10 10.10.10.100, group 239.10.10.10; SRC-99 10.30.30.199, group 239.10.10.99.
Receivers: RCVR-10 10.10.10.10; RCVR-11 10.10.10.11; RCVR-20 10.20.20.20; RCVR-30 10.30.30.30; RCVR-40 10.40.40.40.
[Diagram: VXLAN EVPN fabric, VRF Tenant1; two spines; four VTEPs, each marked DR, with bare-metal hosts attached]

VXLAN Operations, Administration & Management (OAM)
Delivering carrier-grade VXLAN manageability
Solution: ping, traceroute, pathtrace.
End-point locator / trace: locate an end host given segment and device-ID information.
Ping / path MTU: check liveness of an end host; option to specify payload parameters.
Traceroute / pathtrace: trace paths to a host and tunnel endpoint; get path, interface, and error statistics along the route.
Proactive monitoring: proactive ping with threshold and state notifications; specify payload parameters for path selection.
[Diagram: BGP-EVPN VXLAN overlay; five VTEPs as BGP peers]

VXLAN EVPN Multi-Site
Release: 7.0(3)I7(1). Cisco ASIC advantage: EX/FX.
Scale through hierarchical forwarding.
Fault containment.
Convergence independent of network size.
Separate admin domains.
Single box.
[Diagram: Site 1 and Site 2, each with its own VXLAN tunnels and border gateways, joined by a Multi-Site overlay]

Multi-Site Overlay Data Plane
Intra-site VXLAN data plane inside VXLAN EVPN Site1 and Site2.
Inter-site VXLAN data plane across the DC core / DCI fabric (Layer 3 unicast).
De-capsulation and re-encapsulation on the BGW at each site.
BGW addresses shown: VIP1, VIP2, ...; 10.1.1.111; 10.2.2.222.
Hosts: Host1 0000.3010.1101, 192.168.10.101; Host2 0000.3020.2101, 192.168.20.101; Host3 0000.3010.1102, 192.168.10.102.
[Diagram: two sites, each with spines, leaf VTEPs, and BGWs, connected through the DC core]

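The de-capsulation and re-encapsulation on the BGWs can be pictured as a rewrite of the outer header at each site boundary while the inner frame passes through unchanged. The sketch below is a simplified model under stated assumptions: 10.1.1.111 and 10.2.2.222 are taken to be the Site1 and Site2 BGW virtual IPs, the leaf VTEP addresses 10.1.1.1 and 10.2.2.1 are invented for illustration, and the VNI is kept constant although a real BGW may also remap it.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VxlanPacket:
    outer_src: str      # sending VTEP address
    outer_dst: str      # receiving VTEP address
    vni: int
    inner_frame: bytes  # original Ethernet frame, never modified


def bgw_reencapsulate(packet: VxlanPacket, bgw_vip: str, next_dst: str) -> VxlanPacket:
    """Model a BGW: strip the outer header, then re-encapsulate with its own VIP as source."""
    return replace(packet, outer_src=bgw_vip, outer_dst=next_dst)


SITE1_BGW_VIP = "10.1.1.111"  # assumed to be the Site1 BGW virtual IP
SITE2_BGW_VIP = "10.2.2.222"  # assumed to be the Site2 BGW virtual IP
SITE1_LEAF = "10.1.1.1"       # hypothetical leaf VTEP address
SITE2_LEAF = "10.2.2.1"       # hypothetical leaf VTEP address

frame = b"host1-to-host3"

# Hop 1: intra-site tunnel from the source leaf to the local BGW.
hop1 = VxlanPacket(SITE1_LEAF, SITE1_BGW_VIP, vni=30010, inner_frame=frame)
# Hop 2: inter-site tunnel between the two BGW virtual IPs.
hop2 = bgw_reencapsulate(hop1, SITE1_BGW_VIP, SITE2_BGW_VIP)
# Hop 3: intra-site tunnel from the remote BGW to the destination leaf.
hop3 = bgw_reencapsulate(hop2, SITE2_BGW_VIP, SITE2_LEAF)

for hop in (hop1, hop2, hop3):
    print(hop.outer_src, "->", hop.outer_dst)
```

Leaf VTEPs in one site only ever see the BGW addresses as tunnel endpoints for remote hosts, which is what keeps the number of VXLAN peers per leaf independent of the size of the other sites.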
Datacenter solutions
Architectures: Application Centric Infrastructure; Programmable Fabric; Classic Ethernet
Application Centric Infrastructure: turnkey integrated solution with security, centralized management, compliance, and scale; automated application-centric policy model with embedded security; broad and deep ecosystem. Audience: commercial, enterprises, public sector, hosters.
Programmable Fabric and Classic Ethernet: VXLAN BGP EVPN, standards-based; Cisco DCNM automation; modern NX-OS with enhanced NX-APIs; automation ecosystem (Puppet, Chef, Ansible, etc.); common NX-API across N2K-N9K. Audience: service providers / DIY fabric; classic datacenters.
[Diagram: application tiers Web, App, DB]

Interconnecting ACI Networks: Deployment Options
Single APIC cluster / single fabric: Stretched Fabric; Multi-Pod (from the 2.0 release).
Multiple APIC clusters / multiple fabrics: Multi-Fabric (with L2 and L3 DCI); Multi-Site (August 2017).
[Diagram: Stretched Fabric: DC1 and DC2 in one ACI fabric with one APIC cluster. Multi-Fabric: ACI Fabric 1 and ACI Fabric 2 with an inter-site app over L2/L3 DCI. Multi-Pod: Pod A to Pod n over an IPN with MP-BGP EVPN and one APIC cluster. Multi-Site: Site A to Site n over IP with MP-BGP EVPN, managed by ACI Multi-Site.]

ACI Multi-Site Overview
Separate ACI fabrics with independent APIC clusters.
Each fabric is considered a different availability zone.
Scoping of configuration changes.
Support for DR and active/active use cases.
ACI Multi-Site (REST API, GUI) pushes cross-fabric configuration to multiple APIC clusters.
MP-BGP EVPN control plane between sites.
Data plane VXLAN encapsulation across sites.
End-to-end policy definition and enforcement.
[Diagram: Site 1 (Availability Zone A) through Site n (Availability Zone B), connected over an IP network with MP-BGP EVPN]

ACI Multi-Site Namespace Normalization
Packet layout: VTEP IP | VNID | Class-ID | tenant packet.
Leaf to leaf, within a fabric: VTEP and Class-ID are local to the fabric.
Site to site VTEP traffic: VTEPs, VNID, and Class-ID are mapped on the spine.
Translation of VTEP, GBP-ID, and VNID at each site (scoping of namespaces).
Separate namespaces are maintained, with ID translation performed on the spine nodes.
This functionality requires specific hardware support.
[Diagram: Site 1 through Site n connected over an IP network with MP-BGP EVPN, managed by ACI Multi-Site]

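The ID translation performed on the spine nodes can be thought of as three lookup tables applied to the outer fields of a packet arriving from another site. The sketch below illustrates that idea only; the class names and every numeric value are hypothetical, and real spines perform this in hardware using tables programmed from the control plane.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class OverlayIds:
    vtep_ip: str   # source VTEP address
    vnid: int      # VXLAN network identifier
    class_id: int  # source EPG class ID carried in the header


class SpineTranslator:
    """Maps a remote site's identifiers into the local site's namespace."""

    def __init__(self, vtep_map: Dict[str, str], vnid_map: Dict[int, int],
                 class_id_map: Dict[int, int]) -> None:
        self.vtep_map = vtep_map
        self.vnid_map = vnid_map
        self.class_id_map = class_id_map

    def to_local(self, remote: OverlayIds) -> OverlayIds:
        # A missing entry raises KeyError: no translation has been programmed.
        return OverlayIds(
            vtep_ip=self.vtep_map[remote.vtep_ip],
            vnid=self.vnid_map[remote.vnid],
            class_id=self.class_id_map[remote.class_id],
        )


# Hypothetical tables on a Site 1 spine for traffic arriving from Site n.
site1_spine = SpineTranslator(
    vtep_map={"172.16.2.1": "172.16.1.100"},
    vnid_map={16000: 15000},
    class_id_map={32770: 49153},
)

local_ids = site1_spine.to_local(OverlayIds("172.16.2.1", 16000, 32770))
print(local_ids)
```

Keeping the namespaces separate means each APIC cluster can allocate VNIDs and class IDs independently, and only the spines at the site boundary need to know the mapping.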
ACI Multi-Site Hardware Requirements
Supports all ACI leaf switches (NS, -E, -EX, and -FX).
Only spine nodes with EX line cards (or newer) can connect to the inter-site network (requires FM-E fabric modules).
Only a subset of the spines needs to connect to the IP network.
The new FX non-modular spine (64 ports) will be supported in the Q4 CY17 timeframe.
1st-generation spines (including 9336PQ) are not supported, but can still be used for intra-site leaf-to-leaf communication.
[Diagram: 1st Gen and -EX spines; only the -EX spines connect to the IP network]

ACI Multi-Pod and ACI Multi-Site: What Option to Choose?
Multi-Pod: operational simplicity; feature richness across pods; lower number of APIC nodes; single VMM across pods.
Multi-Site: change domain isolation; fabric nodes scale; high latency across sites; no multicast required in the IP network.
[Diagram: Multi-Pod: Pod A to Pod n over an IPN with one APIC cluster. Multi-Site: Site A to Site n over IP, managed by ACI Multi-Site.]

ACI Anywhere
Any workload, any location, any cloud
Remote Pod; Multi-Pod / Multi-Site; hybrid cloud extension.
Security everywhere; analytics everywhere; policy everywhere.
[Diagram: remote location, on-premise, and public cloud connected over IP WAN]

Migrating to ACI: extend your network via Layer 2
Start small with ACI, then grow.
Extend the VLAN/EPG: VLAN 2143 in the existing network is mapped to an EPG through static EPG path mapping.
Existing app: single policy group (one extended EPG).
Leverage vPC for the interconnect.
BPDU should be enabled on the interconnect ports in the vPC domain.
[Diagram: existing network with HSRP gateway 1.1.1.1 and VLAN 2143; ACI EPG and BD; endpoints 1.1.1.3, 1.1.1.5, 1.1.1.7, 1.1.1.99; VLAN 10]

SPRITE tenant: migrated
Tenants: common; SPRITE.
VRFs: common; VRF_SPRITE.
BD: VLAN200; IP routing: yes, 10.101.9.1/24.
ANP: WEBAPP. EPG: SHARED.
Outside: L3out; web client 10.99.2.202.
Endpoints: WEB_Frontend 10.101.9.10; MySQL DB 10.101.9.2.
[Diagram: three APICs and vCenter]

SPRITE tenant: secured database
Tenants: common; SPRITE.
VRFs: common; VRF_SPRITE (Preferred-Group=Enabled).
BD: VLAN200; IP routing: yes, 10.101.9.1/24.
ANP: WEBAPP. EPGs: SHARED; MYSQL.
Outside: L3out; web client 10.99.2.202.
Endpoints: WEB_Frontend 10.101.9.10; MySQL DB 10.101.9.2; tcp/3306 between them.
[Diagram: three APICs and vCenter]

SPRITE tenant: secured
Tenants: common; SPRITE.
VRFs: common; VRF_SPRITE (Preferred-Group=Enabled).
BD: VLAN200; IP routing: yes, 10.101.9.1/24.
ANP: WEBAPP. EPGs: WEBSRV; MYSQL; SHARED.
Outside: L3out; web client 10.99.2.202.
Endpoints: WEB_Frontend 10.101.9.10; MySQL DB 10.101.9.2; tcp/3306 between them.
[Diagram: three APICs and vCenter]

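The secured SPRITE tenant restricts traffic between the web and database endpoints to tcp/3306. A minimal sketch of that allow-list behavior follows, assuming WEB_Frontend sits in EPG WEBSRV and MySQL DB in EPG MYSQL (the slides do not state the membership explicitly). The type names are hypothetical, and the preferred group setting, return traffic, and the L3out are not modeled.

```python
from typing import List, NamedTuple


class Contract(NamedTuple):
    consumer_epg: str
    provider_epg: str
    protocol: str
    dst_port: int


# Assumed endpoint-to-EPG membership, inferred from the slide labels.
ENDPOINT_EPG = {
    "10.101.9.10": "WEBSRV",  # WEB_Frontend
    "10.101.9.2": "MYSQL",    # MySQL DB
}

CONTRACTS: List[Contract] = [
    Contract("WEBSRV", "MYSQL", "tcp", 3306),
]


def is_permitted(src_ip: str, dst_ip: str, protocol: str, dst_port: int) -> bool:
    """Allow-list check: traffic between EPGs needs a matching contract."""
    src_epg = ENDPOINT_EPG.get(src_ip)
    dst_epg = ENDPOINT_EPG.get(dst_ip)
    if src_epg is None or dst_epg is None:
        return False  # unknown endpoint: not classified into any EPG
    if src_epg == dst_epg:
        return True   # intra-EPG traffic is permitted by default
    return any(
        c.consumer_epg == src_epg and c.provider_epg == dst_epg
        and c.protocol == protocol and c.dst_port == dst_port
        for c in CONTRACTS
    )


print(is_permitted("10.101.9.10", "10.101.9.2", "tcp", 3306))
print(is_permitted("10.101.9.10", "10.101.9.2", "tcp", 22))
```

Compared with the migrated state, where both endpoints share one EPG and can talk freely, splitting them into separate EPGs makes the database reachable only on the port the contract names.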
Cisco Data Centre Networking Architectures
Application Centric Infrastructure: VXLAN-based forwarding, multi-tenancy, and security; turnkey integrated controller with enhanced APIs; agnostic hypervisor integration; wide deployment scenarios.
Programmable Fabric: standards-based VXLAN BGP EVPN forwarding and multi-tenancy; Multi-Site support; Open NX-OS; DCNM automation (option).
Classic Ethernet: BYO network, scales well; simple and well-known deployment; Open NX-OS; enhanced APIs and automation ecosystem (DevOps).
[Diagram: application tiers Web, App, DB]