High Availability in Campus Networks
For the German Airheads Community
July 4, 2017, Tino H. Seifert, System Engineer, Aruba
Differences between Campus Edge and Campus Core

Campus Edge
- In many cases no high availability is needed
- If high availability is needed, there is always a direct connection to the end device
- In almost all cases LACP is the only available redundancy option
- Even in high-availability deployments, occasional downtime is acceptable

Campus Core
- In most cases high availability is needed
- In most cases there is no direct access to the end device
- Many different high-availability options
- The number of networks with a zero-downtime requirement is growing

Campus Edge protocols are also valid on the Core.
High Availability at the Edge: A Short Overview
STP with VRRP
- Aggregation/Core: one switch is VRRP master, the other VRRP backup
- Configuration-intensive: switch-by-switch configuration and management
- STP prevents loops, but blocks paths, reducing effective bandwidth
- Slow reconvergence impacts business, applications and users

[Diagram legend, used on the following slides as well: configuration effort, operational complexity, convergence time, user/app experience; active vs. blocked/standby links]
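The master/backup roles on the aggregation pair come from the VRRP election. A minimal sketch of that election logic (RFC 5798 semantics: highest priority wins, higher interface IP breaks ties); the router names, priorities and addresses are illustrative, not from the slide:

```python
from dataclasses import dataclass

@dataclass
class VrrpRouter:
    name: str
    priority: int  # 1-254 configurable; 255 is reserved for the address owner
    ip: str        # primary interface IP, used as the tie-breaker

def elect_master(routers):
    """Highest priority wins; on a priority tie, the numerically higher IP wins."""
    return max(routers, key=lambda r: (r.priority, tuple(int(o) for o in r.ip.split("."))))

agg1 = VrrpRouter("agg-1", priority=150, ip="10.0.0.2")
agg2 = VrrpRouter("agg-2", priority=100, ip="10.0.0.3")
print(elect_master([agg1, agg2]).name)  # agg-1 becomes VRRP master
```

If agg-1 fails, the same election over the remaining routers makes agg-2 master, which is what gives the pair its first-hop redundancy.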
VRRP without STP
- Aggregation/Core: VRRP master and VRRP backup; each access switch carries its own set of VLANs (e.g. VLAN 1, VLAN 2, VLAN 3)
- Medium configuration effort: switch-by-switch configuration and management, but no STP configuration
- No STP needed, as each set of VLANs lives on only one access switch
- Fast convergence, decided by VRRP
Access switch stacking

Backplane stacking
- Aruba 2930M Switch Series: ring topology, up to 10 units
- Aruba 3810 Switch Series: full-mesh topology up to 5 units, ring up to 10 units

Front-plane stacking
- Aruba 5400R Switch Series: up to 2 chassis, V3 modules only, up to 8 physical links per VSF link; VSF ports are 10GbE or 40GbE port aggregations
- Aruba 2930F Switch Series: front-plane stacking up to 4 units; VSF ports are 1GbE on SFP uplink models and 10GbE on SFP+ uplink models
IRF/VSF: simpler, resilient, high-performance networks
- Vastly simplifies L2/L3 design, configuration and management
- Link aggregation adds resilience and network bandwidth
- Fast reconvergence enhances business continuity and user experience
- 40G server bandwidth
DRNI: Distributed Resilient Network Interconnect
- Based on DRNI as defined in the IEEE 802.1AX specification
- DR peer devices exchange information through DRCP (Distributed Relay Control Protocol)
- Forwards traffic locally
- Supports only two DR peer devices
- Supports only one IPL (Intra-Portal Link), which must be an aggregation interface
- 40G server bandwidth
High Availability at the Core: A Short Overview
IRF/VSF: simpler, resilient, high-performance networks
- Very easy to configure
- Link aggregation adds resilience and network bandwidth
- Fast reconvergence enhances business continuity and user experience
- BUT: what happens in the case of a software update? ISSU may help, but realistically ZERO downtime is not possible
- 40G server bandwidth
HP IRF: Traditional Layer 2 Fabric

Pros
- Mature technology
- Simple topology (single IP for management, peering, etc.)
- Simplified configuration file

Cons
- Proprietary
- ISSU not always available
- Hash algorithms not perfect
- Scalability
- Single IRF core = single point of failure?
CLOS Fabric
- A CLOS (physical) network architecture provides an edge/core multi-tier design
- Each leaf switch is connected to all spine switches
- Customers may choose to deploy a 2-spine fabric (2 x 40G uplinks) and expand to 4+ spines (4 x 40G uplinks or more) when they require additional bandwidth
- Protocol-independent (STP/TRILL/SPB/L3) over the physical fabric
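The scaling argument above can be sketched in a few lines: because every leaf connects to every spine, each spine contributes one equal-cost path between any two leaves, so bandwidth grows linearly with the spine count. The 40G link speed follows the slide; the helper function is illustrative:

```python
def leaf_to_leaf(spines, link_gbps=40):
    """Return (ECMP paths, aggregate leaf-to-leaf uplink bandwidth in Gbps).

    In a CLOS fabric each leaf has one uplink per spine, so the number of
    equal-cost leaf-to-leaf paths equals the number of spines.
    """
    return spines, spines * link_gbps

print(leaf_to_leaf(2))  # (2, 80): 2-spine fabric, 2 x 40G uplinks per leaf
print(leaf_to_leaf(4))  # (4, 160): expanded to 4 spines for more bandwidth
```

This is also why spine expansion is non-disruptive: adding a spine only adds paths, it never forces a redesign of the leaf tier.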
L3 Fabric with distributed control plane (OSPF/IS-IS/BGP)
- An L3 core removes STP while still providing end-to-end connectivity
- Very high resiliency, but only possible if VLANs don't need to be distributed to multiple leaf nodes
- In practice, almost always someone wants to have at least one VLAN on multiple leafs
- Topology: spine = independent function, leaf = IRF pair, 10G/40G interconnects, L3 network running OSPF/IS-IS/BGP
L2 Fabric with distributed control plane (TRILL/SPB)
- TRILL/SPB removes STP while still providing a loop-free L2 network for east/west traffic
- Distributed control plane (no single point of failure), but lack of control-plane interoperability
- Architecture-neutral (leaf-to-leaf or spine-leaf)
- Topology: spine = independent function, leaf = IRF pair, 10G/40G interconnects, L2 TRILL/SPB fabric over an IS-IS L3 network
VXLAN for Campus Networks: A Short Excursion
VXLAN and Overlay Networking: Introduction
- Virtual Extensible Local Area Network (VXLAN) is a network encapsulation mechanism first introduced in 2011
- Supports up to 16 million virtual overlay tunnels over a physical layer 2/3 underlay network, for L2 network connectivity and multi-tenancy
- VXLAN allows traffic to be load-shared across multiple equal-cost paths
- Supports both intra-campus and inter-campus deployment scenarios
- A VXLAN-capable device is called a VXLAN Tunnel End Point (VTEP)

[Diagram: two data centers, each with hypervisors on an L2 or L3 physical underlay; virtual overlay VXLAN tunnels run intra-DC and are extended inter-DC over an L3 WAN]
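The 16-million figure comes directly from the encapsulation format: the VXLAN header (RFC 7348) carries a 24-bit VXLAN Network Identifier (VNI). A minimal sketch that builds just the 8-byte header; the VNI value reuses the 45501 from a later slide:

```python
import struct

def vxlan_header(vni):
    """Build the 8-byte VXLAN header: flags byte (I bit set, meaning the
    VNI field is valid), 24 reserved bits, 24-bit VNI, 8 reserved bits."""
    assert 0 <= vni < 2 ** 24  # 24-bit VNI -> 16,777,216 possible segments
    return struct.pack("!II", 0x08 << 24, vni << 8)

hdr = vxlan_header(45501)
print(len(hdr))  # 8 -- VXLAN adds this header on top of an outer UDP/IP packet
```

In a real frame this header sits inside an outer UDP/IP envelope between the two VTEPs, which is what lets the L2 payload ride over a routed underlay and be load-shared across equal-cost paths.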
VXLAN Deployment with Centralized Control Plane
- VXLAN with a centralized control plane (e.g. DCN VSC, IMC with HPE switches)
- The controller is typically a VM or an application installed on a server and includes clustering capabilities for high availability (HA)
- OVSDB and NETCONF are examples of protocols used between the controller and the VTEPs to set up/tear down VXLAN tunnels and share MAC addresses

[Diagram: a network virtualization controller speaks OVSDB/NETCONF to a software VTEP A in rack 1 and a hardware VTEP B in rack 100; VM traffic (VM1, 10.0.0.2/24) is bridged via the overlay VXLAN tunnel across the L3 underlay (172.16.10.0/24) to a bare-metal server (10.0.0.6/24) behind physical routers and firewalls]
VXLAN Deployment without Centralized Control Plane
- VXLAN without a centralized control plane (e.g. HPE Comware switches)
- VXLAN tunnels can be set up manually (CLI) or dynamically (MP-BGP EVPN)
- The CLI implementation is vendor-proprietary; don't expect interoperability
- EVPN is standardized

[Diagram: traffic between segments travels via the overlay VXLAN tunnel between hardware VTEP A in rack 1 (VM1, 10.0.0.2/24) and hardware VTEP B in rack 100 (bare-metal server 3, 10.0.0.6/24) across the L3 underlay (172.16.10.0/24)]
EVPN: MP-BGP as the VXLAN control plane

Traditional VXLAN MAC address auto-learning challenge
- Tunnel establishment and VXLAN VNI-to-tunnel mapping require manual configuration
- Traditional VXLAN only defines the encapsulation; MAC and ARP learning rely on data-plane flooding, which makes BUM packets flood throughout the entire fabric

Control plane: EVPN (RFC 7432)
- MP-BGP establishes the VXLAN tunnels between VTEPs and maps each VXLAN VNI to a tunnel
- MP-BGP synchronizes MAC+IP information between VTEPs, reducing broadcast traffic

Advantages: simple, automated, overlay ECMP, standards-based
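The difference between flood-and-learn and EVPN can be sketched as follows: instead of flooding unknown unicast, a VTEP advertises its locally learned MAC bindings to its peers (in the spirit of an EVPN Type-2 MAC/IP route over MP-BGP), so remote VTEPs populate their tables without BUM traffic. Class and method names are illustrative, not a real BGP implementation:

```python
class EvpnVtep:
    def __init__(self, name):
        self.name = name
        self.mac_table = {}  # MAC -> (remote VTEP, VNI)

    def advertise(self, mac, vni, peers):
        """Push a locally learned MAC binding to peer VTEPs, the way an
        EVPN Type-2 (MAC/IP) route would propagate over MP-BGP."""
        for peer in peers:
            peer.mac_table[mac] = (self.name, vni)

leaf1, leaf2 = EvpnVtep("leaf-1"), EvpnVtep("leaf-2")
leaf1.advertise("aa:bb:cc:00:00:01", vni=45501, peers=[leaf2])
print(leaf2.mac_table["aa:bb:cc:00:00:01"])  # ('leaf-1', 45501)
```

After the advertisement, leaf-2 forwards unicast for that MAC straight into the tunnel toward leaf-1 — no fabric-wide flooding needed.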
VXLAN architecture
- VXLAN VNIs 45501 and 45502 span Campus 1, Campus 2 and the Data Center over the IP LAN/MAN/WAN
- At each site a VTEP maps VLAN 501 and VLAN 502 into the corresponding VNIs
- Devices are mapped to a VLAN using MAC-based VLANs or MAC authentication
HA for Campus Interconnects: A Short Introduction
ADVPN automates secure connectivity: Campus WAN
- Simple: automated zero-touch deployment with IMC; 93 percent reduction in configuration steps
- Secure: flexible support for any IP WAN technology; standards-based IPsec encryption; secure data tunnels between branches
- Scalable: site-to-site performance for rich media; scales to over 30,000 sites
Secure Overlay with IPsec VPN
- Secure overlay set up over any access: MPLS or Internet (cable, DSL, 4G)
- Certified strong encryption such as FIPS
- PSK- or PKI-based VPN setup
- Routing overlay with OSPF and static routing
- Zero hub configuration when new spokes are added*
- Topology: MM/standby and VPNC at headquarters, MCs at the branches, IPsec tunnels over MPLS and Internet

* Roadmap
Context-Aware Path Selection
- Policy-based path selection based on applications, roles and VLANs (e.g. business-critical, latency-sensitive traffic such as Office 365, SAP, Lync/SfB and video vs. recreational traffic such as YouTube)
- Global load balancing based on: round robin (default), hash, session count, or uplink utilization
- Uplinks can be configured active-active or active-backup
- Added WAN compression capability
- Topology: MM/standby and VPNC at headquarters, MC at the branch, IPsec tunnels over MPLS and Internet
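The four load-balancing policies listed above can be sketched over the two uplinks from the slide (MPLS and Internet). The policy names follow the slide; the selection logic itself is an illustrative simplification, not the vendor implementation:

```python
import itertools
import zlib

UPLINKS = ["mpls", "internet"]
_round_robin = itertools.cycle(UPLINKS)

def select_uplink(policy, flow=None, sessions=None, utilization=None):
    if policy == "round-robin":         # default: alternate between uplinks
        return next(_round_robin)
    if policy == "hash":                # a given flow always takes the same uplink
        return UPLINKS[zlib.crc32(flow.encode()) % len(UPLINKS)]
    if policy == "session-count":       # pick the uplink carrying fewer sessions
        return min(UPLINKS, key=lambda u: sessions[u])
    if policy == "utilization":         # pick the least-utilized uplink
        return min(UPLINKS, key=lambda u: utilization[u])
    raise ValueError(f"unknown policy: {policy}")

print(select_uplink("session-count", sessions={"mpls": 120, "internet": 80}))  # internet
```

Hash-based selection is what keeps a latency-sensitive session (e.g. a Lync/SfB call) pinned to one path, while session count and utilization spread new flows toward the less loaded uplink.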
Recommendation

"Everything should be made as simple as possible, but not simpler."
Downtime on software upgrades
- Whenever some downtime is acceptable for software upgrades, use VSF/IRF!
- Very stable
- Configuration errors are less likely (always remember: most IT outages are caused by human error)
- Supports high-bandwidth requirements
- Doesn't conflict with other network protocols/functions
- 40G server bandwidth
No downtime at the Campus Core
- If REALLY NO DOWNTIME is possible for the Campus Core, use an L3 Campus Core
- Wherever possible, try to avoid an L2 overlay
- If an L2 overlay is needed, consider the number of VLANs: for a small number, configure the overlay manually
- In large-scale deployments, use EVPN
- 40G server bandwidth