HIGH AVAILABILITY DESIGN

Similar documents
Emerging MPLS OAM mechanisms

Techniques and Protocols for Improving Network Availability

HP MSR Router Series. MPLS Configuration Guide(V7) Part number: Software version: CMW710-R0106 Document version: 6PW

Multi Topology Routing Truman Boyes

Network Configuration Example

Introduction to Multi-Protocol Label

Cisco ASR 9000 Series High Availability: Continuous Network Operations

Implementing MPLS Label Distribution Protocol

Network Configuration Example

HP 5920 & 5900 Switch Series

Configuring OSPF network management 39 Enabling message logging 39 Enabling the advertisement and reception of opaque LSAs 40 Configuring OSPF to

Juniper Networks Certified Specialist Service Provider Routing and Switching Bootcamp, JIR, JSPX, JMF (JNCIS-SP BC)

Cisco Exam Deploying Cisco Service Provider Network Routing (SPROUTE) Version: 7.0 [ Total Questions: 130 ]

Network Configuration Example

Implementing MPLS Layer 3 VPNs

SYSC 5801 Protection and Restoration

Fast IP Convergence. Section 4. Period from when a topology change occurs, to the moment when all the routers have a consistent view of the network.

SR-TE On Demand LSP. The SR TE On demand LSP feature provides the ability to connect Metro access rings via a static route to the

LARGE SCALE IP ROUTING LECTURE BY SEBASTIAN GRAF

Segment Routing Commands

IS-IS Configuration Commands. Generic Commands. shutdown IS-IS XRS Routing Protocols Guide Page 533. Syntax [no] shutdown

EXAM - JN Service Provider Routing and Switching, Specialist (JNCIS-SP) Buy Full Product.

HP 5920 & 5900 Switch Series

Overview. Information About High Availability. Send document comments to CHAPTER

BrainDumps.4A0-103,230.Questions

High Availability for 2547 VPN Service

Introduction to routing

Multi-Chassis APS and Pseudowire Redundancy Interworking

Introduction to Segment Routing

Bidirectional Forwarding Detection (BFD) NANOG 39

internet technologies and standards

Testking.4A0-103,249.QA 4A Alcatel-Lucent Multi Protocol Label Switching

Configuring LDP with CLI

Logging neighbor state changes 38 Configuring OSPF network management 39 Enabling message logging 39 Enabling the advertisement and reception of

Network Configuration Example

IS-IS basic configuration 37 DIS election configuration 41 Configuring IS-IS route redistribution 45 IS-IS GR configuration example 49 IS-IS FRR

Fast convergence project

HP A-MSR Router Series MPLS. Configuration Guide. Abstract

MPLS in the DCN. Introduction CHAPTER

MPLS Traffic Engineering Traffic Protection using Fast Re-route (FRR)

Network-Level High Availability

HPE FlexNetwork 5510 HI Switch Series

Cisco CPT Packet Transport Module 4x10GE

Implementing MPLS Label Distribution Protocol

Configure SR-TE Policies

HP Routing Switch Series

Practice exam questions for the Nokia NRS II Composite Exam

Cisco ASR 9000 Series Aggregation Services Router MPLS Configuration Guide, Release 4.1

HP MSR Router Series. MPLS Configuration Guide(V5) Part number: Software version: CMW520-R2513 Document version: 6PW

MPLS VPN--Inter-AS Option AB

Cisco EXAM Cisco ADVDESIGN. Buy Full Product.

MPLS etc.. MPLS is not alone TEST. 26 April 2016 AN. Multi-Protocol Label Switching MPLS-TP FEC PBB-TE VPLS ISIS-TE MPƛS GMPLS SR RSVP-TE OSPF-TE PCEP

OSPF. Unless otherwise noted, OSPF refers to OSPFv2 throughout this document.

MPLS VPN Inter-AS Option AB

Configuring MPLS, MPLS VPN, MPLS OAM, and EoMPLS

LDP Fast Reroute using LDP Downstream On Demand. 1. Problem: 2. Summary: 3. Description:

Routed Fast Convergence and High Availability

IPv6 Switching: Provider Edge Router over MPLS

MPLS Core Networks Николай Милованов/Nikolay Milovanov

Routed Fast Convergence and High Availability

IP Routing BFD Configuration Guide, Cisco IOS Release 12.2SX

Internet Engineering Task Force (IETF)

MPLS Network Management MPLS Japan Ripin Checker, Product Manager, Cisco Systems

TELCO GROUP NETWORK. Rafał Jan Szarecki 23/10/2011

CONTENTS. Introduction

MPLS LDP Graceful Restart

Vendor: Juniper. Exam Code: JN Exam Name: Service Provider Routing and Switching Support, Professional. Version: Demo

Alcatel-Lucent 7705 SERVICE AGGREGATION ROUTER OS RELEASE 5.0 ROUTING PROTOCOLS GUIDE ROUTING PROTOCOLS GUIDE

Agenda DUAL STACK DEPLOYMENT. IPv6 Routing Deployment IGP. MP-BGP Deployment. OSPF ISIS Which one?

MPLS etc.. 9 May 2017 AN

Q&As. Deploying Cisco Service Provider Network Routing (SPROUTE) Pass Cisco Exam with 100% Guarantee

AToM (Any Transport over MPLS)

Junos OS Multiple Instances for Label Distribution Protocol Feature Guide Release 11.4 Published: Copyright 2011, Juniper Networks, Inc.

Multipoint LDP (mldp)

CertShiken という認定試験問題集の権威的な提供者. CertShiken.

Network Configuration Example

Configuring Cisco NSF with SSO Supervisor Engine Redundancy

HP A5820X & A5800 Switch Series MPLS. Configuration Guide. Abstract

CertifyMe. CertifyMe

Computer Network Architectures and Multimedia. Guy Leduc. Chapter 2 MPLS networks. Chapter 2: MPLS

Internetwork Expert s CCNP Bootcamp. Gateway Redundancy Protocols & High Availability. What is High Availability?

IP/LDP FAST PROTECTION SCHEMES

GMPLS Overview Generalized MPLS

H3C SR6600 Routers. High Availability. Configuration Guide. Hangzhou H3C Technologies Co., Ltd.

Company Overview. Network Overview

Configuring CRS-1 Series Virtual Interfaces

Multi-Dimensional Service Aware Management for End-to-End Carrier Ethernet Services By Peter Chahal

OSPF Convergence Details

Cisco Training - HD Telepresence MPLS: Implementing Cisco MPLS V3.0. Upcoming Dates. Course Description. Course Outline

IP Fast Reroute Applicability. Pierre Francois Institute IMDEA Networks

Free4Torrent. Free and valid exam torrent helps you to pass the exam with high score

Monitoring MPLS Services

Routing Protocols. Technology Description BGP CHAPTER

IPv6 Switching: Provider Edge Router over MPLS

Network Configuration Example

Configuring MPLS Transport Profile

LARGE SCALE IP ROUTING LECTURE BY SEBASTIAN GRAF

BraindumpsQA. IT Exam Study materials / Braindumps

JPexam. 最新の IT 認定試験資料のプロバイダ IT 認証であなたのキャリアを進めます

Network Configuration Example

Transcription:

IP/MPLS CORE HIGH AVAILABILITY DESIGN Mars Chen <email: marschen@juniper.net> 27 th January 2010

TODAY S IP NETWORK Is an infrastructure that supports mission critical services: VoIP/Mobility Converged data network services Business VPN Services Cloud Computing And Internet access services.. These carrier services typically have customer SLA s that must be supported 2 Copyright 2009 Juniper Networks, Inc. www.juniper.net

BUSINESS CASE FOR HIGH AVAILABILITY Cost Downtime Cost Availability Cost Availability 3 Copyright 2009 Juniper Networks, Inc. www.juniper.net

HIGH AVAILABILITY SOLUTION ARCHITECTURE Carrier-Class Availability Is a Culture, Not a Single Feature or Product Management process y Securit Dependable Networks Dependable Platforms ity Scalabil Hardware Software 4 Copyright 2009 Juniper Networks, Inc. www.juniper.net

PLATFORM HIGH AVAILABILITY 1. Hardware 2. Software 3. Control plane Nonstop forwarding ( NSF) : Graceful restart + GRES Nonstop routing 4. Virtualization Virtual Chassis 5 Copyright 2009 Juniper Networks, Inc. www.juniper.net

LOGICAL PLATFORM VIEW OF MODERN ROUTER Clean separation of routing and packet forwarding functions Consistent performance Stability Provider-class routing Routing Engine (RE) JUNOS software Packet Forwarding Engine (PFE) Processor-based design Interfaces FPC/PICs RE PFE Junos Software Update Internet I/O Card Forwarding Table Forwarding Table Switch Fabric Processor (s) I/O Card 6 Copyright 2009 Juniper Networks, Inc. www.juniper.net

ROUTING ENGINE REDUNDANCY GRACEFUL ROUTING ENGINE SWITCHOVER (GRES) Control plane and forwarding plane separation allows continuous packet forwarding during control plane failure GRES provides stateful replication between the master and backup REs Keepalives exchange info such as interfaces, and kernel. GRES allows fast switchover during RE failure Daemon 1 Daemon 2 Daemon 3 Daemo on 1 Kernel Daemo on 2 n 3 Daemon Daemo n Kernel Daemo on n Keepalive GRES alone is not enough Routing adjacencies broken during switchover Must be combined with either graceful restart protocol extensions or nonstop active routing Packet Forwarding Physical Interfaces 7 Copyright 2009 Juniper Networks, Inc. www.juniper.net

NONSTOP FORWARDING Dae mon n Daemon 1 Daemon 2 Daemon 3 Dae emon 1 Dae emon 2 Daemon Daem n mon 3 GRES + Graceful Restart Kernel e = NSF (Nonstop Forwarding) Kernel Keepalive Forwarding) Routing Adjacencies 8 Copyright 2009 Juniper Networks, Inc. www.juniper.net

GRACEFUL RESTART: OPERATIONS Separate control and dd data planes When router recovers, P P PE 2 neighbors sync up without disturbing forwarding. PE 1 If router s control plane fails, data plane can keep forwarding packets P Neighbors hide failure from all others routers in the network P PE 3 Other routers are never made aware of failure 9 Copyright 2009 Juniper Networks, Inc. www.juniper.net

GRACEFUL RESTART PROTOCOL DETAILS Purpose - Continue forwarding (PFE) during a restart of routing (RE) Changes IETF BGP Protocol extensions Graceful Restart Mechanism for BGP Per-peer configuration rfc4724 Various timers with configurable defaults OSPF Protocol extensions Graceful OSPF Restart New opaque-lsa type 9, rfc3623 Grace-LSA IS-IS MPLS RSVP Protocol extensions 3 new timers New re-start option (TLV) in IIH PDU Protocol Extensions Uses signaling as described in Graceful Restart Mechanism for BGP Protocol Extensions Extend rfc 3473 Recovery ERO Restart Signaling for ISIS rfc3847 Graceful Restart Mechanism for BGP with MPLS Rfc4781 Extensions to GMPLS RSVP Graceful Restart Rfc5063 10 Copyright 2009 Juniper Networks, Inc. www.juniper.net

GRACEFUL RESTART: LIMITATIONS Neighboring routers must understand GR procedures and messages Inhibits full GR implementation Particularly significant on PEs GR must stop if topology changes during grace period In some cases, cannot distinguish between link failure and control plane failure In some cases, routing re-convergence might exceed grace period For example, if there are hundreds of BGP peers Protocol interdependencies can slow re-convergence beyond grace period For example, if BGP depends on LDP restart completion, which depends on RSVP restart completion Operators acceptance of GR is not widespread 11 Copyright 2009 Juniper Networks, Inc. www.juniper.net

NON-STOP ACTIVE ROUTING Internal processes keep backup RE aware of protocol state and adjacency activities Individual routers assume sole responsibility for RE failure or switchover No need to run protocol extensions on neighbors Backup routing engine becomes hot standby Both REs run routing protocol processes Relies on GRES to preserve kernel and interface info 1 Daemon 2 Daemon Daemon 3 Dae emon 1 Kernel Dae emon 2 Routing Adjacencies Daemon emon n 3 Kernel Dae emon n switchover seamless to neighbors 12 Copyright 2009 Juniper Networks, Inc. www.juniper.net

IN SERVICE SOFTWARE UPGRADE Not every vendor extends GRES, GR and NSR support to ISSU software upgrade isn t merely a single point of software replacement. Ideally, a fully redundant system can eliminate any disruption during software upgrade Reality is, most systems today are not tfully redundant d Daemon 1 2 Daemon Daemon 3 Dae emon 1 Kernel Dae emon 2 Daemon n mon 3 Kernel Packet Forwarding Dae emon n Keepalive Physical Interfaces 13 Copyright 2009 Juniper Networks, Inc. www.juniper.net

TRUE ISSU Version X.0 Version Y.1 Upgrade Between Major Releases Upgrades entire software image Can be done to major or minor releases Routing Engine Da aemon 1 Da aemon 2 emon 3 Da Kernel Da aemon n Control Plane Forwarding Plane Packet Forwarding Physical Interfaces 14 Copyright 2009 Juniper Networks, Inc. www.juniper.net

VIRTUALIZATION VIRTUAL CHASSIS KEY ADVANTAGES No Dependency on Dedicated Connectivity Hardware Leverage existing Redundancy Mechanisms GRES/NSR, LAG (Aggregated-Ethernet). Operational efficiency Single control plane as visible externally. 0 RE 0 Failover transparent to external control 1 RE 1 entities (OSS, policy servers, AAA etc). No routing change as visible externally: 2 53 LC LC Inter-chassis link failover completely 4 LC contained within VC Failover not gated by routing re-convergence. 15 Copyright 2009 Juniper Networks, Inc. www.juniper.net

RELIABLE NETWORKS SDH (Layer 1) Ethernet t OAM ( Layer 2) Link bundling VRRP IGP fast convergence BFD VRRP MPLS (RSVP) Fast reroute IP/LDP fast reroute 16 Copyright 2009 Juniper Networks, Inc. www.juniper.net

SONET/SDH PROTECTION SWITCHING SONET APS & SDH MSP Redundant routers share uplink Rapid circuit failure recovery Used on router-to-adm to links Interoperable with standard ADM Working & protect circuits May reside on different routers May reside on same router ADM Protect 17 Copyright 2009 Juniper Networks, Inc. www.juniper.net

OAM LAYERS Connectivity Transport Layer Ensures two directly connected peers maintain bidirectional communication. Must detect link down failure and notify higher layer for protocol to re-route around the failure. Monitor link quality to ensure that performance meets an acceptable S i level. Connectivity Layer Monitor the communication path between two non-adjacent devices. Service Layer Measures and represents the status of the services as seen by the customer. Metrics such as throughput, round-trip delay, jitter need to be monitored in an effect to meet the Service Level Agreements (SLAs) contracted between the provider and the customer. Transport Service 18 Copyright 2009 Juniper Networks, Inc. www.juniper.net

ETHERNET OAM FAILURE ACTION Both 802.3ah and 802.1ag can mark the link down upon failure MPLS FRR can be triggered by link down indication Ethernet Ring Protection will also link down indication 19 Copyright 2009 Juniper Networks, Inc. www.juniper.net

LINK BUNDLING Reliable Links Link failure does not affect forwarding Load redistributed among other members Parallel Link Technologies MLPPP T1/E1 Link aggregation Multi-Link Frame Relay 802.3ad Ethernet aggregation SONET/SDH aggregation Simultaneous Physical Connections 20 Copyright 2009 Juniper Networks, Inc. www.juniper.net

IP DYNAMIC ROUTING New York San Francisco OSPF or IS-IS computes path If link or node fails, New path is computed Response times: Typically a few seconds Completion time: Typically a few minutes, but very dependant on topology 21 Copyright 2009 Juniper Networks, Inc. www.juniper.net

FASTER ROUTER CONVERGENCE Faster convergence improves Network Reliability Separation of control & forwarding planes is key Protocol expertise is key Features Benefits High Priority Flooding for Interested LSPs (ISIS) Quick SPF Scheduling (ISIS) Sub-second Hellos (ISIS) RIB and FIB Enhancements (BGP) Timer reduced from 100 to 20msec Faster propagation of major changes Reduces time from 7 sec to 50 msec Speeds calculation of optimum path Lowest Hello Time possible for IS-IS, 333msec Faster Link Failure Detection Indirect Next Hop implies faster convergence 22 Copyright 2009 Juniper Networks, Inc. www.juniper.net

WHAT IS BFD (BIDIRECTIONAL FORWARDING DETECTION )? Yet Another Hello Protocol ( but lightweight compared to OSPF/IS- IS.) Packets sent at regular intervals; neighbor failure detected when packets stop showing up Always unicast, even on shared media Not just for direct links; can be used over MPLS LSPs, multi-hop separated neighbors, unidirectional links. Simple packet format, very low processing overhead. Can be implemented in forwarding plane to the extent possible. 23 Copyright 2009 Juniper Networks, Inc. www.juniper.net

BFD ENHANCEMENTS Static route OSPF/ISIS ebgp Multihop ibgp BFD for MPLS OAM BFD provides LSP data plane verification. LSP-Ping verifies consistency between LSP control and data plane. BFD benefits: Lightweight. Scales to large number of LSPs Sub-second dfailure detection ti Periodic fault detection BFD over different types of LSPs: Point-to-point LSP (RSVP or LDP signaled) ECMP PE-PE awareness P2MP LSP L2 Pseudowire Can use VCCV-BFD and VCCV-Ping for Pseudowires 24 Copyright 2009 Juniper Networks, Inc. www.juniper.net

BFD VS ETHERNET OAM 802.1AG CC COMPARISON BFD 802.1ag CC Layer 3 continuity check. Better for Layer Ethernet Layer 2 continuity check. Better 3 and MPLS for Ethernet Use ping and traceroute for loopback and 802.1ag also supports loopback and trace functions linktrace Comparatively simpler configuration MEPs, MIPs, MD give more flexibility but make configuration complex. Working to simplify it. Runs on aggregate link but not on child Runs on aggregate and child links. Can links adjust aggregate bandwidth. Session down causes IGP, BGP to reroute Interface down causes protocols to reroute MPLS OAM: data plane failure detection MPLS OAM: interface down can trigger for RSVP and LDP LSPs, LSP switch FRR action, can trigger FRR 25 Copyright 2009 Juniper Networks, Inc. www.juniper.net

MPLS (RSVP BASED) PROTECTION Secondary LSP Secondary Standby LSP FAST REROUTE One-to-One Backup p( (aka 1:1 detour) using label swapping Facility Backup using label stacking Link Protection Protects only against link failure Node Protection Protects against both link failure and node forwarding plane failure LSR Detour Pi Primary 26 Copyright 2009 Juniper Networks, Inc. www.juniper.net

IP/LDP FAST REROUTE WHY? Some networks do not implement RSVP RSVP Requires PE to PE full mesh for transport plane protection ti RSVP N^2 Problem 200 PEs = app. 39800 LSPs + detours & bypasses Fixable using LSP hierarchy 27 Copyright 2009 Juniper Networks, Inc. www.juniper.net

LOOP-FREE ALTERNATES (LFA) Adds fast-reroute (FRR) capability to IS-IS, OSPF and LDP Normally only best nodal path is used for RIB walks Add a non-best (albeit loop free) path for backup purposes. How? Shared, common link state database Place the SPF root at your neighbors 28 Copyright 2009 Juniper Networks, Inc. www.juniper.net

SPF ROOTS & LFA ILLUSTRATED N1 2 R2 1 3 S 1 3 D 1 3 N2 3 R3 Main SPF Backup SPF 29 Copyright 2009 Juniper Networks, Inc. www.juniper.net

MANAGEMENT PROCESS CONTINUOUS SYSTEMS AVAILABILITY Planned Maintenance Hardware and software upgrades Human Factors Changes that negatively impact network performance Unplanned Events Minimize impact of human factors with failsafe mechanisms Prevent configuration errors and ensure compliance to configuration policies with candidate configurations, commit verifications, commit scripts Speed response and resolution to unplanned events Avert downtime with transparent failover, network recovery features Network failures, hardware events Provide proactive response and accelerate and software defects resolution with extensive instrumentation, event policies, op scripts Reduce time of planned events Stable releases reduce the frequency of fixes and duration of upgrades In-service upgrades available in high-end routing platforms 30 Copyright 2009 Juniper Networks, Inc. www.juniper.net

SCRIPT AUTOMATION Helps to Reduce Human Error Commit Scripts - Parse Configurations upon commit Generate Warnings Reject the commit Modify the configuration Op Scripts - Used to ensure compliance with company policy Generate notifications Automatic Diagnosis Correction of Problems Event E t Policies i - Monitors an event tti trigger Works with Op Scripts Wait for Notification messages 31 Copyright 2009 Juniper Networks, Inc. www.juniper.net

SCALABILITY/SECURITY Scalability Hardware : performance/next-hops/firewall filter Control plane : routes/peers/routing instances/logging. Security DDOS Attacks 32 Copyright 2009 Juniper Networks, Inc. www.juniper.net

SUMMARY High Availability: Is a culture Has many layers Is business critical 33 Copyright 2009 Juniper Networks, Inc. www.juniper.net