Techniques and Protocols for Improving Network Availability

Similar documents
SYSC 5801 Protection and Restoration

Minimizing Packet Loss

HIGH AVAILABILITY DESIGN

MPLS Traffic Engineering Traffic Protection using Fast Re-route (FRR)

Resilient IP Backbones. Debanjan Saha Tellium, Inc.

ENTERPRISE MPLS. Kireeti Kompella

Cisco ASR 9000 Series High Availability: Continuous Network Operations

High Availability for 2547 VPN Service

Juniper Networks Live-Live Technology

Introduction to Segment Routing

Emerging MPLS OAM mechanisms

BW Protection. 2002, Cisco Systems, Inc. All rights reserved.

MPLS Networks: Design and Routing Functions

Multi Topology Routing Truman Boyes

MPLS etc.. MPLS is not alone TEST. 26 April 2016 AN. Multi-Protocol Label Switching MPLS-TP FEC PBB-TE VPLS ISIS-TE MPƛS GMPLS SR RSVP-TE OSPF-TE PCEP

Converged Communication Networks

for Metropolitan Area Networks MPLS No. 106 Technology White Paper Abstract

IP/LDP FAST PROTECTION SCHEMES

Deploying MPLS & DiffServ

January 8, 2013 MPLS WG, IETF 85 MULTI-PATH RSVP-TE KIREETI KOMPELLA CONTRAIL SYSTEMS

Cisco Training - HD Telepresence MPLS: Implementing Cisco MPLS V3.0. Upcoming Dates. Course Description. Course Outline

LDP Fast Reroute using LDP Downstream On Demand. 1. Problem: 2. Summary: 3. Description:

Cisco EXAM Cisco ADVDESIGN. Buy Full Product.

LARGE SCALE IP ROUTING LECTURE BY SEBASTIAN GRAF

IP Fast Reroute Applicability. Pierre Francois Institute IMDEA Networks

Multi-Dimensional Service Aware Management for End-to-End Carrier Ethernet Services By Peter Chahal

Practice exam questions for the Nokia NRS II Composite Exam

Implementing MPLS Label Distribution Protocol

Broadband Loop Carrier Ethernet Protection Switching Using Ethernet to Create a Reliable B roadband Access Network F A Q

BrainDumps.4A0-103,230.Questions

Company Overview. Network Overview

Top-Down Network Design, Ch. 7: Selecting Switching and Routing Protocols. Top-Down Network Design. Selecting Switching and Routing Protocols

Configuration and Management of Networks. Pedro Amaral

THE MPLS JOURNEY FROM CONNECTIVITY TO FULL SERVICE NETWORKS. Sangeeta Anand Vice President Product Management Cisco Systems.

A Segment Routing (SR) Tutorial. R. Bonica NANOG70 June 6, 2017

The State of Traffic Engineering - an ISP's Perspective

LARGE SCALE IP ROUTING LECTURE BY SEBASTIAN GRAF

MPLS Egress Protection Framework draft-shen-mpls-egress-protectionframework-02

Hands-On Metro Ethernet Carrier Class Networks

Network Configuration Example

Fast convergence project

MPLS VPN--Inter-AS Option AB

Core Networks Evolution

The Role of the Path Computation El ement Centralized Controller in SDN & NFV

Service Providers Networks & Switching (MPLS) 20/11/2009. Local Team

OPTera Metro 8000 Services Switch

MPLS etc.. 9 May 2017 AN

Fabric Connect Multicast A Technology Overview. Ed Koehler - Director DSE. Avaya Networking Solutions Group

Introduction to IEEE 802.1Qca Path Control and Reservation

Segment Routing Commands

Ahmed Benallegue RMDCN workshop on the migration to IP/VPN 1/54

Computer Network Architectures and Multimedia. Guy Leduc. Chapter 2 MPLS networks. Chapter 2: MPLS

PIM-tunnels and MPLS P2MP as Multicast data plane in IPTV and MVPN. Lesson learned

Top-Down Network Design

"Charting the Course...

Chapter 7: Routing Dynamically. Routing & Switching

MPLS in the DCN. Introduction CHAPTER

IPv6 Switching: Provider Edge Router over MPLS

Configuring CRS-1 Series Virtual Interfaces

IPv6 Switching: Provider Edge Router over MPLS

MPLS Multi-Protocol Label Switching

CertShiken という認定試験問題集の権威的な提供者. CertShiken.

MPLS VPN Inter-AS Option AB

Configuring StackWise Virtual

HP A5820X & A5800 Switch Series MPLS. Configuration Guide. Abstract

Comparing Ethernet Network Fabric Technologies

Loop Free Alternate and Remote Loop Free Alternate IP Fast Reroute

OSPF Protocol Overview on page 187. OSPF Standards on page 188. OSPF Area Terminology on page 188. OSPF Routing Algorithm on page 190

GMPLS Overview Generalized MPLS

Detailed Analysis of ISIS Routing Protocol on the Qwest Backbone:

MPLS Network Management MPLS Japan Ripin Checker, Product Manager, Cisco Systems

Configuring MPLS, MPLS VPN, MPLS OAM, and EoMPLS

MPLS, THE BASICS CSE 6067, UIU. Multiprotocol Label Switching

Session 2: MPLS Traffic Engineering and Constraint-Based Routing (CR)

Routing Resiliency Latest Enhancements

Mission Critical MPLS in Utilities

6 MPLS Model User Guide

Multi-Protocol Lambda Switching for Packet, Lambda, and Fiber Network

Multi Protocol Label Switching

Multi-Chassis APS and Pseudowire Redundancy Interworking

Table of Contents 1 Multicast VPN Configuration 1-1

NEXT GENERATION BACKHAUL NETWORKS

Testking.4A0-103,249.QA 4A Alcatel-Lucent Multi Protocol Label Switching

IS-IS Configuration Commands. Generic Commands. shutdown IS-IS XRS Routing Protocols Guide Page 533. Syntax [no] shutdown

MPLS Traffic Engineering--Scalability Enhancements

Fast Reroute for Node Protection in LDP based LSPs

Virtualizing The Network For Fun and Profit. Building a Next-Generation Network Infrastructure using EVPN/VXLAN

QUESTION: 1 You have been asked to establish a design that will allow your company to migrate from a WAN service to a Layer 3 VPN service. In your des

NETWORK ARCHITECTURE

Unified MPLS for Multiple Applications Transport Profile

Configuring MPLS L2VPN

Reliable IPTV Transport Network. Dongmei Wang AT&T labs-research Florham Park, NJ

Multiprotocol Label Switching (MPLS) Traffic Engineering

MPLS-TE Configuration Application

MPLS Traffic Engineering Path Calculation and Setup Configuration Guide, Cisco IOS Release 12.2SX

Introduction to OSPF

BASIC MPLS - MPLS Traffic Engineering Network

Implementing VXLAN. Prerequisites for implementing VXLANs. Information about Implementing VXLAN

PREREQUISITES TARGET AUDIENCE. Length Days: 5

Egress Protection (draft-shen-mpls-egress-protection-framework) Presented by Krzysztof G. Szarkowicz NANOG71 October 4, 2017

Transcription:

Techniques and Protocols for Improving Network Availability Don Troshynski dtroshynski@avici.com February 26th, 2004

Outline of Talk The Problem Common Convergence Solutions An Advanced Solution: RAPID Increasing RAPID s coverage: U-turn Neighbors Applications of RAPID Summary Slide 2

The Profitability Problem: Best Effort IP Network = Limited Services Supports only best effort services due to reliability and stability limitations Low Margins Commodity pricing of undifferentiated best-effort services High Costs High CapEx and OpEx outlays Frequent outages and high customer service costs Slide 3

Who will Capture IP s True Value???? Service provider IP network today Unified Messaging VPNs/Teleworking Delivery of audio, video, software Voice over IP Interactive Gaming E-mail Web Browsing The IP Challenge: To increase market share and gross margins carriers need to deliver more than just pipe Slide 4

Network Disruptions are Daily Events Causes Router Failure Disruptive Operations (sw upgrades, configuration changes, ) Link Failure Service Impact Loss of traffic for 10s of seconds Disruption of Real-Time Services (voice calls, gaming sessions, video, ATM) Business Impact SLA Penalties Customer Service/Maintenance Issues Customer Churn Inability to support High-Margin Real-Time Services Slide 5

Traffic Convergence Goal: < 50 ms To support a multi-service network, need to minimize service interruption. Network Failures cause service interruption. Node Failure: Avoid disruption with Non-Stop Routing Link Failure: Minimize traffic loss during convergence. Traffic Convergence IGP Convergence: SPF provides the basis for all other protocols so must be very fast. BGP Convergence: Using forwarding-plane indirection to IGP next-hop allows traffic restoration for BGP learned destination before BGP re-computation occurs for many failure scenarios. LDP Convergence: Requires IGP SPF results to install new forwarding plane state. Slide 6

Outline of Talk The Problem Common Convergence Solutions An Advanced Solution: RAPID Increasing RAPID s coverage: U-turn Neighbors Applications of RAPID Summary Slide 7

Layer 1: Linear and Distributed APS Linear APS Comes in 1+1 and 1:N flavors Works at Line Layer Signaled in K1/K2 bytes Router Working Line Protect Line Distributed APS Like Linear APS, but two routers terminate the working and protect lines Failure of line, or even router, is protected Working Router Interconnect Protect Router Working Line Protect Line APS is costly to implement and therefore a targeted solution SONET ADM Slide 8

Layer 2: POS Bundling Description Single Virtual Link Router Router POS equivalent of GigE Link Aggregation A method to aggregate 2 or more physical POS links into a single logical link as observed from Layers 2 and 3 Network sees a single IP Address/Interface Flows comprised of IP src/dest or MPLS LSPs routed to a single bundle member Source/Dest IP Address and MPLS label based hashing algorithm for traffic flow (same as ECMP) Slide 9

Layer 2: POS Bundling Protection Single Virtual Link Router Router When one or more fiber fails, traffic shifted to remaining members Failure transparent to IP routing layer bandwidth of link just decreased Switchover can be performed in <50ms Network does not need to re-converge at Layer-3 Some products even support mixed member link speeds Better link utilization than APS but applicable only to parallel links Slide 10

Layer 2: Link Aggregation Description Router Router Switch Switch Aggregate-links are a number of individual Ethernet links that collectively form a single Layer 2 link using the IEEE 802.3ad standard Upper layer protocols (Spanning Tree, IS-IS, OSPF, BGP, etc.) and applications see the link aggregation as a single interface Conversation flows, which could be defined by MAC src/dest or IP src/dest, are kept on the same Link Agg member link Slide 11

Layer 2: Link Aggregation Protection Router Router Switch Switch There is an automatic configuration ability, through the use of Link Aggregation Control Protocol (LACP) LACP also provides a keepalive mechanism If a failure occurs on a link, traffic shifted over to remaining member links Switchover may happen quickly (<1sec) - upper layers don t see the failure Slide 12

Layer 2.5: Head-end Rerouted LSPs Primary LSP Time 1 R2 R3 Time 2 R1 R4 R5 R6 Time 3 Planning Occurs After Failure Tunnel Ingress Detects Failure Perform CSPF to Reroute LSP Rerouted Primary LSP Recovery is Order(seconds) Packet Loss INCREASES as failure moves away from Ingress CSPF and flooding is very sensitive to size of TE topology Slide 13

Layer 2.5: Pre-Signaled Standby LSPs Primary LSP Time 1 R2 R3 Time 3 R1 R4 R5 R6 Time 2 Planning Occurs Before Failure Tunnel Ingress Detects Failure Move Traffic to use standby LSP Standby LSP Recovery can be in 100s of milliseconds Packet Loss INCREASES as failure moves away from Ingress Slide 14

Layer 2.5: Fast Reroute Primary LSP (R1-R2-R3-R4) R2 R3 R1 Backup LSP (R3-R6-R4) R4 Backup LSP (R1-R5-R6-R3) R5 R6 Planning Occurs Before Failure PLR Detects Failure Move Traffic to use Backup LSP Backup LSP (R2-R5-R6-R4) Recovery in 10s of milliseconds Increased state in network and potential for unused bandwidth Quickly becomes complex to manage and troubleshoot Slide 15

Layer 3: Equal Cost Multi-Path (ECMP) R2 R3 R1 R4 R5 R6 ECMP can be used for either node or link protection Problems with ECMP: Can have long failover times due to IGP flooding and SPF (same issue as IGP convergence, if failure occurred remotely) IGP costs have to be the same (potentially complex traffic engineering) Really it s the same issues as normal IGP convergence, except half (or more) of the traffic won t be affected by the failure Slide 16

Outline of Talk The Problem Common Convergence Solutions An Advanced Solution: RAPID Increasing RAPID s coverage: U-turn Neighbors Applications of RAPID Summary Slide 17

IGP Convergence Time R1 Time 1 Time 4 1 1 R2 R3 1 Time 2 Time 10 10 3 2 2 R5 2 R6 R4 Failure detection triggers R2 to re-converge and signal failure Link loss happens in ms, but keepalive timeout can be seconds R2 s convergence time doesn t matter only its failure detection and signalling time does Packets now loop until R1 receives signal, processes it, and reconverges (and perhaps R5 needs to as well) Slide 18

Why It Takes So Long Most people think Dijkstra is what takes the most time. It isn t. Installing the routes is. Slide 19

Reliable Alternate Paths for Internet Destinations (RAPID) -- Basic Concept R1 Time 1 Time 4 1 1 R2 R3 1 Time 2 Time 3 10 10 2 2 R5 2 R6 R4 R2 pre-computes alternate IGP path for R4 traffic in case link fails Failure detection triggers R2 to failover to alternate path Failover occurs in milliseconds for both IP and LDP R2 also signals failure and runs SPF, but that time does not impact traffic Some time later R1 will have run a new SPF and send traffic to R5 Slide 20

Alternate Next-Hops Pre-computed with previous Dijkstra SPF calculation Used during a local link failure while Router is computing a new SPF based on the revised topology and is installing it into the forwarding plane Must not cause forwarding loop during failure Feasible Alternate Next-Hop can be used for LDP as well as IGP/BGP to provide sub-second traffic redirection Once new SPF runs, it overrides the RAPID alternate path Slide 21

Finding Loop-Free Neighbors 12 1 1 R2 R3 1 R1 10 10 2 2 R5 2 R6 R4 4 R2 can find a loop free neighbor: R5 R5 is loop-free, because the distance from R5 to R4 is less than the distance from R2 to R4 plus the distance from R5 to R2. R2 can know all this because it has the full LSDB Only R2 needs to support RAPID to provide protection for its links This allows a slow migration to RAPID protection Slide 22

Loop-Free Coverage R1 Time 1 Time 4 1 R2 R3 1 Time 2 Time 3 No loop-free alternate path from R2 to R4 10 2 2 R5 2 R6 1 R4 Many networks don t have alternate links at all points Simple loop-free RAPID provides an average 75% failure coverage But 75% of the links does not equal 75% of the traffic could be a lot less if the 25% unprotected are important links If R2 could use R1 as an alternate, the coverage would increase dramatically Slide 23

Outline of Talk The Problem Common Convergence Solutions An Advanced Solution: RAPID Increasing RAPID s coverage: U-turn Neighbors Applications of RAPID Summary Slide 24

Breaking the loop: U-Turn Alternates R1 is a U-turn neighbor of R2 Time 1 1 R2 R3 1 1 Time 2 R1 Time 3 10 R4 2 2 R5 2 R6 Time 4 R1 is a U-turn neighbor of R2 because: R1 itself has a loop-free alternate path to reach R4 R1 can break the loop So R2 could use R1 as an alternate, if R1 were capable of breaking the loop when a failure happens Slide 25

U-Turn Alternates R1 breaks the loop and sends traffic to alternate port R1 Time 1 1 R2 R3 1 Time 2 Time 3 2 2 R5 2 R6 1 10 R4 Time 4 R1 can break the loop, if its hardware can forward traffic received on the wrong port to the alternate path port The receiving port is wrong, if it s the port that the traffic should be transmitting back out on This means R1 has to be doing RAPID, and have the hardware to be a U-turn alternate Thus new IETF drafts to signal capabilities: OSPF, ISIS, LDP Slide 26

RAPID Details New IETF drafts define signaling of Router s RAPID capability, and per link capability for IGPs and LDP Drafts also define common rules for selecting loop-free and U-turn alternates Using U-turn alternates increases protection coverage from 75% average, to 95% average User-configurable for simple (non-u-turn) RAPID, or for full RAPID Asymmetric costs taken into account Currently multicast not covered by RAPID uses old convergence method (under investigation) Slide 27

Outline of Talk The Problem Common Convergence Solutions An Advanced Solution: RAPID Increasing RAPID s coverage: U-turn Neighbors Applications of RAPID Summary Slide 28

Network 1 RAPID Protected IP/LDP Backbone A highly redundant IP/LDP Backbone no MPLS-TE RAPID provides protection both in the Core, Aggregation, and Edge Coverage is very good, if the link redundancy is sufficient Slide 29

Network 2 RAPID Protected RAPID Protected MPLS FRR Backbone Less redundant design using MPLS-TE in core FRR provides loop-free method to backup logical-ring core RAPID protects Aggregation/Hub/Edge routers Slide 30

Network 3 RAPID Protected IP Core Router Pure MPLS Core BGP/IGP Adj. RAPID Protected IP Core Router MPLS PWE Backbone Separate Transit MPLS PWE/L2-VPN Core design IP Core routers do not see this MPLS core they think they have direct connections to the other IP Core routers MPLS Backbone can be protected by FRR or Pre-Signaled standby tunnels (more common) RAPID protects IP Core Routers (not Transit Backbone Routers) and Edge Slide 31

Outline of Talk The Problem Common Convergence Solutions An Advanced Solution: RAPID Increasing RAPID s coverage: U-turn Neighbors Applications of RAPID Summary Slide 32

Well-Known Convergence Solutions Layer-1/2: APS: standard SONET Line-layer protection mechanism fast failover, but wastes protection link Composite Links/Link Aggregation: multiple parallel links to neighbor fast failover, but no node protection/interoperability Layer-2.5: Head-end Reroute: re-signal LSP path from ingress to egress slow failover (<10 seconds) Head-end Standby Tunnels: pre-signaled from ingress to egress slow failover (100s of milliseconds) MPLS Fast Reroute: MPLS-TE based local protection (1:1 or 1:N) fast failover, somewhat complicated and doesn t scale well Layer-3: ECMP for IP/LDP: multi-path load-sharing based on IGP cost slow failover and requires careful planning for equal cost paths Slide 33

RAPID Fast Convergence Summary Provide < 50ms traffic convergence in the event of a link failure for IP and LDP traffic. Loop-free alternates can be used independent of LDP Fast Convergence on the alternate next-hop. U-Turn Alternates expand the potential failure coverage on networks. Simple to configure and manage Can be incrementally deployed the benefit of U-Turn Alternates will be seen as more routers are deployed with this feature in the network. Slide 34

RAPID Benefits vs. Alternatives Current Protection Mechanisms (MPLS FRR/ APS/ etc.) IP/LDP Fast Convergence Costs Expensive unused protection capacity More effective utilization of network assets Complexity Flexibility Requires high skill set Time intensive and manual Requires RSVP-TE overlay Provisioning requires constant updates as network changes Simple configuration requires no specialized training or resources Automated and adaptive Seamless adaptability to network events (topology/ failures/ etc.) Scalability Backup tunnels use more resources Tunnels rarely deployed edge-toedge Seamless adaptability to network events (topology/ failures/ etc.) Protects end-to-end Slide 35