SHARED MESH RESTORATION IN OPTICAL NETWORKS

Similar documents
Resilient IP Backbones. Debanjan Saha Tellium, Inc.

SONET Topologies and Upgrades

Progress Report No. 15. Shared Segments Protection

ARE CONTROL PLANE BENEFITS APPLICABLE TO SUBMARINE NETWORKS?

Multi-Protocol Lambda Switching for Packet, Lambda, and Fiber Network

System Applications Description

Fault management. Acnowledgements

SONET Topologies and Upgrades

Network Survivability

ECE442 Communications Lecture 4. Optical Networks

Transport is now key for extended SAN applications. Main factors required in SAN interconnect transport solutions are:

A Comparison of Path Protections with Availability Concern in WDM Core Network

SYSC 5801 Protection and Restoration

CHAPTER I INTRODUCTION. In Communication Networks, Survivability is an important factor

Fault management. Fault management. Acnowledgements. The impact of network failures. Pag. 1. Inspired by

The Emerging Optical Control Plane

GMPLS Overview Generalized MPLS

Sycamore Networks Implementation of the ITU-T G.ASON Control Plane

Simulation of All Optical Networks

Generalized Multiprotocol Label Switching (GMPLS)

Multi-layer protection and restoration requirements

Optical Communications and Networking 朱祖勍. Nov. 27, 2017

Complexity in Multi-level Restoration and Protection Strategies in Data Networking

Capacity planning and.

The LSP Protection/Restoration Mechanism in GMPLS. Ziying Chen

Some economical principles

OPTICAL NETWORKS. Optical Metro Networks. A. Gençata İTÜ, Dept. Computer Engineering 2005

BW Protection. 2002, Cisco Systems, Inc. All rights reserved.

Techniques and Protocols for Improving Network Availability

Capacity planning and.

Tellabs Optical Networking Solutions Provide Cost-Effective, Scalable Bandwidth and Services

Multilayer Design. Grooming Layer + Optical Layer

Unavailability Analysis of Long-Haul Networks

Outline. EL736 Communications Networks II: Design and Algorithms. Class3: Network Design Modelling Yong Liu 09/19/2006

A Novel Class-based Protection Algorithm Providing Fast Service Recovery in IP/WDM Networks

Figure 1: History has shown that switching and grooming speeds increase as transport speeds increase

OnePlanner. Unified Design System

TRANS-OCEANIC OADM NETWORKS: FAULTS AND RECOVERY

Configuring Rapid PVST+

Benefits of Metropolitan Mesh Optical Networks

Convert Network Configurations

Bridging the Ring-Mesh Dichotomy with p-cycles:

Introduction to IEEE 802.1Qca Path Control and Reservation

IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 13, NO. 2, APRIL

Survivability Architectures for Service Independent Access Points to Multiwavelength Optical Wide Area Networks

Optical Fiber Communications. Optical Networks- unit 5

MPLS Traffic Engineering Traffic Protection using Fast Re-route (FRR)

Control and Management of Optical Networks

Zero touch photonics. networks with the cost efficiency of WDM. András Kalmár Péter Barta 15. April, Szeged

Strategy for SWITCH's next generation optical network

Broadband Rate Design for Public Benefit

MODERN RECOVERY MECHANISMS FOR DATA TRANSPORT NETWORKS

Efficient Distributed Restoration Path Selection for Shared Mesh Restoration

Operation Administration and Maintenance in MPLS based Ethernet Networks

Traffic Types and Growth in Backbone Networks

Spare Capacity Allocation Using Partially Disjoint Paths for Dual Link Failure Protection

CTC Information and Shortcuts

Building Infrastructure for Private Clouds Cloud InterOp 2014"

The Role of MPLS-TP in Evolved packet transport

NETSMART Network Management Solutions

Routing. 4. Mar INF-3190: Switching and Routing

from ocean to cloud FLEXIBLE WDM AND GMPLS IMPLEMENTATION ON WACS

Arc Perturbation Algorithms for Optical Network Design

Configuring Rapid PVST+ Using NX-OS

CTC Information and Shortcuts

SPARE CAPACITY MODELLING AND ITS APPLICATIONS IN SURVIVABLE IP-OVER-OPTICAL NETWORKS

Common Issues With Two Fiber Bidirectional Line Switched Rings

Inter-domain Routing with Shared Risk/Resource Groups (SRG)

CTC Information and Shortcuts

Networking in DWDM systems. Péter Barta András Kalmár 7-9. of April, Debrecen

Survivability with P-Cycle in WDM Networks

Card Protection. 4.1 Overview CHAPTER

Abstract. Introduction and Motivations. Automatic Switched Optical Networks

Ahmed Benallegue RMDCN workshop on the migration to IP/VPN 1/54

Data Center Applications and MRV Solutions

ADAPTIVE AND DYNAMIC LOAD BALANCING METHODOLOGIES FOR DISTRIBUTED ENVIRONMENT

SONET Bidirectional Line-Switched Ring Equipment Generic Criteria

Add and Remove Nodes

9. Control and Management in Optical Networks

Configuring Rapid PVST+

Progress Report No. 13. P-Cycles and Quality of Recovery

MPLS OAM Technology White Paper

Getting in Gear with the Service Catalog

Core Networks Evolution

Network Topologies & Error Performance Monitoring in SDH Technology

Converged Communication Networks

NetAnalyst Test Management Software Automated, Centralized Network Testing. NetComplete Service Assurance Solutions Portfolio

Routing in packet-switching networks

HIGH AVAILABILITY DESIGN

Configuring StackWise Virtual

OSPF IN OPTICAL NETWORKS

Overview of GMPLS Protocols and Standardization

David Tipper Graduate Telecommunications and Networking Program University of Pittsburgh. Motivation

where we are, where we could be, how we can get there. 12/14/2011

Architecture. SAN architecture is presented in these chapters: SAN design overview on page 16. SAN fabric topologies on page 24

Why Service Providers Should Consider IPoDWDM for 100G and Beyond

Reading. Read 3 MPLS links on the class website

IO2654 Optical Networking. WDM network design. Lena Wosinska KTH/ICT. The aim of the next two lectures. To introduce some new definitions

Comparison of Protection Cost at IP or WDM Layer

Application of SDN: Load Balancing & Traffic Engineering

Transcription:

SHARED MESH RESTORATION IN OPTICAL NETWORKS OFC 2004 Jean-Francois Labourdette, Ph.D. labourdette@ieee.org

Page: 2 Outline Introduction Network & Restoration Arch Evolution Mesh Routing & Provisioning Fast Shared Mesh Restoration Re-provisioning for Improved Reliability Planning & Dimensioning Re-optimization Maintenance Summary

Page: 3 Network Evolution Network architecture Mesh vs. Ring Multi-tier network Historical (DS0/DS1/DS3/STS-1/STS-48) Not a question of if but when Technology O/E/O switching All-optical switching in opaque network All-optical/transparent network e.g., (R)OADM Difficulty is often not just the technology but one of network management and operations

Restoration Architecture Evolution 80's DCS-based Mesh Restoration of DS3 Facilities Centralized (EMS/NMS) Path-based, failure-dependent, after fault detection and isolation Capacity-efficient but slow (~ minutes) 90's ADM-based Ring Protection of SONET/SDH Facilities Distributed Path-based (UPSR) or span-based (BLSR), pre-determined Fast ( 50 msec ) but capacity-inefficient 2000's OXC-based Mesh Protection/Restoration Distributed Path-based, failure-independent, pre-determined and preprovisioned Capacity-efficient AND fast (10 s 100's msec) Page: 4

Page: 5 Restoration Architecture Evolution (cont'd) Challenge: OXC-based mesh network architecture requires new network management, and new operations which can be more sophisticated than those in ring-based networks Question: can ring-based architectures be evolved instead? Trans-oceanic ring architecture (G.841) p-cycle (W. Grover et al.) Next-gen SONET/SDH equip. can handle multiple rings Ability to not close working or protection ring Ability to share protection channels across rings End-result is an evolution towards mesh networking

Page: 6 Mesh Operations Routing & Provisioning Shared mesh restoration based on pre-determined, preprovisioned restoration paths KEY requirement of mesh networking Complete synch between network and its inventory Cannot rely on manual entry of network inventory Provisioning Components Self-discovery of neighbour and port connectivity <-- the key feature Topology discovery Route computation Lightpath setup Debate on management vs. control plane approach

Page: 7 Automated Discovery of Port and Neighbouring Node Connectivity Hello (O1,P1) OXC O1 T P1 R OXC O2 T P10 R Hello (O2,P10) Hello Ack (O2,P10, O1,P2) Hello Ack (O1,P1, O2,P10) NDP at O1, P1 P2 T R misconfiguration T R P11 exchange reveals misconfiguration since sending P1 is followed by acknowledgment of P2 P3 T R T R P12 Hello (O1,P3) Hello (O2,P12) Hello Ack (O2,P12, O1,P3) NDP at O1, P3 Hello Ack (O1,P3, O2,P12) normal exchange

Comparing Control & Management Plane Approaches Page: 8 Management Plane Approach Control Plane Approach Neighbor discovery Manually configured Topology discovery Derived at NMS/EMS from neighbour & port adjacency information Route computation NMS/EMS computes the primary and backup paths from the topology and lightpath databases Lightpath setup NMS sets up lightpaths by configuring the NEs using TL-1 messages Neighbor discovery Done using neighbour discovery protocol (LMP) between NEs Topology discovery Done by NEs by exchanging neighbour & port adjacency info (OSPF/IS-IS) Route computation Done by NEs from topology database NEs may not have complete lightpath database Lightpath setup Done by NEs using a signaling protocol like RSVP-TE or CR-LDP

Page: 9 Dedicated Mesh (1+1) Protection U Primary for demand AB W A C S X V Bridging (both lightpaths are active) Secondary for demand AB Secondary for demand CD Cross-connections are established on secondary paths before failure Y Primary for demand CD Z T B D source transmits to both working and edge/nodediverse backup paths The destination decides which path to select, based on the quality of the received signals. Fastest restoration but wasteful because backup path is permanently active, although not used under normal conditions.

Shared Mesh Restoration Shared mesh restoration reserves the capacity for the backup path, and activate the backup only when necessary Enable the same capacity to serve multiple backup paths if the paths are not expected to be activated simultaneously. This condition is satisfied if their working paths are diverse A U Primary for demand AB V W B A Primary for demand AB is not affected U W V B S Optical line reserved for shared protection of demands AB and CD T S Optical line carries protection of demands CD T Cross-connections are not established Y Cross-connections are established after failure occurs Link failure Y C X Primary for demand CD Z D C X Z D BEFORE FAILURE AFTER FAILURE Page: 10

Page: 11 Diversity of Paths e a b c z e f a b c z f After computing the working path using a shortest path algorithm, and removing the edges to compute the backup. The residual graph becomes disconnected, and we find no corresponding backup path even though one exists. e a b c z f

Shared Risk Groups U second route A to B V SRG 4 W SRG 10 All optical lines in conduit share a same risk SRG 2 SRG 1 SRG 3 SRG 6 X SRG 5 SRG 8 SRG 9 SRG 7 A Y first route A to B Z B SRGs are used to represent sets of edges that may be affected by a common failure, such as fibers routed through a given conduit, etc. Non-trivial SRGs are the norm rather than the exception in telecom networks. There is no known optimum algorithm for arbitrary SRGs. Page: 12

Dedicated vs. Shared Mesh Case Study - Network Page: 13 100 Nodes 137 Links ~ 3000 OC-48 equivalent demands

Page: 14 Dedicated vs. Shared Mesh Case Study - Results US core network - 100 nodes, 137 links 50000 45000 Number of (bi-directional) Channels 40000 35000 30000 25000 20000 15000 10000 26829 19807 19913 26905 19981 9491 21199 11660 21227 Secondary Primary 5000 0 no protection 1+1 link failure protected 1+1 node failure protected Central mesh link failure restored Central mesh node failure restored

Page: 15 Restoration: Scope & Objectives Scope Guaranteed restoration after single failure events TR failure Amplifier failure Fiber/Cable cut Recovery via re-provisioning after multiple concurrent failures Objectives Fast response High degree of robustness Support for different service levels Dedicated (1+1) diverse protection for lowest restoration latency Shared diverse restoration for capacity efficiency Non pre-emptible non-restorable service Pre-emptible service

Page: 16 Mesh Networking and Ring-like Restoration Fast restoration Pre-computed and pre-provisioned backup paths Bit-oriented failure notification & restoration signalling (using SONET/SDH overhead bytes) Fast intra-system communication Robustness Bit-oriented failure notification & restoration signalling Per-channel independent signaling for each lightpath Connection after verification Optimized for the common case Fast and guaranteed restoration for single SRG failure Re-provisioning for multiple concurrent failures

Page: 17 Restoration Protocol Overview F G Shared Backup Path H 8 7 9 4 5 8 Primary Path B C D A 7 5 4 7 3 1 0 7 5 7 4 E 1 9 Drop port Drop port A shared backup path is soft-setup for each shared mesh restorable primary path Channels on the backup path may be shared with other backup paths Crossconnects are not setup during provisioning Path restoration triggers are sent to the end-nodes End-to-end signaling over the backup path activates it and completes the path restoration

Page: 18 Mesh Restoration Simulation Why simulation? Unlike ADM-based rings, restoration speed dependent on network loading and routing Predict restoration performance (e.g., for SLA compliance) Restored lightpaths (%) 100 90 80 70 60 50 40 30 20 10 0 0 20 40 60 80 100 Time since failure event (ms)

Page: 19 Restoration Simulation Methodology Use calibrated simulation tool to predict network performance, using identical traffic loading and routing as real network Equipment model System architecture racks, shelves, interface modules, control modules Internal communication architectures Processing queues Restoration protocol model Parameterize basic events, processes and queuing delays Measure restoration latency in a test network of real systems Tune parameters to calibrate simulation tool

Page: 20 Summary Mesh Restoration Shared mesh restoration is fast! - 10's to 100's of msec Very different from centralized mesh restoration in DCS networks (order of minutes) Distributed ring-like implementation Simulation critical for mesh network operations Similation tool with calibration, identical protocols and routing algorithms as in real network Integration with planning and operations systems Ring restoration will be slower with next-gen products Because same equipment will terminate 10's of rings Restoration processes will compete for resources Restoration times will be dependent on network loading

Page: 21 Network Reliability & Service Availability Speed of restoration is not that important for service availability A = MTBF/(MTBF + MTTR) Impact of MTTR ranging from msec to sec is insignificant Impact of MTTR of several hours if physical repair is needed (in case of double failure) is significant Ability to protect against double failure is key to high service availability

Page: 22 Service Unavailability Mostly due to double failures for protected services Service unavailability is roughly proportional to Length of path for unprotected service Product of length of working and protection path for protected service (higher unavailability of shared mesh over dedicated mesh due to impact of sharing) But decreases when network is split into independent restoration domains, so For limited geographical span, longer protection path on ring causes higher unavailability For larger geographical span, presence of many rings decreases chances of two failures happening in single ring, and ring-based architecture can achieve lower unavailability

Page: 23 Service Unavailability Ring vs. Mesh Unavailability with MTTR = 4 hrs 3 2.5 Unavailable Minutes/Yr 2 1.5 1 0.5 Mesh Ring 0 0 500 1000 1500 2000 2500 3000 3500 Lightpath Length (km)

Page: 24 Service Availability Re-provisioning But mesh networks can be made much more robust by splitting into multiple domains e.g., US/trans-Atlantic/European domains Using end-to-end re-provisioning in case of double failures EMS/NMS-based with fault detection and localization Take 10's of sec instead of hours with manual intervention Can be 100% succesful with enough spare capacity

Page: 25 Planning & Dimensioning Mesh networks are more robust than ring networks to traffic forecast uncertainties Many rings have low utilization because traffic did not materialize where and when predicted Will become less true with next-gen SONET/SDH equipment supporting many rings But ability to re-optimize mesh network Dimensioning mesh networks Can be modeled and analyzed mathematically (e.g., random graphs, Moore bound,...) Properties, approximations, asymptotic behavior (e.g., ratio of protected to working capacity)

TIME Page: 26 Why Re-optimization? Over-time, a network routing diverges from optimality On-line results in drift toward sub-optimal solutions, Service churn, capacity upgrade, and new additions to the network infrastructure, create opportunities for improvement Periodically re-optimize the routes to seize on these opportunities and espouse the network dynamics as it grows COST Cost of current solution Cost of best known solution (that is achievable using same input) Drift RE OPTIMIZE New capacity trunk and/or Service termination (churn)

Re-optimization: Objectives and challenges Page: 27 Objectives of re-optimization Increase routing efficiency and utilization, increase capacity availability to offer more services at no additional cost Decrease paths lengths (reduce latency), improve service quality Challenges Re-optimization is executed from the network operation center, using existing network infrastructure and given capacity Minimize undesirable impacts to services Minimize risks of major service disruptions in case of network failures, i.e. keep protection mechanism functional during re-optimization Ability to pause and resume the re-optimization, in order to cope with unexpected events

Page: 28 Types of Re-optimization Complete: re-optimize both primary and backup paths Most effective, but Requires service interruption to switch from old to new primary path Partial: re-optimize the backup path only No impact on primary, thus transparent to the client layer, Almost as effective as a complete re-optimization, Type can be decided on a per-lightpath basis Complete re-optimization for services with lower SLA Partial re-optimization for services with stringent requirements

How Re-optimization is Done? Network Operation Center Download network state Re optimization Tool Execute re route sequence 1. Select a lightpath 2. Remove it's backup path 3. Compute and provision new backup path 4. Iterate over 1 to 3 until no further improvement is observed Lightpath reroute sequence Re optimize routes Backup paths are re-routed one at a time, the corresponding lightpath is unprotected during re-routing The backup paths of some lightpaths may be reoptimized more than once to perform capacity swaps Intelligence, in selecting the proper lightpaths and proper sequence to achieve the most effective results Page: 29

Page: 30 Network Re-optimization A Real Network Example (45 cities across US) Normalized bandwidth 120 100 80 60 40 Actual solution from online routing. 31% Re-optimization 27% 20 0 Feb-02 Mar-02 Apr-02 May-02 Jul-02 Jun-02 Aug-02 Oct-02 Sep-02 Nov-02 Dec-02 Feb-03 Jan-03 Demand growth Actual used ports Port requirement of best known solution Mar-03

Page: 31 Network Re-optimization - Summary Experience clearly demonstrates benefits of re-optimization The nature of network operations leads to inefficient routing over time Up to 20% capacity saving: freed capacity can be reused for future services (capital avoidance) Backup latency is 30% shorter Procedure is safe Primary paths unaffected, no service interruption One demand at a time is briefly unprotected, while backup path is being re-provisioned Operates within actual network capacity, all operations are performed from network operation center Possible thanks to increased flexibility and efficiency offered by mesh optical networks, based on intelligent optical switches Not applicable to ring-based networks

Page: 32 Maintenance Maintenance operations need to be adapted to mesh networks In ring networks, no interference of maintenance activities between geographically distant rings In mesh networks, back-up capacity in one location can be shared by geographically distant lightpaths Operations can rely on accurate inventory of routes to schedule maintenance activities Ability to constrain routing of lightpaths and sharing Ability to reroute back-up path (as during re-optimization)

Page: 33 Summary Mesh - Overall long term strategic architecture evolution thanks to Capacity efficiency of shared path-based restoration Shared mesh restoration speed (10 s to 100 s msec) Improved reliability with re-provisioning Re-optimization Network control, management, and operations need to be adapted Consistent set of planning and modelling software tools, in addition to management system, critical for operating a mesh network efficiently All tools must be able to interact with each other to transfer data Sophisticated algorithms required for maximum benefits of a mesh network Set of tools along with mesh network intelligence provide higher efficiency, lower cost, higher reliability and service flexibility