MULTINATIONAL BANKING CORPORATION INVESTS IN ROUTE ANALYTICS TO AVOID OUTAGES

Similar documents
ENSURING SCADA NETWORK CONTINUITY WITH ROUTING AND TRAFFIC ANALYTICS

JUNOS SPACE ROUTE INSIGHT

Route Explorer. Contents

MPLS VPN over mgre. Finding Feature Information. Last Updated: November 1, 2012

ibgp Multipath Load Sharing

Cisco Group Encrypted Transport VPN

Transforming the Cisco WAN with Network Intelligence

IMPLEMENTING CISCO MPLS (MPLS)

Voice of the Customer First American Title SD-WAN Transformation

Optimizing Infrastructure Management with Predictive Analytics: The Red Hat Insights Approach

THE MISSING LAYER: SDN ANALYTICS AND AUTOMATION FOR MULTI-SERVICE NETWORKS

Federal Agencies and the Transition to IPv6

CTO PoV: Enterprise Networks (Part 2) Security for IoT & Cloud

Riverbed. Rapidly troubleshoot critical application and network issues using real-time infrastructure visualization and monitoring.

STEELCENTRAL NETPLANNER

Impact Analysis in MPLS Networks

Cisco Wide Area Application Services: Secure, Scalable, and Simple Central Management

BGP-MVPN SAFI 129 IPv6

FFIEC Cyber Security Assessment Tool. Overview and Key Considerations

Real4Test. Real IT Certification Exam Study materials/braindumps

securing your network perimeter with SIEM

MPLS VPN Carrier Supporting Carrier IPv4 BGP Label Distribution

MPLS VPN Carrier Supporting Carrier IPv4 BGP Label Distribution

BGP Link Bandwidth. Finding Feature Information. Prerequisites for BGP Link Bandwidth

Techniques and Protocols for Improving Network Availability

Case study: UniCredit Tiriac Bank deploys Cisco Network Admission Control

Cisco Training - HD Telepresence MPLS: Implementing Cisco MPLS V3.0. Upcoming Dates. Course Description. Course Outline

Module 16 An Internet Exchange Point

IP/MPLS PERFORMANCE AND PATH ANALYTICS FOR TELEPROTECTION IN POWER UTILITIES

This document is not restricted to specific software and hardware versions.

Operation Manual MCE H3C S3610&S5510 Series Ethernet Switches. Table of Contents

Maximize Network Visibility with NetFlow Technology. Adam Powers Chief Technology Officer Lancope

Intelligent Routing Platform

BGP Support for Next-Hop Address Tracking

Lecture 4: Intradomain Routing. CS 598: Advanced Internetworking Matthew Caesar February 1, 2011

Cisco ASR 9000 Series High Availability: Continuous Network Operations

Shortcut Switching Enhancements for NHRP in DMVPN Networks

CompTIA Exam JK0-023 CompTIA Network+ certification Version: 5.0 [ Total Questions: 1112 ]

2610:f8:ffff:2010:04:13:0085:1

Multiprotocol Label Switching (MPLS) on Cisco Routers

MPLS VPN Inter-AS with ASBRs Exchanging VPN-IPv4 Addresses

ehealth SPECTRUM Integration

Chapter 7: Routing Dynamically. Routing & Switching

Fast convergence project

BGP Cost Community. Prerequisites for the BGP Cost Community Feature

CS 43: Computer Networks. 24: Internet Routing November 19, 2018

Cisco Stealthwatch Improves Threat Defense with Network Visibility and Security Analytics

10 BEST PRACTICES TO STREAMLINE NETWORK MONITORING. By: Vinod Mohan

PASS4TEST. IT Certification Guaranteed, The Easy Way! We offer free update service for one year

CCIE SP Operations Written Exam v1.0

Matching Information Network Reliability to Utility Grid Reliability

EXPLORER SDN PATH PROVISIONING

BGP Support for IP Prefix Export from a VRF Table into the Global Table

Visual TruView Unified Network and Application Performance Management Focused on the Experience of the End User

QUESTION: 1 You have been asked to establish a design that will allow your company to migrate from a WAN service to a Layer 3 VPN service. In your des

VoIP and Network Quality Manager

BGP for Internet Service Providers

Trisul Network Analytics - Traffic Analyzer

<Insert Picture Here> Managing Oracle Exadata Database Machine with Oracle Enterprise Manager 11g

Deploying and Administering Cisco s Digital Network Architecture (DNA) and Intelligent WAN (IWAN) (DNADDC)

Enterprise SD-WAN Financial Profile (Hybrid WAN, Segmentation, Quality of Service, Centralized Policies)

Cisco SP Wi-Fi Solution Support, Optimize, Assurance, and Operate Services

MPLS VPN Route Target Rewrite

Remote Access MPLS-VPNs

BGP Support for Next-Hop Address Tracking

IPv6 Module 16 An IPv6 Internet Exchange Point

Fundamentals and Deployment of Cisco SD-WAN Duration: 3 Days (24 hours) Prerequisites

Multiprotocol Label Switching (MPLS) on Cisco Routers

INTERNET LABORATORY PROJECT. EIGRP Routing Protocol. Abhay Tambe Aniruddha Deshmukh Sahil Jaya

Core of Multicast VPNs: Rationale for Using mldp in the MVPN Core

Network Planning & Engineering

Module 1b IS-IS. Prerequisites: The setup section of Module 1. The following will be the common topology used for the first series of labs.

BGP Policy Accounting

Network Configuration Example

Multi Topology Routing Truman Boyes

MANRS. Mutually Agreed Norms for Routing Security. Jan Žorž

Cisco Connected Factory Accelerator Bundles

Network Security Monitoring with Flow Data

Securizarea Calculatoarelor și a Rețelelor 32. Tehnologia MPLS VPN

VPN Troubleshooting. VPN Troubleshooting CHAPTER20. Tunnel Details

The Case for Separating Routing from Routers

IPv6 migration challenges and Security

locuz.com SOC Services

Implementing Cisco IP Routing (ROUTE)

Troubleshooting Converged Enterprise Networks

IP SLAs Overview. Finding Feature Information. Information About IP SLAs. IP SLAs Technology Overview

SaaS Providers. ThousandEyes for. Summary

Visualize Real-Time Topology, Traffic, and Status in a Single View Troubleshoot Network Issues More Rapidly

IPv6 Switching: Provider Edge Router over MPLS

Cisco EXAM Cisco ADVDESIGN. Buy Full Product.

Configuring Easy Virtual Network Shared Services

Data Plane Protection. The googles they do nothing.

ehealth SPECTRUM Integration

HP A5820X & A5800 Switch Series MPLS. Configuration Guide. Abstract

Automated, Real-Time Risk Analysis & Remediation

Cisco Dynamic Multipoint VPN: Simple and Secure Branch-to-Branch Communications

CS 43: Computer Networks Internet Routing. Kevin Webb Swarthmore College November 16, 2017

Cisco CCNP ROUTE: Implementing Cisco IP Routing (ROUTE) 2.0. Upcoming Dates. Course Description. Course Outline

CompTIA Security+ CompTIA SY0-401 Dumps Available Here at:

MPLS VPN--Inter-AS Option AB

Transcription:

MULTINATIONAL BANKING CORPORATION INVESTS IN ROUTE ANALYTICS TO AVOID OUTAGES CASE STUDY

Table of Contents Organization Background and Network Summary 3 Outage Precursor and Impact 3 Outage Analysis 4 Corrective Actions 5 Page 2 of 8

This case study discusses the cause of a network outage at a multinational banking and financial services corporation and how Packet Design s Route Explorer was implemented for its analytics, modeling and planning capabilities to avoid a recurrence. Organization Background and Network Summary A global banking organization delivers various financial services to its customers, including web and mobile e-commerce and online banking, global credit and debit card services, and asset and investment management. The company has nearly 5,000 banking centers, more than 15,000 automatic teller machines (ATM), five data centers and several customer assistance contact centers. Given the critical need for security, high availability, compliance and rapid response to requests and incidents, the bank, like many others, largely operates and manages its own IP/MPLS network, with some network operations being outsourced. The branches, ATMs and data centers form a large interconnected network referred to as the bank network. Major branches serve as concentration points to combine different types of traffic from smaller branches before being sent to the backbone network. Internal transactions related to banking operations stay within the bank network whereas customer-facing services, such as e-commerce, online banking, wealth management and customer relationship management (CRM), traverse the Internet and are run from DMZs located in the data centers. The network is comprised of HP and Cisco devices, with Cisco ASR and Cisco 7600 series routers used at the edge in the data centers and concentration points. Outage Precursor and Impact The network team had scheduled a weekend maintenance window to perform close to 5,000 changes across the bank network. Among these changes were updates to the BGP routing table at six locations -- four data centers and two of the large branch concentration points. As part of the change, new route numbers were to be added to routing tables in the Cisco ASR and Cisco 7600 series routers, along with updated route-maps. 1 During the maintenance window, the bank began to receive complaints from customers and the customer support teams that the bank s website and online banking facilities were inaccessible. Customers were getting a page cannot be displayed error on the website and the mobile banking applications displayed connection error. Customers and support associates who were already in website and online banking sessions were disconnected and unable to log back in again. At the same time, customers using the bank s debit cards were unable to complete their transactions and support 1 A route-map can be defined as a series of statements that ensures that routes (route numbers or prefixes) in the routing table match policies defined to permit or deny the route advertisement to the rest of the network. A route-map can also include optional attributes about the routes. Page 3 of 8

associates were unable to access various customer support applications, including the CRM system. The network team received alerts from web monitoring tools about dropped sessions and the bank s traffic analyzer reported a drop in the volume of traffic to the data centers. After about 40 minutes of downtime, the issue resolved automatically, and the website and banking services were back online and accessible to the customers and support associates. The bank s post mortem analysis reported more than 200,000 failed interactions. Outage Analysis Within a few minutes of the issue being first reported and alerts generated by their monitoring tools, the network team began analyzing the problem by checking error logs and change logs for the bank network s DNS servers, proxy servers and all network devices. Escalations were triggered and conference calls were set up to analyze and troubleshoot the issue. While this analysis was being performed, the monitoring tools reported that traffic to the data center had returned to baseline and banking applications and customer services were once again accessible. Troubleshooting is extremely hard and sometimes impossible when the issue is already resolved and there is no smoking gun. The bank s network team faced such a scenario and their existing network and traffic monitoring systems could not pinpoint the root cause. It was only after hours of manual analysis of the events that led up to the outage that the network team realized it was the BGP routing table changes on the edge routers that caused the problem. They determined it was related to the sequence in which the commands were issued to the routers. The sequence of adding new route numbers on the routers involved first specifying the new route numbers and then the route-map. The process had been tested in the lab using a Cisco 7600 series router, like the bank s edge routers. The same sequence was then followed during the maintenance window on the live network -- on the four Cisco 7600 series routers and then on the two Cisco ASR routers. However, the IOS for each of these router models handles BGP updates differently. The Cisco 7600 waits for a soft reset command to apply the new routes and advertise the new route numbers based on policies defined in the route-map. The Cisco ASR has a feature that saves route updates automatically as each command is applied. This difference resulted in the new route numbers from the ASR being advertised to the DMZ routers, causing traffic destined for the Internet to be redirected to the bank network and blocking user access to the banking applications hosted from the DMZ. As the change team completed the scheduled changes, unaware of the outage, the new routes were advertised as per policy and the issue ultimately resolved itself. The post mortem analysis later determined through testing that the new routes had been advertised and propagated through the entire bank network within 30 seconds of the commands being entered on the Cisco ASRs. Page 4 of 8

Corrective Actions The bank identified several action plans to mitigate the risk of a recurrence and enable faster detection and resolution should there be one. The requirements included the following: Monitor and alert on changes in routes and route advertisements. This would prevent inadvertent advertisement of routes or route leakage that could lead to similar outages. The monitoring and alerting must be done in real time considering that the new route numbers were advertised and propagated in the network within 30 seconds. Using SNMP or other periodic data collection could result in a similar condition being missed. A method to capture and analyze route path change history. This would help the network engineers performing root cause analysis to troubleshoot conditions that were no longer present. Based on these requirements, Packet Design s Route Explorer was selected for an extensive evaluation and ultimately acquired by the bank. Route Explorer is the base component of the Packet Design Explorer Suite which combines route, traffic and performance analytics. Route Explorer is used worldwide by network engineers, architects, Detecting BGP Prefixes not in Baseline Page 5 of 8

planners and operations teams responsible for today s complex, mission-critical enterprise and service provider networks. It provides management visibility into routing behavior for IGP and BGP protocols, multicast, MPLS VPNs and traffic engineering tunnels, with real-time monitoring, historical reporting and interactive modeling capabilities. Route Explorer is deployed as a passive routing device so it receives (and records) the BGP and IGP exchanges between the routers in a network. The data is used to build and maintain real-time and historical topology models of the IP/MPLS network and overlay services which are used for real-time monitoring and alerting of route, path and topology changes, providing back-in-time forensic analysis, and what-if modeling capabilities to mitigate the risk of unexpected results from maintenance changes. Route Explorer collects route number data and establishes a baseline. When new routes not in baseline are detected, Route Explorer issues an alert. This ensures the bank knows immediately if new routes are advertised to the DMZ or to the bank network, so the team can take steps to avoid another outage. To address the bank s second requirement, Route Explorer s recorded routing data and topology models can be viewed to analyze past events. A DVR-like playback feature allows the network team to rewind and replay the network s routing behavior to understand how routing path changes, even those that lasted only for a few seconds or minutes, impacted services. Routing Path History via Playback Feature Page 6 of 8

In addition to the documented requirements, the bank discovered that Route Explorer strengthens the change management process. By analyzing what-if scenarios, engineers can predict the impact of planned changes. For example, the engineers can simulate adding or removing routers, interfaces, routes, prefixes and peers to see how the network converges and observe unexpected results. After a six-month evaluation, the bank determined that Route Explorer met the requirements and deployed the product in the live network, providing an additional layer of assurance for its critical online services. Page 7 of 8

To learn more about Packet Design and the Explorer Suite, please visit www.packetdesign.com. Page 8 of 8