Understanding Internet Path Failures: Location, Characterization, Correlation


Nick Feamster, David Andersen, Hari Balakrishnan
M.I.T. Laboratory for Computer Science
{feamster,dga,hari}@lcs.mit.edu

Big Picture
[Figure: nodes A, B, C, D, annotated "What? Where? How Long? Warnings?"]
- Explain path failures on the Internet.
- What causes failures, and where are they happening, anyway?
  - Engineering preventative measures
- What types of links experience failures, and why?
  - Understanding of why RON can route around failures
  - More informed choices about connectivity

Questions
- Locate: Given a path failure, where did it happen?
  - With respect to end hosts?
  - What types of links? (intra-AS, etc.)
  - How can we observe this from the edge? (Traceroute is not good enough!)
- Characterize: Do failures exhibit specific qualities?
  - Does failure occurrence depend on location?
  - Does failure duration depend on location?
- Correlate: Do failures correlate with routing instability?
  - If so, under what circumstances?

The Trouble with Traceroute
- Could reflect failure on the reverse path.
  Solution: Trigger based on observed path failures.
- Tells us about interfaces, not routers. Maybe even the outgoing interface of the return packet!
  Solution: Disambiguation techniques.
- Little information about AS boundaries.
  Solution: Use knowledge about the topology...
- Some failures may reflect convergence issues.
  Solution: BGP hints?

Disambiguation (Alias Resolution)
[Figure: routers a, b, c with interface addresses a.b.c.d, e.f.g.h, i.j.k.l, m.n.o.p]
- Watch returned IP IDs (Rocketfuel's IP ID trick).
- Run the test multiple times per pair to gain confidence.
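The IP ID comparison can be illustrated with a minimal sketch. It assumes that probes were sent alternately to two interface addresses and that a router with a single shared IP ID counter returns IDs that increase in small steps (modulo 2^16). The function names and the `max_gap` threshold are illustrative assumptions, not values from the slides.

```python
def ipid_in_sequence(samples, max_gap=200):
    """Check whether successive IP IDs look like one shared counter.

    samples: list of (interface, ip_id) in the order replies arrived.
    max_gap: hypothetical tolerance for counter advance between replies.
    """
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        gap = (cur - prev) % 65536  # IP ID is a 16-bit field, so it wraps
        if gap == 0 or gap > max_gap:
            return False
    return True


def likely_aliases(samples, max_gap=200):
    """True if replies from exactly two interfaces share one ID sequence."""
    interfaces = {iface for iface, _ in samples}
    return len(interfaces) == 2 and ipid_in_sequence(samples, max_gap)


# Interleaved replies from two addresses with closely spaced IDs.
print(likely_aliases([("a.b.c.d", 100), ("e.f.g.h", 102),
                      ("a.b.c.d", 105), ("e.f.g.h", 109)]))
```

Repeating the test for each pair reduces the chance that two independent counters happen to line up.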

AS Edge Resolution with Limited Traceroutes
[Figure: routers between AS A and AS B whose AS membership is unknown (??), with interfaces a.b.c.d/3 and e.f.g.h/3]
Edge Resolution Algorithm:
- Voting: IP addresses/router (% have 3 addresses)
- Some routers are clearly inside an AS. (~27%)
- Voting: edges towards each AS (inductive). (~7%)
- Last resort: traceroute to the router in question.
From each failure: router, AS, location in AS
Future work: failure trajectory
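The first voting step might look like the following sketch. It assumes each known interface of a router has already been mapped to an AS label by some external address-to-AS lookup (not shown and not described in the slides); the function name and the tie-handling rule are assumptions made for illustration.

```python
from collections import Counter


def vote_router_as(interface_as):
    """Assign a router to the AS that owns most of its interfaces.

    interface_as: list of AS labels, one per known interface.
    Returns the majority AS, or None when there is no clear winner
    (the router would then fall through to the later steps).
    """
    top = Counter(interface_as).most_common(2)
    if not top:
        return None
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None  # tie: unresolved by this step
    return top[0][0]


print(vote_router_as(["AS A", "AS A", "AS B"]))
```

Routers left unresolved here would be handled by the inductive edge vote or, as a last resort, a direct traceroute.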

Computing Distances
- Links directly connected to hosts have distance zero.
- New links introduce two interfaces. At least one of these must connect to a host for which we've assigned a distance.
- Assign minimum distance to end host.
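One way to realize the distance rule is a breadth-first search started from all end hosts at once. This is a sketch under the assumption that the measured topology is available as a list of undirected links between named nodes; the data layout and function name are hypothetical.

```python
from collections import deque


def link_distances(links, hosts):
    """Return {link: distance to the nearest end host}.

    links: list of (u, v) node pairs (undirected).
    hosts: set of end-host node names.
    A link touching a host gets distance 0; otherwise it gets the
    smaller of its two endpoints' hop counts from any host.
    """
    adjacency = {}
    for u, v in links:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Multi-source BFS: every host starts at distance 0.
    node_dist = {h: 0 for h in hosts if h in adjacency}
    queue = deque(node_dist)
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in node_dist:
                node_dist[neighbor] = node_dist[node] + 1
                queue.append(neighbor)

    result = {}
    for u, v in links:
        ends = [node_dist[n] for n in (u, v) if n in node_dist]
        if ends:
            result[(u, v)] = min(ends)
    return result


path = [("h1", "r1"), ("r1", "r2"), ("r2", "r3"), ("r3", "h2")]
print(link_distances(path, {"h1", "h2"}))
```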

Characterization Methodology
- 2 pairwise nodes, topologically distributed
  - 6 with BGP feeds
- Periodic pairwise probing. Trigger traceroutes upon failure.
- Failure: 3 consecutive lost probes, >2 minutes
- Results may be affected by faults (as described before).
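The failure trigger can be sketched as a scan for runs of consecutive lost probes. The probe record format and the `min_losses` parameter are assumptions for illustration; the threshold is left configurable rather than fixed to a value from the slides.

```python
def detect_failures(probes, min_losses=3):
    """Find runs of consecutive lost probes.

    probes: list of (timestamp, received) ordered by time.
    Returns a list of (first_loss_time, last_loss_time) for each run
    of at least min_losses consecutive losses.
    """
    failures = []
    run = []
    for timestamp, received in probes:
        if received:
            if len(run) >= min_losses:
                failures.append((run[0], run[-1]))
            run = []
        else:
            run.append(timestamp)
    if len(run) >= min_losses:  # a run still open at the end of the trace
        failures.append((run[0], run[-1]))
    return failures


trace = [(0, True), (10, False), (20, False), (30, False),
         (40, True), (50, False), (60, True)]
print(detect_failures(trace))
```

In a live prober, the point where a run first reaches `min_losses` would be where the traceroute is triggered.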

Failure Characterization
- Where are they occurring?
- How long do they last?
- How does outage duration depend on
  - link type? (edge vs. non-edge)
  - distance from last hop?

Failures Occur Near the Last Hop
[Figure: failure frequency vs. distance]
- 2/3 of observed failures occur intra-AS. Why?

A Few Bad Apples
[Figure: number of occurrences vs. link number; labeled hosts: Aros, Korea, MA-Cable, Greece]

Failure Duration
[Figure: fraction vs. time (sec); curves: Intra-AS, Edge]
- Observed failures on AS boundaries last slightly longer.
- Failure durations do not reflect distance from last hop.

Correlating Failures and Routing Instability
- Do path failures correlate with routing instability? (under what circumstances)
  - Location of failure (i.e., distance from end, link type)
  - Advertisement type (e.g., degree of aggregation, etc.)
  - Path diversity
- If so, how do they correlate in time?

Degree of Correlation Depends on Host
[Figure: probability vs. time (secs); curves: England, CA-DSL, CMU, Cornell, MA-Cable, Korea, Greece]
- Failures inside an autonomous system are less likely to be reflected by routing instability than failures on AS boundaries.

Time-based Correlation
[Figure: R_xx(t) vs. delay (min)]
- Failures occur several minutes before BGP activity.
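A lead or lag of this kind can be estimated with a lagged correlation between two time series. The sketch below assumes failures and BGP messages have been binned into equal-length, equally spaced count series; it is one plausible way to compute such a curve, not necessarily the method behind the plot. A positive best lag means BGP activity follows failures.

```python
def lagged_correlation(failures, bgp, lag):
    """Pearson correlation between failures[t] and bgp[t + lag]."""
    if lag >= 0:
        xs, ys = failures[:len(failures) - lag], bgp[lag:]
    else:
        xs, ys = failures[-lag:], bgp[:len(bgp) + lag]
    n = min(len(xs), len(ys))
    if n < 2:
        return 0.0
    xs, ys = xs[:n], ys[:n]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / (var_x * var_y) ** 0.5


def best_lag(failures, bgp, max_lag):
    """Lag (in bins) at which the two series correlate most strongly."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda k: lagged_correlation(failures, bgp, k))


# Synthetic example: BGP activity trails each failure by two bins.
failures = [0, 1, 0, 0, 0, 1, 0, 0, 0, 0]
bgp = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
print(best_lag(failures, bgp, max_lag=3))
```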

Alternate-Route Search
[Figure: timeline of an upstream failure followed by alternate routes (A, A, A, W) over ~3 minutes]
- Failures are commonly accompanied by a march through alternate routes.
- In what cases do we see this, and to what degree?

Correlation: Failures on AS Boundaries
[Figure: fraction vs. number of BGP messages; curves: Intra-AS, Edge]
- Failures inside an autonomous system are less likely to be reflected by routing instability than failures on AS boundaries.

Correlation: Distance from Last Hop
[Figure: fraction vs. number of BGP messages, by distance from last hop]
- Failures closer to end hosts are less likely to be reflected by routing instability.

Thoughts on Correlations
- Many possible explanations
  - Path diversity
  - Level of aggregation
  - Location of failure
- Which of these explains well-correlated failures, in each case?
- Sometimes, continued instability is a sign of trouble to come (or continue). Predictors?

Conclusions
- Locating failures
- Characterizing path failures
- Correlating with routing instability
Current work
- Also need to analyze per host to minimize bias.
- Need to do analysis for peering/transit links.
- How do correlation trends look across different BGP feeds?