Vivaldi: A Decentralized Coordinate System

Similar documents
Vivaldi: : A Decentralized Network Coordinate System

Vivaldi: A Decentralized Network Coordinate System. Authors: Frank Dabek, Russ Cox, Frans Kaashoek, Robert Morris MIT. Published at SIGCOMM 04

Vivaldi Practical, Distributed Internet Coordinates

Vivaldi: A Decentralized Network Coordinate System

Vivaldi: A Decentralized Network Coordinate System

A Survey on Research on the Application-Layer Traffic Optimization (ALTO) Problem

Charm: A charming network coordinate system

Pharos: A Decentralized and Hierarchical Network Coordinate System for Internet Distance Prediction

Vivaldi Practical, Distributed Internet Coordinates

A view from inside a distributed Internet coordinate system

Tuning Vivaldi: Achieving Increased Accuracy and Stability

Motivation and goal Design concepts and service model Architecture and implementation Performance, and so on...

Re-imagining Vivaldi with HTML5/WebGL. PhD Candidate: Djuro Mirkovic Supervisors: Prof. Grenville Armitage and Dr. Philip Branch

A Scalable Content- Addressable Network

Attacks on Phoenix: A Virtual Coordinate System

Experimental Study on Neighbor Selection Policy for Phoenix Network Coordinate System

On the Utility of Inference Mechanisms

End-user mapping: Next-Generation Request Routing for Content Delivery

Building Large-scale Distributed Systems with Network Coordinates

Scalable Route Selection for IPv6 Multihomed Sites

PlanetLab Deployment, Comparison and Analysis of Network Coordinate Systems

Improving Prediction Accuracy of Matrix Factorization Based Network Coordinate Systems

Coordinate-based Routing:

Path Optimization in Stream-Based Overlay Networks

Lecture 13: Traffic Engineering

On Suitability of Euclidean Embedding of Internet Hosts

Network Coordinates in the Wild

Internet Coordinate Systems Tutorial

NetForecast: A Delay Prediction Scheme for Provider Controlled Networks

Intelligent Neighbor Selection in P2P with CapProbe and Vivaldi

Towards a Two-Tier Internet Coordinate System to Mitigate the Impact of Triangle Inequality Violations

Bridge-Node Selection and Loss Recovery in Island Multicast

Chord : A Scalable Peer-to-Peer Lookup Protocol for Internet Applications

Towards a Two-Tier Internet Coordinate System to Mitigate the Impact of Triangle Inequality Violations

Distributed Hash Table

A Hierarchical Approach to Internet Distance Prediction

Modeling Distances in Large-Scale Networks by Matrix Factorization

Building an Internet-Scale Publish/Subscribe System

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 24, NO. 12, DECEMBER

Hybrid Overlay Structure Based on Random Walks

SVM Learning of IP Address Structure for Latency Prediction

Scalability In Peer-to-Peer Systems. Presented by Stavros Nikolaou

Fundamental Effects of Clustering on the Euclidean Embedding of Internet Hosts

PoP Level Mapping And Peering Deals

Brocade: Landmark Routing on Peer to Peer Networks. Ling Huang, Ben Y. Zhao, Yitao Duan, Anthony Joseph, John Kubiatowicz

Topics in P2P Networked Systems

Internet Anycast: Performance, Problems and Potential

Introduction. Routing & Addressing: Multihoming 10/25/04. The two SIGCOMM papers. A Comparison of Overlay Routing and Multihoming Route Control

Tarantula: Towards an Accurate Network Coordinate System by Handling Major Portion of TIVs

Landmark-based routing

PinPoint: A Ground-Truth Based Approach for IP Geolocation

Phoenix: Towards an Accurate, Practical and Decentralized Network Coordinate System

Veracity: Practical Secure Network Coordinates via Vote-Based Agreements

The Frog-Boiling Attack: Limitations of Anomaly Detection for Secure Network Coordinate Systems

Towards Optimal Data Replication Across Data Centers

Supporting Network Coordinates on PlanetLab

Efficient Formation of Edge Cache Groups for Dynamic Content Delivery

Object Location Query

A Reputation-Based Approach for Securing Vivaldi Embedding System

6.033 Computer System Engineering

The Case for Resilient Overlay Networks. MIT Laboratory for Computer Science

Authors: Rupa Krishnan, Harsha V. Madhyastha, Sridhar Srinivasan, Sushant Jain, Arvind Krishnamurthy, Thomas Anderson, Jie Gao

Semester Thesis on Chord/CFS: Towards Compatibility with Firewalls and a Keyword Search

Differentiating Link State Advertizements to Optimize Control Overhead in Overlay Networks

ENSC 835: HIGH-PERFORMANCE NETWORKS CMPT 885: SPECIAL TOPICS: HIGH-PERFORMANCE NETWORKS. Scalability and Robustness of the Gnutella Protocol

Network Traffic Measurements and Analysis

Drafting Behind Akamai (Travelocity-Based Detouring)

5.1 introduction 5.5 The SDN control 5.2 routing protocols plane. Control Message 5.3 intra-as routing in Protocol the Internet

CS 457 Networking and the Internet. What is Routing. Forwarding versus Routing 9/27/16. Fall 2016 Indrajit Ray. A famous quotation from RFC 791

CMSC711 Project: Alias Resolution

Architecture and Evaluation of an Unplanned b Mesh Network

A Distributed Web Cache using Load-Aware Network Coordinates

Pharos: accurate and decentralised network coordinate system Y. Chen 1 Y. Xiong 2 X. Shi 1 J. Zhu 3 B. Deng 1 X. Li 1

YouLighter: An Unsupervised Methodology to Unveil YouTube CDN Changes

Building a low-latency, proximity-aware DHT-based P2P network

Network Positioning from the Edge

From Routing to Traffic Engineering

Proceedings of the First Symposium on Networked Systems Design and Implementation

Constructing Internet Coordinate System Based on Delay Measurement

Technical Report. An implementation of a coordinate based location system. David R. Spence. Number 576. November Computer Laboratory

MULTI-DOMAIN VoIP PEERING USING OVERLAY NETWORK

Chapter 6 PEER-TO-PEER COMPUTING

Network Measurement. Xu Zhang Vision & CITE Lab, Nanjing University 2018/10/17 1

Turbo King: Framework for Large- Scale Internet Delay Measurements

Efficient solutions for the monitoring of the Internet

Virtual Multi-homing: On the Feasibility of Combining Overlay Routing with BGP Routing

Attaining the Promise and Avoiding the Pitfalls of TCP in the Datacenter. Glenn Judd Morgan Stanley

Chord: A Scalable Peer-to-peer Lookup Service For Internet Applications

A Cost-Space Approach to Distributed Query Optimization in Stream Based Overlays

The Frog-Boiling Attack: Limitations of Secure Network Coordinate Systems

Peer-to-peer computing research a fad?

HyCube: A distributed hash table based on a hierarchical hypercube geometry

Handling Churn in a DHT

NetFlow-based bandwidth estimation in IP networks

Application Layer Multicast For Efficient Peer-to-Peer Applications

Data Center Services and Optimization. Sobir Bazarbayev Chris Cai CS538 October

ProRenaTa: Proactive and Reactive Tuning to Scale a Distributed Storage System

Akamai's V6 Rollout Plan and Experience from a CDN Point of View. Christian Kaufmann Director Network Architecture Akamai Technologies, Inc.

Dynamic Landmark Triangles: A Simple and Efficient Mechanism for Inter-Host Latency Estimation

Akamai's V6 Rollout Plan and Experience from a CDN Point of View. Christian Kaufmann Director Network Architecture Akamai Technologies, Inc.

Transcription:

Vivaldi: A Decentralized Coordinate System Frank Dabek, Russ Cox, Frans Kaashoek and Robert Morris ACM SIGCOMM Computer Communication Review. Vol.. No.. ACM, 00. Presenter: Andrew and Yibo Peer-to-Peer systems There are many nodes to communicate with, you want to choose to talk to the node that is closest (lowest RTT) One approach is to calculate RTT with each node, and talk to closest node For small clusters or large transfers, this works great! But what about large content distribution systems (i.e. KaZaA, BitTorrent)???? What about systems with small messages (i.e. DNS) Peer-to-Peer systems Coordinate System Requirements. Accuracy -- embed Internet with little error You want to put nodes on a coordinate system If your coordinate system approximates RTT well, use it instead of probes!. Scale to many hosts -- pp scale. Decentralized algorithm -- pp applications. Very little probe traffic -- reduce burden on system. Adapt to network conditions -- not a static representation

Outline Vivaldi Network Model. Introdude need for coordinate systems. Design of Vivaldi. Evaluation of Vivaldi Treat the RTT between two nodes as a spring If distance in coordinates is equal to RTT, no tension in spring If distance in coordinates is not equal to RTT, tension in spring X i i X i i L i,j L i,j X j j X j j Vivaldi Network Model Vivaldi Network Model Measure error of a particular node (x i ) as the energy in all springs for the node Σ j (L i,j - x i - x j ) Measure error of whole system as the energy in all springs E = Σ i Σ j (L i,j - x i - x j ) Goal is to choose coordiantes x that minimize E

Vivaldi Centralized Algorithm Vivaldi Centralized Algorithm Big idea: for each node i,. figure out the total force of the springs between i and all nodes j. Move i by that force While error(l,x) > tolerance For each node i: F = 0 For each node j: //error of the spring between i and j e = L ij - x i - x j //add error to force vector of this spring F = F + e x u(x i - x j ) //move node i by a small step in the direction of the force x i = x i + t x F Vivaldi centralized algorithm While error(l,x) > tolerance For each node i: F = 0 For each node j: //error of the spring between i and j e = L ij - x i - x j //add error to force vector of this spring F = F + e x u(x i - x j ) //move node i by a small step in the direction of the force x i = x i + t x F We re assuming we know all RTTs for all pairs of nodes These RTTs are what we re trying to approximate! Vivaldi centralized algorithm While error(l,x) > tolerance For each node i: F = 0 Two changes For each node to make: j: //error of the spring between i and j e = L ij - x i - x j //add error to force vector of this spring F = F + e x u(x i - x j ) //move node i by a small step in the direction of the force x i = x i + t x F We re assuming we know all RTTs for all pairs of nodes These RTTs are what we re trying to approximate!. We need to calculate the coordinates of system using only a few RTTs. We need to do this using a distributed algorithm

Vivaldi Distributed algorithm Vivaldi Distributed algorithm Each node stores its own coordinate When it communicates with another node it measures RTT x x x x x Each node stores its own coordiante When it communicates with another node it measures RTT Moves itself proportional to the force within the spring x i = x i +! (rtt - x i - x j ) u(x i - x j ) x x x x x Vivaldi Distributed algorithm Vivaldi Distributed algorithm Each node stores its own coordiante When it communicates with another node it measures RTT x x x x! =.000! = Moves itself proportional to the force within the spring x i = x i +! (rtt - x i - x j ) u(x i - x j ) x

Vivaldi Distributed algorithm! =.000! = Adapt!. Converge quickly with a large!; as we become more certain of our location, make! smaller Vivaldi distributed algorithm //Given a sample rtt with node j, which has coordinate x j, error e j vivaldi(rtt, x j, e j ) //sample weight balances both local and remote errors w = e i / (e i + e j ) //calculate wieghted moving average of error of our samples e i = weighted_moving_average(e i, w, x i, x j, rtt) //Update local coordinates x_i = x_i + w (rtt - x i - x j ) u(x i - x j ) Evaluation methodology Latency data: two datasets ) Latency matrix for 9 hosts on PlanetLab network a) All pairs ping trace How to define latency? Latency?= minimum RTT Not for King, since King can report a RTT less than true value Use median to filter out transient congestion and packet loss ) Lacency matrix for 70 DNS nameservers a) Use King to collect latency b) Handling multiple authorative nameservers? i) Only use domains where authorative nameservers are on the same subnet large delay due to high load at nameserver A >> delay btw A and B

Using the data Using RTT matrices as inputs to a packet-level network simulator Each nodes run the decentralized Vivaldi algorithm Limitation of the simulator: RTTs do not vary over time, no queueing delays Why not simulating queueing delay? Because this needs modeling underlying network infrastructure (model a model!) Just stick to real data Evaluation. Effectiveness of the adaptive time-step!. How well Vivaldi handle high-error nodes. Vivaldi s sensitivity to communication patterns. Vivaldi s repsonsiveness to network changes landmark (x, y) landmark (x, y) landmark (x, y). Vivaldi s accuracy compared to that of global network positioning (GNP) (x, y) rtt rtt rtt ordinary host Effectiveness of the adaptive time-step! Network error: median of all nodes errors How well Vivaldi handle high-error nodes Evolution of a stalbe 00-node network after 00 new nodes join fixed! x i = x i +! (rtt - x i - x j ) u(x i - x j ) adaptive! = c*local error/(local error + remote error) local error = abs(predicted rtt - actual rtt)/actual rtt

How well Vivaldi handle high-error nodes Median link errors: median of all link errors Vivaldi s sensitivity to communication patterns Pattern : communicate with four neighbors Pattern : communicate with both neigbhors & long-distance hosts (get a global sense of their place in the network) How much long-distance comm. is necessary? A grid of 00 nodes. Each node is assigned neighbors and faraway random nodes. Adapting to network changes Use ITM tool to generate a transit-stub topology of 00 hosts At each step, each nodes chooses a faraway node with probability p among these 8 nodes. ms back to the previous topology transit-stub links become much longer

Accuracy Accuracy vs. the number of neighbors Compared with GNP best (Lowest median error) PlanetLab King Suitability for embedding? Euclidean space Triangluar inequaltiy violation In Euclidean space, triangular inequality holds. In network context, not necessary. A 0ms poorly provisioned link ms B 0ms C PlanetLab King lowest indirect path / direct path = (+0)/0 Conclusion: suitable

Spherical coordinates To model the shape of Earth Euclidean space with heights Euclidean space assumption: latency propotional to gegraphic distance Access link could be slow in the case of cable modems and telephone modems A height dimension for the access link Accuracy Graphical comparison -D Vivaldi w/o heights -D Vivaldi w/o heights Dataset: King -D Vivaldi w/ heights projected to -D Heights

Discussion Strengths: Very elegently designed solution Evaluation shows the strenght of the solution Weaknesses: Is the need still there? How many pp systems still out there? Heterogenious distributed systems?