Fixing the Embarrassing Slowness of OpenDHT on PlanetLab

Fixing the Embarrassing Slowness of OpenDHT on PlanetLab
Sean Rhea, Byung-Gon Chun, John Kubiatowicz, and Scott Shenker
UC Berkeley (and now MIT)
December 13, 2005

Distributed Hash Tables (DHTs)
- Same interface as a traditional hash table:
  - put(key, value) stores value under key
  - get(key) returns all the values stored under key
- Built over a distributed overlay network:
  - Partition the key space over the available nodes
  - Route each put/get request to the appropriate node
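The interface above can be sketched as a toy, single-process model; the partitioning scheme and all names here are illustrative, not OpenDHT's actual implementation:

```python
import hashlib

def node_for(key, node_ids):
    """Toy partitioner: send a key to the node whose identifier is
    numerically closest to the key's hash (a simplification of the
    circular key space a real DHT uses)."""
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return min(node_ids, key=lambda n: abs(n - h))

class ToyDHT:
    """Single-process model of the put/get interface; each 'node'
    is just a dict, and requests are 'routed' by node_for()."""
    def __init__(self, node_ids):
        self.stores = {n: {} for n in node_ids}

    def put(self, key, value):
        node = node_for(key, list(self.stores))
        self.stores[node].setdefault(key, []).append(value)

    def get(self, key):
        # get(key) returns all values stored under key
        node = node_for(key, list(self.stores))
        return self.stores[node].get(key, [])
```

Because put and get hash the key the same way, they always reach the same node's store, which is the property the routing layer must preserve in the distributed setting.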

DHTs: The Hype
- High availability: each key-value pair replicated on multiple nodes
- Incremental scalability: need more storage or throughput? Just add more nodes.
- Low latency: recursive routing, proximity neighbor selection, server selection, etc.

DHTs: The Hype
- Promises of DHTs realized only in the lab
  - Use an isolated network (Emulab, ModelNet)
  - Measure while PlanetLab load is low
  - Look only at median performance
- Our goal: make DHTs perform in the wild
  - Network not isolated, machines shared
  - Look at long-term 99th-percentile performance
  - (Caveat: no outright malicious behavior)

Why We Care
- The promise of P2P was to harness idle capacity
  - Not supposed to need dedicated machines
- Running the OpenDHT service on PlanetLab
  - No control over what else is running
  - Load can be really bad at times
- Up 24/7: have to weather good times and bad
  - Good median performance isn't good enough

Original OpenDHT Performance
- Long-term median get latency < 200 ms
- Matches the performance of DHash on PlanetLab
- Median RTT between hosts ~ 140 ms

Original OpenDHT Performance
- But 95th-percentile get latency is atrocious!
  - Generally measured in seconds
- And even the median spikes up from time to time

Talk Overview
- Introduction and Motivation
- How OpenDHT Works
- The Problem of Slow Nodes
- Algorithmic Solutions
- Experimental Results
- Related Work and Conclusions

OpenDHT Partitioning
- Assign each node an identifier from the key space
- Store a key-value pair (k,v) on the several nodes with IDs closest to k
- Call them the replicas for (k,v)
(Diagram: a node with id = 0xC9A1 is responsible for the keys nearest its identifier)
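A minimal sketch of replica selection under this scheme, assuming a 160-bit circular key space; the helper name and replication factor are illustrative:

```python
SPACE = 2**160  # size of the circular identifier space (assumption)

def replicas_for(key, node_ids, r=4):
    """Return the r nodes whose identifiers are closest to the key,
    measuring distance around the circular key space."""
    def dist(n):
        d = abs(n - key) % SPACE
        return min(d, SPACE - d)  # wrap-around distance
    return sorted(node_ids, key=dist)[:r]
```

A put for key k is stored on every node in `replicas_for(k, ...)`, so the pair survives as long as any one replica does.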

OpenDHT Graph Structure
- Overlay neighbors match prefixes of the local identifier
- Choose among nodes with the same matching prefix length by network latency
(Diagram: example nodes 0xC0, 0xED, 0x41, 0x84)
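This neighbor-selection rule (group candidates by how many identifier bits they share with the local node, then break ties by measured latency) might be sketched as follows; the 8-bit identifiers and the RTT table are hypothetical:

```python
def matching_prefix_len(a, b, bits=8):
    """Length of the common identifier prefix of a and b, MSB first."""
    for i in range(bits):
        mask = 1 << (bits - 1 - i)
        if (a & mask) != (b & mask):
            return i
    return bits

def pick_neighbor(local_id, candidates, rtt_ms, target_prefix):
    """Among candidates sharing exactly target_prefix leading bits
    with local_id, pick the one with the lowest measured RTT
    (proximity neighbor selection). rtt_ms is a hypothetical dict
    of latency measurements."""
    eligible = [c for c in candidates
                if matching_prefix_len(local_id, c) == target_prefix]
    return min(eligible, key=lambda c: rtt_ms[c], default=None)
```

Many identifiers share any given short prefix, so each routing-table slot has many valid candidates; picking the nearest one is what keeps per-hop latency low.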

Performing Gets in OpenDHT
- The client sends a get request to a gateway
- The gateway routes it along neighbor links to the first replica encountered
- The replica sends the response back directly over IP
(Diagram: client → gateway, get(0x6b) routed via nodes 0x41 and 0x6c, get response returned directly to the client)

Robustness Against Failure
- If a neighbor dies, a node routes through its next-best one
- If a replica dies, the remaining replicas create a new one to replace it
(Diagram: client and nodes 0xC0, 0x41, 0x6c)

The Problem of Slow Nodes
- What if a neighbor doesn't fail, but just slows down temporarily?
  - If it stays slow, the node will replace it
  - But it must adapt slowly, for stability
- Many sources of slowness are short-lived
  - A burst of network congestion causes packet loss
  - A user loads a huge Photoshop image, flushing the buffer cache
- In either case, gets will be delayed

Flavors of Slowness
- At first, slowness may be unexpected
  - May not notice until trying to route through a node
  - First few get requests are delayed
- Can keep a history of nodes' performance
  - Stops subsequent gets from suffering the same fate
  - Continue probing the slow node for recovery
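Keeping a per-node performance history could look like the following sketch, using an exponentially weighted moving average; the smoothing factor and default latency are illustrative, not OpenDHT's actual values:

```python
class ResponseHistory:
    """EWMA of each neighbor's observed response time, so that later
    gets can avoid a node that has recently been slow. alpha and the
    default latency are illustrative assumptions."""
    def __init__(self, alpha=0.2):
        self.alpha = alpha
        self.avg_ms = {}

    def observe(self, node, latency_ms):
        # Blend the new sample into the running average.
        prev = self.avg_ms.get(node, latency_ms)
        self.avg_ms[node] = (1 - self.alpha) * prev + self.alpha * latency_ms

    def expected(self, node, default_ms=50.0):
        # Nodes we have never measured get an optimistic default,
        # which is also why slow nodes must keep being probed.
        return self.avg_ms.get(node, default_ms)
```

The EWMA decays old observations gradually, matching the slide's point that adaptation must be slow enough for stability while still steering gets away from a currently slow node.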

Two Main Techniques
- Delay-aware routing
  - Guide routing not just by progress through the key space, but also by past responsiveness

Delay-Aware Routing
(Diagrams: a gateway choosing among next hops toward the replicas)
- All hops equally responsive (30 ms each): pick the best next hop by progress through the key space
- One hop slightly slower (50 ms vs. 30 ms): about the same, so still pick by progress
- One hop much slower (500 ms vs. 30 ms): the best next hop is now a faster neighbor
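The rule these diagrams illustrate, greedy by default but routing around markedly slow hops, can be sketched as follows; the 2x slowness threshold is an illustrative assumption, not OpenDHT's exact heuristic:

```python
def choose_next_hop(hops, slow_factor=2.0):
    """Delay-aware routing sketch. hops is a list of
    (key_space_progress, expected_delay_ms) tuples. Take the hop
    making the most progress unless its expected delay exceeds the
    fastest alternative by more than slow_factor."""
    best = max(hops, key=lambda h: h[0])      # greedy choice by progress
    fastest = min(hops, key=lambda h: h[1])   # lowest expected delay
    if best[1] > slow_factor * fastest[1]:
        return fastest                        # route around the slow hop
    return best
```

With comparable delays the greedy hop wins (fewer total hops), and only a large expected delay, learned from past responsiveness, justifies sacrificing progress.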

Two Main Techniques
- Delay-aware routing
  - Guide routing not just by progress through the key space, but also by past responsiveness
  - Cheap, but must first observe slowness
- Added parallelism
  - Send each request along multiple paths

Naïve Parallelism
(Diagram: gateway, replicas, and a sudden delay on the request path)

Multiple Gateways
- Only the client replicates requests
(Diagram: the client sends the same request through several gateways to the replicas; a sudden delay on one path does not hold up the others)

Iterative Routing
- The gateway maintains p concurrent RPCs
(Diagram: client, gateways, and replicas, with a sudden delay on one RPC)
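The parallelism idea, issuing the same get along several paths and taking the first answer, can be sketched with threads; `fetch` is a hypothetical callable performing one RPC, and p is the number of concurrent requests:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def parallel_get(replicas, fetch, p=3):
    """Issue the get along up to p paths at once and return the first
    successful response, so one suddenly slow path cannot delay the
    whole get. fetch(replica) is a hypothetical blocking RPC."""
    with ThreadPoolExecutor(max_workers=p) as pool:
        futures = [pool.submit(fetch, r) for r in replicas[:p]]
        for fut in as_completed(futures):
            result = fut.result()
            if result is not None:
                return result   # first answer wins
    return None
```

This masks unexpected slowness without waiting to observe it first, but every extra path costs messages and bytes, which is the trade-off the experiments quantify.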

Two Main Techniques
- Delay-aware routing
  - Guide routing not just by progress through the key space, but also by past responsiveness
  - Cheap, but must first observe slowness
- Added parallelism
  - Send each request along multiple paths
  - Expensive, but handles unexpected slowness

Experimental Setup
- Can't get reproducible numbers from PlanetLab
  - Both the available nodes and the load change hourly
  - But PlanetLab is the environment we care about
- Solution: run all experiments concurrently
  - Perform each get using every mode (in random order)
  - Look at results over long time scales: 6 days; over 27,000 samples per mode

Delay-Aware Routing

  Mode          Latency (ms)        Cost
                50th     99th       Msgs   Bytes
  Greedy        150      4400       5.5    1800
  Delay-Aware   100      1800       6.0    2000

- Latency drops by 30-60%
- Cost goes up by only ~10%

Multiple Gateways

  # of Gateways   Latency (ms)        Cost
                  50th     99th       Msgs   Bytes
  1               100      1800       6.0    2000
  2               70       610        12     4000
  3               57       440        17     5300

- Latency drops by a further 30-73%
- But cost doubles or worse

Iterative Routing

  # of Gateways   Mode              Latency (ms)      Cost
                                    50th    99th      Msgs   Bytes
  1               Recursive         100     1800      6.0    2000
  3               Recursive         57      440       17     5300
  1               3-way Iterative   120     790       15     3800
  2               3-way Iterative   76      360       27     6700

- Parallel iterative routing is not as cost effective as just using multiple gateways

Related Work
- Google MapReduce
  - Cluster owned by a single company
  - Could presumably make all nodes equal
  - Turns out it's cheaper to just work around the slow nodes instead
- Accordion
  - Another take on recursive parallel lookup
- Other related work in the paper

Conclusions
- Techniques for reducing get latency
  - Delay-aware routing is a clear win
  - Parallelism is very fast, but costly
  - Iterative routing is not cost effective
- OpenDHT get latency is now quite low
  - Was 150 ms at the median, 4+ seconds at the 99th percentile
  - Now under 100 ms at the median, 500 ms at the 99th
  - Faster than DNS [Jung et al. 2001]

Thanks! For more information: http://opendht.org/