An Adaptive Stabilization Framework for DHT


An Adaptive Stabilization Framework for DHT
Gabriel Ghinita, Yong Meng Teo
Department of Computer Science, National University of Singapore
{ghinitag,teoym}@comp.nus.edu.sg

Overview
- Background
- Related Work
- Adaptive Stabilization Framework
- Performance Evaluation
- Conclusion
- Q & A

Background

P2P Systems
- P2P characteristics
  - large-scale, highly dynamic environments
  - nodes organize into an overlay network
  - routing state: pointers to other peers
- Design challenges: scalability, self-organization, fault tolerance
- Structured P2P / Distributed Hash Tables (DHTs)
  - overlay organization follows strict rules
  - lookup process is deterministic, with an upper bound on lookup performance
  - applications: distributed storage, multicast, publish-subscribe

DHT Characteristics
- Virtualize nodes and data items into a common key space
  - each peer is assigned a subset of the key space
- Hash table interface: Get(key), Put(key, value)
  - key lookup underlies every DHT operation
- Bounded routing state and lookup complexity
  - O(log N) routing state / O(log N) lookup hops is the widely used compromise
- Implementations: Chord, CAN, Pastry, Tapestry, etc.

Churn
- Nodes join and leave/fail freely
  - routing state becomes inconsistent (routing constraints not satisfied)
  - failed lookup operations (incorrect/incomplete results)
  - increased lookup path length
  - disconnection
- Measurement
  - join rate (global): number of nodes joining per second
  - failure rate (global) or node lifetime (local)
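To make the key-space virtualization and the Get(key)/Put(key, value) interface on the "DHT Characteristics" slide concrete, here is a minimal consistent-hashing sketch (not from the talk; the class and node names are invented, and the ring is centralised purely for illustration):

```python
import hashlib
from bisect import bisect_left

M = 32                     # assumed identifier-space size: 2^M ids
SPACE = 2 ** M

def hash_id(value: str) -> int:
    """Map a node address or data key onto the m-bit identifier ring."""
    digest = hashlib.sha1(value.encode()).digest()
    return int.from_bytes(digest, "big") % SPACE

class TinyDHT:
    """Centralised stand-in for a DHT: Get/Put resolve keys to their successor node."""

    def __init__(self, node_addresses):
        self.ring = sorted(hash_id(a) for a in node_addresses)    # virtualized node ids
        self.store = {nid: {} for nid in self.ring}               # per-node key subset

    def _successor(self, key_id: int) -> int:
        """First node id clockwise from key_id (the peer responsible for the key)."""
        i = bisect_left(self.ring, key_id)
        return self.ring[i % len(self.ring)]

    def put(self, key: str, value) -> None:
        self.store[self._successor(hash_id(key))][key] = value

    def get(self, key: str):
        return self.store[self._successor(hash_id(key))].get(key)

dht = TinyDHT([f"node{i}" for i in range(8)])
dht.put("song.mp3", "peer-42")
print(dht.get("song.mp3"))   # -> peer-42
```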

Stabilization
- Execution of corrective routines
  - check whether a pointer is still alive (refresh)
  - check whether a pointer still satisfies its constraints (re-pin)
  - may generate considerable control traffic
- Performance metrics
  - lookup failure ratio: #failed_lookups / #total_lookups
  - average lookup path length
  - communication overhead: #stabilization_msg / #user_msg

Related Work

DHT Stabilization
- Periodic Stabilization (PS)
  - most widely used (Chord, Pastry, CAN, etc.)
  - ad-hoc stabilization rate, no lookup failure bounds
  - unsuitable for variable churn
- Correction-on-use / correction-on-change: limited to the DKS DHT [El-Ansary et al, 2003]
- Physics-style approach: Accordion [Li et al, 2005]
- Adaptive stabilization (as a concept) [Krishnamurthy et al, 2005] [Mahajan et al, 2003]

Motivation
- Periodic stabilization has limitations
  - stabilization interval is fixed at deployment
  - difficult to estimate the proper stabilization rate
  - unsuitable for variable churn
- Implications
  - poor overlay performance: disconnection, failed lookups, increased path length
  - excessive control message overhead

Chord
- Circular, one-dimensional, m-bit identifier space
- Routing state: three sections
  - successor
  - successor list
  - finger table: {f_i | f_i.id = succ(n.id + 2^(i-1))}
- Separate stabilization timer for each section
  - successor pointer checked most frequently
  - finger table updated least frequently
- Stabilization timer setting expressed as s/sl/f

PS Evaluation
- Static scenario (no churn)
  - S1 = 1/3/10, 1000 nodes, 0.33 lookups/sec/node
  - 450% message overhead
- Dynamic scenario (half-life)
  - 500-node perfect overlay, 500 new nodes join
  - 1000-node perfect overlay, 500 nodes fail
  - lookup rate 0.33/sec/node
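A small sketch of the finger-table targets that the stabilization timers on the Chord slide have to keep fresh, assuming the finger convention used elsewhere in the deck (fingers indexed from 1, the i-th finger at succ(n.id + 2^(i-1))); the ring contents are invented:

```python
# Finger targets for one Chord node on a tiny 2^8-id ring (values invented).
M = 8
SPACE = 2 ** M

def finger_starts(node_id: int):
    """The i-th finger of n targets succ(n.id + 2^(i-1)), i = 1..m."""
    return [(i, (node_id + 2 ** (i - 1)) % SPACE) for i in range(1, M + 1)]

def successor(ring, key_id: int) -> int:
    """First node on the sorted ring whose id follows key_id clockwise."""
    return next((n for n in ring if n >= key_id), ring[0])

ring = sorted([3, 20, 45, 87, 130, 200, 251])
node = 20
fingers = [(i, successor(ring, start)) for i, start in finger_starts(node)]
print(fingers)   # each (i, f_i) is a pointer the stabilization timers must keep accurate
```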

Half-life Scenarios
- Timer settings compared: S1 = 1/3/10, S2 = 3/5/20, S3 = 5/10/30

Adaptive Stabilization Framework

Objective
- Stabilization rate adapted to a changing environment
- Design goals
  - decentralization: autonomous decision to stabilize
  - efficiency: maintain nominal network performance
  - low cost: minimize message overhead
- Stabilization rate adjusted based on
  - local estimation of the churn rate
  - local estimation of the overlay size
  - forwarding probability of each pointer
- DHT-flavor independent

Adaptive Stabilization Framework
- Input
  - observations on churn, lookup workload, and overlay size
  - target QoS (lookup failure, lookup path length)
- Output
  - stabilization decision

AS Framework Overview
- Estimate churn and lookup workload
  - updated on both external events and an internal timer
- System model analysis
  - predict system behavior
- Stabilization decision
  - in agreement with the system model
  - based on the QoS requirements (maximum allowed lookup failure)

Advantages
- Informed stabilization decision, as opposed to the ad-hoc decision in PS
- Adapts to changing network conditions
- Stabilization decision correlated with QoS
- Tight stabilization control on a per-pointer basis
- DHT-protocol independent, to a large extent

Liveness and Accuracy
- Distinguish between two concepts
  - liveness: is a pointer in the routing table still alive?
  - accuracy: is a pointer in the routing table still accurate?
- Liveness check: P_tout; accuracy check: P_inacc
- Cost
  - liveness: O(1), the probe message travels one overlay hop
  - accuracy: O(log N), the lookup message travels log N hops

Overlay Size Estimation
- Based on the density of nodes in the identifier space
  - assumes node IDs are evenly distributed
  - assumption holds if consistent hashing is used
- Nodes know the IDs of their P predecessors and S successors
- Size estimate:

  Size = Identifier_Space_Size * (P + S) / (ID_last_succ - ID_first_pred)

Overlay Size Estimation (2)
- Example: P = 2, S = 4
- Expected distance between nodes: d = (ID_last_succ - ID_first_pred) / (P + S)
- Overlay size:

  Size = Identifier_Space_Size / d = Identifier_Space_Size * (P + S) / (ID_last_succ - ID_first_pred)

- [Figure: current node with its P-entry predecessor list and S-entry successor list on the identifier circle, spanning ID_first_pred to ID_last_succ]
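The density-based size estimate above, as a short sketch (an m = 32 bit ring and the ID values are assumed for illustration):

```python
SPACE = 2 ** 32   # Identifier_Space_Size, assuming an m = 32 bit ring

def estimate_overlay_size(first_pred_id: int, last_succ_id: int, P: int, S: int) -> float:
    """Size ~ Identifier_Space_Size * (P + S) / (ID_last_succ - ID_first_pred)."""
    span = (last_succ_id - first_pred_id) % SPACE    # distance on the circle, with wrap-around
    return SPACE * (P + S) / span

# 2 predecessors and 4 successors spread over ~0.6% of the ring:
print(round(estimate_overlay_size(100, 25_000_000, P=2, S=4)))   # ~1031 nodes
```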

Churn Rate Estimation
- Stabilization rate is adjusted in response to the churn rate
  - intuitively: high churn rate, faster stabilization; low churn rate, slower stabilization
- Each node timestamps its routing-table entries
  - T_s^p: time pointer p was last known to be alive
  - T_join^p: time pointer p joined the network
- Global churn rate computation factors in the overlay size
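The talk does not spell out the churn-estimation formula, so the following is only a plausible sketch in its spirit: each routing-table entry carries the T_join and T_s timestamps from the slide, observed lifetimes of failed pointers give a mean-lifetime estimate, and the overlay-size estimate scales it to a global failure rate. All class and method names are invented:

```python
import time

class PointerRecord:
    """Routing-table entry with the two timestamps from the slide."""
    def __init__(self, node_id: int, t_join: float):
        self.node_id = node_id
        self.t_join = t_join          # T_join^p: when the pointed-to node joined
        self.t_seen = time.time()     # T_s^p: last time p was known to be alive

class ChurnEstimator:
    """Turns observed lifetimes of failed pointers into a global failure-rate estimate."""
    def __init__(self):
        self.observed_lifetimes = []

    def record_failure(self, rec: PointerRecord, t_fail: float) -> None:
        self.observed_lifetimes.append(t_fail - rec.t_join)

    def mean_lifetime(self) -> float:
        if not self.observed_lifetimes:
            return float("inf")       # no failures observed yet
        return sum(self.observed_lifetimes) / len(self.observed_lifetimes)

    def global_failure_rate(self, overlay_size_estimate: float) -> float:
        # N nodes, each failing on average once per mean lifetime.
        return overlay_size_estimate / self.mean_lifetime()
```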

Liveness Check Analysis
- Each node performs the analysis locally
  - each pointer is considered separately
  - assume lookup destinations are uniformly distributed
- Determine the forwarding probability of each pointer
  - factors in the relative importance of each pointer
- For pointer p:

  P_tout^p = P_fwd^p * P_dead^p

Formulation of P_dead
- Depends only on the node join/failure workload
  - independent of DHT flavor, routing algorithm, etc.
- Assume an exponential distribution of node lifetime
  - other distributions easily supported

  P_dead^p(t) = 1 - e^(-μ (t - T_s^p))

  where μ = estimated node failure rate (inverse of the average node lifetime),
  t = current time, T_s^p = time when p was last checked alive

Finger Forwarding Probability
- DHT-flavor dependent
- Chord: hops are not exact powers of 2; bias on average B = 2^(m-1) / N
- P_fwd varies with the finger index
  - low for close-by fingers
  - highest around i = log B
  - saturates for high indices (successor pointer)
- [Figure: node n with fingers F_(i-1), F_i, F_(i+1) around the id(n) + 2^(i-1) mark; F_i = i-th finger of n]
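A sketch of the liveness-check quantities from the slides above: P_dead under the exponential lifetime model, combined with a per-finger forwarding probability into P_tout = P_fwd * P_dead. The numeric values (μ, P_fwd) are placeholders; the talk measures P_fwd per finger index:

```python
import math

def p_dead(t_now: float, t_last_seen: float, mu: float) -> float:
    """P_dead(t) = 1 - exp(-mu * (t - T_s)); mu is the failure rate (1 / mean lifetime)."""
    return 1.0 - math.exp(-mu * (t_now - t_last_seen))

def p_timeout(p_fwd: float, p_dead_value: float) -> float:
    """P_tout = P_fwd * P_dead: probability a lookup is forwarded into a dead pointer."""
    return p_fwd * p_dead_value

mu = 1.0 / 600.0                                  # assumed mean node lifetime: 10 minutes
pd = p_dead(t_now=120.0, t_last_seen=0.0, mu=mu)  # pointer last seen alive 2 minutes ago
print(p_timeout(p_fwd=0.15, p_dead_value=pd))     # small -> this finger's check can wait
```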

Formulation of P_fwd
- [Figure: measured forwarding probability for Chord fingers, by finger index]

Accuracy Check Analysis
- Each node performs the analysis locally
  - each pointer is considered separately
  - DHT-flavor dependent
  - assume joining nodes' IDs are uniformly distributed
- Analysis for each finger
  - probability of a node join that affects the DHT constraints
  - estimate the gain in correctness from a better pointer
  - it might not be worth it to re-pin the finger

Dealing with Joins
- P_inacc^i for finger f_i of node n
  - same as P[join in interval Δ], based on the estimated join rate
- If P_inacc^i is low
  - a join in Δ is unlikely
  - the performance gain after a re-pin is low
- [Figure: node n with fingers f_(i-1), f_i, f_(i+1); Δ is the interval between the id(n) + 2^(i-1) mark and the current finger f_i; f_i = i-th finger of n]

Formulation of P_inacc
- P_inacc is affected only by the node join rate
  - NOT affected by the node failure rate

  P_inacc^i = λ (t - T_pin^i) * (f_i.id - (n.id + 2^(i-1))) / 2^m

  where λ = estimated arrival rate of new nodes, t = current time,
  T_pin^i = time when the i-th finger was last pinned (looked up)
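A sketch of the P_inacc formula as reconstructed above: the expected number of arrivals since the last re-pin, restricted to the identifier gap Δ between the 2^(i-1) mark and the current i-th finger. All numeric inputs are invented:

```python
SPACE = 2 ** 32   # 2^m, assuming an m = 32 bit ring

def p_inacc(finger_id: int, node_id: int, i: int,
            t_now: float, t_last_pin: float, join_rate: float) -> float:
    """P_inacc^i ~ lambda * (t - T_pin^i) * (f_i.id - (n.id + 2^(i-1))) / 2^m."""
    gap = (finger_id - (node_id + 2 ** (i - 1))) % SPACE   # interval Delta on the circle
    return join_rate * (t_now - t_last_pin) * gap / SPACE

# Expected joins inside Delta since the 18th finger was last pinned:
print(p_inacc(finger_id=900_000, node_id=100, i=18,
              t_now=300.0, t_last_pin=0.0, join_rate=2.0))   # ~0.11: re-pin not yet worth O(log N)
```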

Stabilization Decision
- Factors to consider
  - relative importance of the different pointers
  - upper and lower bounds on the stabilization interval
  - relative impact of the different event types: join / fail
- Evaluate the probability of finding node p alive at time t
  - the last stabilization is the origin of the time axis
  - stabilize at τ such that P[p dead before τ] < threshold

Example
- Lifetime model: exponential distribution with rate μ (e.g., mean lifetime 1/μ = 10 sec)
- F_p = moment when the node pointed to by the current pointer p fails

  P[F_p < τ] = 1 - e^(-μτ)

- Bound for a 10% failure ratio: P[F_p < τ] < 0.1, giving

  τ_0 = -(1/μ) ln 0.9

- [Figure: P[F_p < τ] rising from 0 toward 1; the 0.1 threshold is reached at τ_0]
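The decision rule in the example reduces to solving 1 - e^(-μτ) < threshold for the latest admissible check time; a tiny sketch, with μ interpreted as a failure rate and the 10% threshold taken from the slide:

```python
import math

def stabilization_deadline(mu: float, threshold: float) -> float:
    """Latest tau with P[F_p < tau] = 1 - exp(-mu*tau) < threshold: tau_0 = -(1/mu) ln(1 - threshold)."""
    return -math.log(1.0 - threshold) / mu

mu = 0.1                                    # assumed failure rate; mean lifetime 1/mu = 10 time units
print(stabilization_deadline(mu, 0.10))     # tau_0 = -(1/0.1) * ln(0.9) ~ 1.05: check p before then
```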

Setting Thresholds
- Desired lookup failure ratio F_1
  - average path length is log(N)/2
  - P_tout < F_1 * 2 / log(N)
- Desired disconnection probability F_2
  - O(log N) successors
  - P_tout < F_2^(1 / O(log N))

Performance Evaluation
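Before the evaluation, a concrete reading of the threshold formulas on the "Setting Thresholds" slide, as a sketch assuming a log2(N)/2 average path and a log2(N)-entry successor list (the exponent form of the second bound follows the reconstruction above):

```python
import math

def finger_timeout_threshold(f1: float, n_estimate: float) -> float:
    """Lookup-failure target F1 over ~log2(N)/2 hops  ->  P_tout < 2 * F1 / log2(N)."""
    return 2.0 * f1 / math.log2(n_estimate)

def successor_timeout_threshold(f2: float, n_estimate: float) -> float:
    """Disconnection target F2 with ~log2(N) successors  ->  P_tout < F2^(1/log2(N))."""
    return f2 ** (1.0 / math.log2(n_estimate))

N = 1000                                        # local overlay-size estimate
print(finger_timeout_threshold(0.03, N))        # ~0.006: tight per-hop budget for fingers
print(successor_timeout_threshold(0.01, N))     # ~0.63: the successor list tolerates far more
```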

Experimental Settings
- AS prototype implemented on top of p2psim
- Constant churn rate
  - three join/failure rates: 1, 2 and 5/sec
  - target lookup failure set at 3%
- Variable churn rate
  - two steady states with low/moderate churn: 500 and 2500 nodes respectively
  - two periods of peak churn, with considerable variation in size
  - target lookup failure set at 2%

Constant Churn Rate
- [Results figure]

Variable Churn Rate
- [Results figure]

Performance Evaluation
- AS outperforms PS on all accounts
  - lookup performance
  - stabilization cost
- AS is superior in all test scenarios
  - constant churn rate
  - variable churn rate
- AS shows good reaction to changes in the churn rate
- AS achieves the target QoS
- AS has a superior performance-cost tradeoff
  - safeguards against extreme scenarios

Conclusions
- We propose an adaptive stabilization framework for DHTs
  - identify the fundamental principles behind DHT stabilization
  - devise mechanisms to estimate the dynamism of the environment
  - devise a QoS-driven decision-making mechanism
  - DHT-independent to a considerable extent
- Future work
  - extend the framework to suit other churn models
  - factor in the lookup workload distribution
  - relax assumptions to generalize the model's applicability
  - employ machine-learning elements to learn from history
  - make the tuning of system parameters more robust

Q & A