Symphony. Symphony. Acknowledgement. DHTs: : The Big Picture. Spectrum of DHT Protocols. Distributed Hashing in a Small World

Similar documents
P2P Network Structured Networks: Distributed Hash Tables. Pedro García López Universitat Rovira I Virgili

Degree Optimal Deterministic Routing for P2P Systems

ReCord: A Distributed Hash Table with Recursive Structure

Scalability In Peer-to-Peer Systems. Presented by Stavros Nikolaou

A Scalable Content- Addressable Network

Architectures for Distributed Systems

Searching for Shared Resources: DHT in General

Searching for Shared Resources: DHT in General

An Adaptive Stabilization Framework for DHT

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Peer-to-Peer Systems and Distributed Hash Tables

Overlay Networks for Multimedia Contents Distribution

Motivation. The Impact of DHT Routing Geometry on Resilience and Proximity. Different components of analysis. Approach:Component-based analysis

The Impact of DHT Routing Geometry on Resilience and Proximity. Acknowledgement. Motivation

Improving the Dependability of Prefix-Based Routing in DHTs

Content Overlays. Nick Feamster CS 7260 March 12, 2007

*Adapted from slides provided by Stefan Götz and Klaus Wehrle (University of Tübingen)

Distriubted Hash Tables and Scalable Content Adressable Network (CAN)

Today. Why might P2P be a win? What is a Peer-to-Peer (P2P) system? Peer-to-Peer Systems and Distributed Hash Tables

Distributed hash table - Wikipedia, the free encyclopedia

Making Gnutella-like P2P Systems Scalable

CS 640 Introduction to Computer Networks. Today s lecture. What is P2P? Lecture30. Peer to peer applications

Kademlia: A P2P Informa2on System Based on the XOR Metric

LECT-05, S-1 FP2P, Javed I.

CSE 124 Finding objects in distributed systems: Distributed hash tables and consistent hashing. March 8, 2016 Prof. George Porter

Module SDS: Scalable Distributed Systems. Gabriel Antoniu, KERDATA & Davide Frey, ASAP INRIA

Relaxing Routing Table to Alleviate Dynamism in P2P Systems

Handling Churn in a DHT

PEER-TO-PEER NETWORKS, DHTS, AND CHORD

Distributed Hash Tables: Chord

P2P Network Structured Networks: Distributed Hash Tables. Pedro García López Universitat Rovira I Virgili

CIS 700/005 Networking Meets Databases

Effect of Links on DHT Routing Algorithms 1

L3S Research Center, University of Hannover

Distributed Hash Tables

CS514: Intermediate Course in Computer Systems

P2P: Distributed Hash Tables

Kademlia: A peer-to peer information system based on XOR. based on XOR Metric,by P. Maymounkov and D. Mazieres

08 Distributed Hash Tables

Scalable P2P architectures

Distributed Information Processing

Overlay and P2P Networks. Structured Networks and DHTs. Prof. Sasu Tarkoma

CRESCENDO GEORGE S. NOMIKOS. Advisor: Dr. George Xylomenos

Load Sharing in Peer-to-Peer Networks using Dynamic Replication

Unit 8 Peer-to-Peer Networking

Cycloid: A constant-degree and lookup-efficient P2P overlay network

Peer-to-peer computing research a fad?

Lecture 6: Overlay Networks. CS 598: Advanced Internetworking Matthew Caesar February 15, 2011

Advanced Distributed Systems. Peer to peer systems. Reference. Reference. What is P2P? Unstructured P2P Systems Structured P2P Systems

Coordinate-based Routing:

Distributed Hash Table

Overlay and P2P Networks. Structured Networks and DHTs. Prof. Sasu Tarkoma

Security for Structured Peer-to-peer Overlay Networks. Acknowledgement. Outline. By Miguel Castro et al. OSDI 02 Presented by Shiping Chen in IT818

Distributed Systems. 17. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2016

Structured Superpeers: Leveraging Heterogeneity to Provide Constant-Time Lookup

Chapter 10: Peer-to-Peer Systems

Peer-to-Peer Systems. Chapter General Characteristics

Chapter 6 PEER-TO-PEER COMPUTING

Telecommunication Services Engineering Lab. Roch H. Glitho

Peer-To-Peer Techniques

Dynamic Load Sharing in Peer-to-Peer Systems: When some Peers are more Equal than Others

March 10, Distributed Hash-based Lookup. for Peer-to-Peer Systems. Sandeep Shelke Shrirang Shirodkar MTech I CSE

Peer-to-Peer Internet Applications: A Review

Chord : A Scalable Peer-to-Peer Lookup Protocol for Internet Applications

Building a low-latency, proximity-aware DHT-based P2P network

Randomized Algorithms for Network Security and Peer-to-Peer Systems

The Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Presented By: Kamalakar Kambhatla

Purpose and security analysis of RASTER

DISTRIBUTED COMPUTER SYSTEMS ARCHITECTURES

Overlay networks. To do. Overlay networks. P2P evolution DHTs in general, Chord and Kademlia. Turtles all the way down. q q q

GNUnet Distributed Data Storage

Athens University of Economics and Business. Dept. of Informatics

Towards Benchmarking of P2P Technologies from a SCADA Systems Protection Perspective

Goals. EECS 122: Introduction to Computer Networks Overlay Networks and P2P Networks. Solution. Overlay Networks: Motivations.

ChordNet: A Chord-based self-organizing super-peer network

: Scalable Lookup

Lecture 8: Application Layer P2P Applications and DHTs

Content Overlays (continued) Nick Feamster CS 7260 March 26, 2007

Distributed Meta-data Servers: Architecture and Design. Sarah Sharafkandi David H.C. Du DISC

Resilient GIA. Keywords-component; GIA; peer to peer; Resilient; Unstructured; Voting Algorithm

Telematics Chapter 9: Peer-to-Peer Networks

Winter CS454/ Assignment 2 Instructor: Bernard Wong Due date: March 12 15, 2012 Group size: 2

Bayeux: An Architecture for Scalable and Fault Tolerant Wide area Data Dissemination

Introduction to Peer-to-Peer Systems

Kleinberg s Small-World Networks. Normalization constant have to be calculated:

Distributed Systems. 16. Distributed Lookup. Paul Krzyzanowski. Rutgers University. Fall 2017

CS 268: Lecture 22 DHT Applications

Overlay Networks. Behnam Momeni Computer Engineering Department Sharif University of Technology

Distributed Hash Tables

Peer to peer systems: An overview

Brocade: Landmark Routing on Peer to Peer Networks. Ling Huang, Ben Y. Zhao, Yitao Duan, Anthony Joseph, John Kubiatowicz

INF5070 media storage and distribution systems. to-peer Systems 10/

Introduction to P2P Computing

CompSci 356: Computer Network Architectures Lecture 21: Overlay Networks Chap 9.4. Xiaowei Yang

Chord: A Scalable Peer-to-peer Lookup Service For Internet Applications

Slides for Chapter 10: Peer-to-Peer Systems

Data-Centric Query in Sensor Networks

CS 268: DHTs. Page 1. How Did it Start? Model. Main Challenge. Napster. Other Challenges

Motivation for peer-to-peer

Chord. Advanced issues

Transcription:

Distributed Hashing in a Small World Gurmeet Singh Manku Stanford University with Mayank Bawa and Prabhakar Raghavan Acknowledgement The following slides are borrowed from the author s talk at USITS 2003. Presented By Xiaorong Zhou Goal: How can thousands of hosts cooperatively maintain a large hash table in a completely decentralized fashion? DHTs: : The Big Picture Spectrum of DHT Protocols Load Balance How do we splice the hash table evenly? Nodes choose their ID in the hash space uniformly at random. Deterministic Topology Protocol #links latency CAN O(log n) O(log n) Chord O(log n) O(log n) Caching, Hotspots, Fault Tolerance, Replication,... --- Topology Establishment How do we route with small state per node? Deterministic Randomized (CAN/Chord) (Pastry/ Tapestry) () Partly Randomized Topology Completely Randomized Topology Viceroy O(1) O(log n) Tapestry O(log n) O(log n) Pastry O(log n) O(log n) 2k+2 O((log 2 n)/k) 1

in a Nutshell Nodes arranged in a unit circle (perimeter = 1) Arrival --> Node chooses position along circle uniformly at random Each node has 1 short link (net node on circle) and k long links Adaptation of Small World Idea: [Kleinberg00] Long links chosen from a probability distribution function: p() = 1 / log n where n = #nodes. Network Size Estimation Protocol Problem: What is the current value of n, the total number of nodes? = Length of arc 1/ = Estimate of n (Idea from Viceroy)? Simple greedy routing: Forward along that link that minimizes the absolute distance to the destination. node long link short link A typical network Average lookup latency = O((log 2 n) / k) hops Fault Tolerance: No backups for long links! Only short links are fortified for fault tolerance. - 3 arcs are enough. - Re-linking Protocol not worthwhile. Intuition Behind s PDF Step 0: Probability Distribution Chord p() = 1 / ( log n) : Draw from the PDF log n times 2

Step 1: Step- Step 2: Divide PDF into log n Equal Bins p() = 1 / log n Step-: Draw from the discretized PDF log n times Step-Partitioned-: Draw eactly once from each of log n bins Step 3: Discrete PDF From Chord to p() = 1 / log n Chord: Draw eactly once from each of log n bins Each bin is essentially a point. 3

Two Optimizations Latency vs State Maintenance Bi-directional Routing - Eploit both outgoing and incoming links! - Route to the neighbor that minimizes absolute distance to destination - Reduces avg latency by 25-30% 1-Lookahead Average Latency 5 10 15 Viceroy CAN Pastry Chord Network size: n=2 15 nodes + Bidirectional Links + 1-Lookahead Tapestry X Pastry - List of neighbor s neighbors - Reduces avg latency by 40% 0 10 20 30 40 50 60 # TCP Connections Many more graphs in the paper. Why? Family of Harmonic Distributions 1. Low state maintenance Low degree --> Fewer pings/keep-alives, less control traffic Low degree --> Distributed locking and coordination overhead over smaller sets of nodes Low degree --> Smaller bootstrapping time when a node joins Smaller recovery time when a node leaves Increasing n 2. Fault tolerance Only short links are bolstered. No backups for long links! 3. Smooth out-degree vs latency tradeoff Only protocol that offers this tuning knob even at run time! Out-degree is not fied at runtime, or as a function of network size. 4. Fleibility and support for heterogeneity Different nodes can have different #links! Cumulative PDF: P() = log n / log n for in [1/n, 1] PDF: p() = 1 / log n for in [1/n, 1] 4

1. Estimation Protocol 2. Average Latency - # links ++ Avg Latency -- - Diminishing returns with increasing #links - Bidirectional better than Unidirectional 3. Latency distribution for n = 2^14 nodes 4. Latency with log2(n) links per node Vertical errorbars capture 99% of the distribution - #links ++ Avg latency -- - #links ++ Variance -- 5

5. Choice of s for Estimation Protocol 6. Cumulative #relinks for epanding network - No change in avg latency with increasing s - Vertical errorbars capture 99% of distribution - Eactly one node arrives per timestep 7a. Impact of 1-Lookahead 7b. Impact of 1-Lookahead - 1-Lookahead diminishes avg latency by roughly 40% 6

8. DYNAMIC Network of 100K nodes 9a. Fault tolerance - n = 2^16 nodes - Vertical errorbars capture 99% of the distribution - Avg livetime = 23.5 hrs. Avg sleeptime = 0.5 hrs. - 1 st day: linear increase. 2 nd day: steady. 3 rd day: linear USITS, decrease 10 March 2003 9b. Fault tolerance 10. Cost of Joining & Leaving - n = 2^16 nodes - Only short links need be fortified! - n = 2^16 nodes - Only short links need be fortified! 7

11. Bandwidth Profile 12. Comparison with Uniformly Random Links Resources The author s homepage http://www.cs.stanfod.edu/~manku Stanford Peers http://www-db.stanford.edu/peers/ Question? 8