Chord: A Scalable Peer-to-peer Lookup Service For Internet Applications

Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan
Presented by Jibin Yuan

ION STOICA: Professor of CS at UC Berkeley; leads RISELab; known for Chord; co-author of Spark and Mesos; co-founder of Databricks and Conviva Networks.

OUTLINE: Motivation, Chord, Experiment, Discussion.

MOTIVATION: In peer-to-peer (P2P) systems, data is stored at the nodes themselves, and location/routing is hard to scale in pure P2P designs. How do we efficiently locate the node that stores a particular data item?

CHORD
Base protocol: consistent hashing, key search, node join
Concurrent operations: stabilization, failures and replication
Model: load balance, decentralization, scalability, availability, flexible naming

CHORD BASE PROTOCOL: Consistent Hashing
m-bit identifier space for both keys and nodes.
key identifier = SHA-1(key), e.g. key = "oathkeeper" -> ID = 60
node identifier = SHA-1(IP address), e.g. IP = 168.10.10.15 -> ID = 127
Both identifiers are uniformly distributed.
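
To make the hashing step concrete, here is a minimal Python sketch that truncates SHA-1 output to an m-bit identifier. The helper name chord_id is an assumption of this example, and the concrete IDs 60 and 127 on the slide are illustrative values that this code will not reproduce exactly.

```python
# Sketch only: derive m-bit Chord identifiers by truncating SHA-1 (assumed helper).
import hashlib

M = 8  # identifier space of 2**M positions, matching the slides' m = 8 example


def chord_id(value: str, m: int = M) -> int:
    """Hash an arbitrary string (a key or an IP address) onto the m-bit ring."""
    digest = hashlib.sha1(value.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


key_id = chord_id("oathkeeper")      # key identifier
node_id = chord_id("168.10.10.15")   # node identifier
print(key_id, node_id)               # both fall roughly uniformly in [0, 2**M)
```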

CHORD BASE PROTOCOL: Consistent Hashing
[Figure: identifier circle from 0 to 2^m - 1, with key K60 and node N127 placed on the ring.]

CHORD BASE PROTOCOL: Consistent Hashing
Example with m = 8: nodes N1, N127, N180 on the ring. Each key is stored at the first node whose identifier is equal to or follows it: K35 and K60 (interval (1, 127]) map to N127, K220 (interval (180, 1]) maps to N1, and interval (127, 180] belongs to N180.

CHORD BASE PROTOCOL: Key Search
Successor: the next node (clockwise) on the identifier circle. [Figure: N127 is the successor of N1.]

CHORD BASE PROTOCOL: Key Search
Predecessor: the previous node (counter-clockwise) on the identifier circle. [Figure: N1 is the predecessor of N127.]

CHORD BASE PROTOCOL: Key Search
With successor pointers only, each node knows just its successor (N1 -> N127, N127 -> N180, N180 -> N1), and a lookup walks clockwise around the circle until it reaches the key's successor (K35 and K60 -> N127; K220 -> N1).
Time: Θ(n), Space: Θ(1).
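
A minimal sketch of this successor-only lookup on the slide's three-node ring; the dictionary representation and helper names are assumptions of the example.

```python
# Sketch only: lookup with successor pointers alone (Theta(n) hops, Theta(1) state).
successor = {1: 127, 127: 180, 180: 1}   # N1 -> N127 -> N180 -> N1


def in_interval(x: int, start: int, end: int, m: int = 8) -> bool:
    """True if x lies in the circular interval (start, end] on the 2**m ring."""
    x, start, end = x % 2**m, start % 2**m, end % 2**m
    if start < end:
        return start < x <= end
    return x > start or x <= end   # interval wraps past 0


def find_successor(start_node: int, key: int) -> int:
    """Walk the ring one successor at a time until the key's owner is reached."""
    n = start_node
    while not in_interval(key, n, successor[n]):
        n = successor[n]
    return successor[n]


print(find_successor(1, 220))   # -> 1, K220 is stored at N1
print(find_successor(1, 60))    # -> 127, K60 is stored at N127
```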

CHORD BASE PROTOCOL: Key Search
Each node keeps a finger table. For node n with m-bit identifiers:
  finger[k].start    = (n + 2^(k-1)) mod 2^m,  1 <= k <= m
  finger[k].interval = [finger[k].start, finger[k+1].start)
  finger[k].node     = first node >= finger[k].start on the circle
  successor          = the next node on the identifier circle = finger[1].node

CHORD BASE PROTOCOL: Key Search
Example: finger table of N1 (m = 8, nodes N1, N127, N180, keys K35, K60, K220):
  start  interval    node
  2      [2, 3)      127
  3      [3, 5)      127
  5      [5, 9)      127
  9      [9, 17)     127
  17     [17, 33)    127
  33     [33, 65)    127
  65     [65, 129)   127
  129    [129, 1)    180
Time: Θ(log n), Space: Θ(log n).
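
A minimal sketch of the Θ(log n) lookup using finger tables, loosely following the paper's find_successor / closest_preceding_finger structure; the Node class, the between helper, and the way the three-node ring is wired up are assumptions made to keep the example self-contained (fingers are 0-indexed here, so fingers[k] covers id + 2**k).

```python
# Sketch only: finger-table lookup on the slides' three-node ring (N1, N127, N180).
M = 8


def between(x, a, b, inc_right=False):
    """x in the circular interval (a, b) -- or (a, b] if inc_right -- on the 2**M ring."""
    x, a, b = x % 2**M, a % 2**M, b % 2**M
    if a < b:
        return a < x < b or (inc_right and x == b)
    return x > a or x < b or (inc_right and x == b)


def first_node_at_or_after(start, ids):
    """First node identifier clockwise from start (the successor of start)."""
    return min(ids, key=lambda i: (i - start) % 2**M)


class Node:
    def __init__(self, ident):
        self.id = ident
        self.fingers = []                      # fingers[k] covers (id + 2**k)

    def find_successor(self, key):
        return self.find_predecessor(key).fingers[0]   # fingers[0] is the successor

    def find_predecessor(self, key):
        n = self
        while not between(key, n.id, n.fingers[0].id, inc_right=True):
            n = n.closest_preceding_finger(key)
        return n

    def closest_preceding_finger(self, key):
        for f in reversed(self.fingers):       # scan fingers from farthest to nearest
            if between(f.id, self.id, key):
                return f
        return self


nodes = {i: Node(i) for i in (1, 127, 180)}
ids = sorted(nodes)
for n in nodes.values():                       # build each finger table
    n.fingers = [nodes[first_node_at_or_after((n.id + 2**k) % 2**M, ids)]
                 for k in range(M)]

print(nodes[1].find_successor(60).id)          # -> 127
print(nodes[1].find_successor(220).id)         # -> 1
```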

CHORD BASE PROTOCOL: Node Join
When a node n joins the network:
1. Initialize the predecessor and fingers of node n
2. Update the existing nodes
3. Transfer state from the successor to the new node

CHORD BASE PROTOCOL: Node Join, step 1
Locate any node p already in the ring, ask p to look up the fingers of the new node N79, and return the results to N79. Resulting finger table of N79:
  start  interval      node
  80     [80, 81)      127
  81     [81, 83)      127
  83     [83, 87)      127
  87     [87, 95)      127
  95     [95, 111)     127
  111    [111, 143)    127
  143    [143, 207)    180
  207    [207, 80)     1
[Figure: ring with N1 (succ: N127), N127 (pred: N1), N180, and the new node N79.]

CHORD BASE PROTOCOL: Node Join, step 2
The new node calls an update function on existing nodes; existing nodes can recursively update the fingers of other nodes. After N79 joins, N1's successor changes from N127 to N79, N127's predecessor changes from N1 to N79, and N1's finger table is updated:
  start  interval    node (before -> after)
  2      [2, 3)      127 -> 79
  3      [3, 5)      127 -> 79
  5      [5, 9)      127 -> 79
  9      [9, 17)     127 -> 79
  17     [17, 33)    127 -> 79
  33     [33, 65)    127 -> 79
  65     [65, 129)   127 -> 79
  129    [129, 1)    180 -> 180
N79's finger table from step 1 is unchanged.

CHORD BASE PROTOCOL: Node Join, step 3
Transfer state (e.g. keys) from the successor node to the new node: keys K35 and K60 move from N127 to N79.
Time: Θ(log² n); after optimization: Θ(log n).
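
A minimal sketch of the key hand-off in step 3: keys in (predecessor, new node] move from the successor's store to the new node. The plain-dict store and helper names are assumptions of the example.

```python
# Sketch only: join step 3, moving keys in (predecessor, new node] off the successor.
M = 8


def in_half_open(x, a, b):
    """x in the circular interval (a, b] on the 2**M ring."""
    x, a, b = x % 2**M, a % 2**M, b % 2**M
    if a < b:
        return a < x <= b
    return x > a or x <= b


def transfer_keys(successor_store, new_node_id, predecessor_id):
    """Split the successor's store; return (keys it keeps, keys handed to the new node)."""
    moved = {k: v for k, v in successor_store.items()
             if in_half_open(k, predecessor_id, new_node_id)}
    kept = {k: v for k, v in successor_store.items() if k not in moved}
    return kept, moved


# N79 joins between N1 and N127; N127 currently stores K35 and K60.
kept, moved = transfer_keys({35: "v35", 60: "v60"}, new_node_id=79, predecessor_id=1)
print(sorted(moved), sorted(kept))   # [35, 60] move to N79; N127 keeps []
```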

CHORD CONCURRENT OPERATIONS AND FAILURES: Stabilization
Handles concurrent joins. Keeping successor pointers up to date is sufficient to guarantee correctness of lookups. Three possible states:
Everything is up to date -> ok
Successor is up to date -> ok
Nothing is up to date -> retry after a pause

CHORD CONCURRENT OPERATIONS AND FAILURES: Stabilization
Node join (concurrent, lazy update):
1. Initialize only the finger to the successor node
2. Periodically verify the immediate successor and predecessor
3. Periodically refresh finger table entries
Node join (base protocol, aggressive update):
1. Initialize the predecessor and fingers of the node
2. Update the existing nodes
3. Transfer state from the successor to the new node
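
A minimal sketch of the lazy join repaired by periodic stabilization, loosely following the stabilize/notify idea from the paper; the single-process Node objects, the successor-only find_successor, and the fixed number of demo rounds are assumptions made purely for illustration.

```python
# Sketch only: lazy join plus periodic stabilization (stabilize/notify).
M = 8


def between(x, a, b):
    """x in the open circular interval (a, b) on the 2**M ring."""
    x, a, b = x % 2**M, a % 2**M, b % 2**M
    if a < b:
        return a < x < b
    return x > a or x < b          # wrapping interval; a == b means "anything but a"


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self       # a lone node is its own successor
        self.predecessor = None

    def join(self, known):
        """Lazy join: learn only a successor, let stabilization fix the rest."""
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def find_successor(self, key):
        n = self                     # successor-only walk keeps the sketch short
        while not (between(key, n.id, n.successor.id) or key == n.successor.id):
            n = n.successor
        return n.successor

    def stabilize(self):
        """Periodically verify the immediate successor and notify it."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """candidate thinks it might be our predecessor."""
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate


# N1 starts alone; N127 and N79 join concurrently; a few rounds repair all pointers.
n1, n127, n79 = Node(1), Node(127), Node(79)
n127.join(n1)
n79.join(n1)
for _ in range(5):
    for n in (n1, n127, n79):
        n.stabilize()
print(n1.successor.id, n79.successor.id, n127.successor.id)   # 79 127 1
```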

CHORD CONCURRENT OPERATIONS AND FAILURES: Failures and Replication
Node failures can cause incorrect lookups: if N127 no longer knows its correct successor, a Lookup(130) issued there fails. Correct successor pointers are enough for correctness.
[Figure: ring with N1, N127, N140, N160, N180; Lookup(130) from N127 fails after failures.]

CHORD CONCURRENT OPERATIONS AND FAILURES: Failures and Replication
Use a successor list: each node knows its r immediate successors, so after a failure it still knows the first live successor. Correct successors guarantee correct lookups. The guarantee is probabilistic, and r can be chosen to make the probability of lookup failure arbitrarily small.
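
A minimal sketch of the fallback logic: walk the r-entry successor list and use the first live entry. The is_alive callback and list layout are assumptions of the example; if each successor fails independently with probability p, all r fail with probability roughly p^r, which is why r can be tuned to make lookup failure arbitrarily unlikely.

```python
# Sketch only: fall back to the first live entry of an r-entry successor list.
from typing import Callable, List


def first_live_successor(successor_list: List[int],
                         is_alive: Callable[[int], bool]) -> int:
    """Return the first reachable node in the successor list."""
    for node in successor_list:
        if is_alive(node):
            return node
    raise RuntimeError("all r successors failed; the lookup cannot proceed")


# N127 keeps r = 3 successors; suppose N140 and N160 have failed.
alive = {180, 1}
print(first_live_successor([140, 160, 180], lambda n: n in alive))   # -> 180
```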

CHORD MODEL
Addressing these problems: load balance, decentralization, scalability, availability, flexible naming.

EXPERIMENT
Setup: simulator, iterative lookups (vs. recursive).
Experiments: load balance, path length, simultaneous node failures, lookups during stabilization, latency measurements.

EXPERIMENT: LOAD BALANCE
Reason for the variance: node identifiers do not uniformly cover the entire identifier space.

EXPERIMENT: LOAD BALANCE
Solution for the variance: associate keys with virtual nodes and map multiple virtual nodes to each real node (as in the consistent hashing paper).
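
A minimal sketch of the virtual-node idea: each real node is hashed onto the ring at several virtual positions, which typically evens out the share of the identifier space each real node owns. The chord_id helper, the node addresses, and the values of v are assumptions of the example.

```python
# Sketch only: virtual nodes smooth each real node's share of the identifier circle.
import hashlib
from collections import Counter

M = 32


def chord_id(value: str) -> int:
    return int.from_bytes(hashlib.sha1(value.encode()).digest(), "big") % (2 ** M)


def virtual_ring(real_nodes, v):
    """Place v virtual identifiers per real node on the identifier circle."""
    return sorted((chord_id(f"{node}#{i}"), node)
                  for node in real_nodes for i in range(v))


def owned_fraction(ring):
    """Fraction of the circle owned by each real node (arc ending at each point)."""
    arcs = Counter()
    for (prev_id, _), (cur_id, owner) in zip(ring[-1:] + ring[:-1], ring):
        arcs[owner] += (cur_id - prev_id) % (2 ** M)
    return {node: arc / 2 ** M for node, arc in arcs.items()}


nodes = [f"10.0.0.{i}" for i in range(1, 11)]
for v in (1, 8, 64):
    shares = owned_fraction(virtual_ring(nodes, v))
    print(v, round(max(shares.values()) / min(shares.values()), 1))   # imbalance ratio
```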

EXPERIMENT: PATH LENGTH
N = 2^k nodes, 100 · 2^k keys, 3 ≤ k ≤ 14. Path length is about (1/2) log2 N.
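
As a quick arithmetic illustration of that rule of thumb, here are the expected hop counts for a few of the simulated network sizes (the chosen k values are just examples from the slide's range):

```python
# Quick illustration: expected hops ~ (1/2) * log2(N) for N = 2**k nodes.
import math

for k in (3, 8, 14):
    n = 2 ** k
    print(f"N = 2**{k:>2} = {n:>6} nodes -> about {0.5 * math.log2(n):.1f} hops")
```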

EXPERIMENT: SIMULTANEOUS NODE FAILURES
Definition of a correct lookup: it finds the node that was originally responsible for the key, before the failures.
Worry: a network partition (did not happen in the experiments).
Result: lookup failure rate ≈ node failure rate (the fraction of failed nodes).
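
A minimal simulation sketch of why this holds: a key's lookup is incorrect exactly when the node responsible for it is among the failed ones, so the expected lookup failure rate equals the node failure rate. The ring size, seed, and counts below are arbitrary assumptions of the example.

```python
# Sketch only: fraction of keys whose responsible node failed ~ fraction of failed nodes.
import random
from bisect import bisect_left

random.seed(0)
RING = 2 ** 32
nodes = sorted(random.randrange(RING) for _ in range(1000))
keys = [random.randrange(RING) for _ in range(100_000)]
failed = set(random.sample(nodes, 200))            # 20% of the nodes fail


def successor(key):
    """First node clockwise from key (wrapping around the ring)."""
    i = bisect_left(nodes, key)
    return nodes[i % len(nodes)]


lost = sum(successor(k) in failed for k in keys) / len(keys)
print(f"node failure rate 0.20, lookup failure rate ~ {lost:.2f}")
```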

EXPERIMENT: LOOKUPS DURING STABILIZATION
Setup: 500 nodes initially; stabilize() called every 30 s on average; one lookup per second (Poisson); x failures/joins per second (Poisson).

EXPERIMENT: LOOKUPS DURING STABILIZATION
Definition of a correct lookup: it finds the current successor of the desired key.
From probability: if a fraction p of the nodes fails between stabilizations, the expected lookup failure rate is about p.
Why the observed rate is worse: it may take more than one stabilization round to clear out a failed node.

EXPERIMENT: LATENCY MEASUREMENTS
Setup: 10 sites, SHA-1, TCP, iterative lookups.
Result: lookup latency grows slowly with the total number of nodes.

DISCUSSION: Future Work
No mechanism yet to heal partitioned rings; handling malicious participants that present an incorrect view of the ring; decreasing lookup latency.

DISCUSSION: Strengths and Limitations
Strengths: load balance, decentralization, scalability, availability; grounded in theoretical work (consistent hashing); proven performance in many different respects; simplicity and proven correctness.
Limitations: load balancing does not address hot spots; no explicit mechanism for voluntary node departures; state must be transferred between nodes on joins and failures.

Q/A