CS 347 Parallel and Distributed Data Processing
Spring 2016
Notes 9: Peer-to-Peer Systems

Previous Topics
  Topics: data, database design, queries, query processing, localization, operators, optimization, transactions, concurrency control, reliability, replication
  Setting: client-server architecture, relational data, good understanding of what the data is and where the data is

Next Topic
  Client-server architecture
  Relational data
  Good understanding of
    What the data is
    Where the data is?

Peer-to-Peer Systems
  [Figure: timeline of peer-to-peer systems: 1999, 2000, 2001, 2003, 2006, 2009]

Peer-to-Peer Systems
  Distributed applications where nodes are
    Autonomous
    Very loosely coupled
    Equal in role or functionality
    Sharing and exchanging resources with each other

Peer-to-Peer Systems
  Related concepts
    File sharing: P2P is one option
    Grid computing: focus is on computing
    Autonomic computing: focus is on self-management

Peer-to-Peer Systems
  Search: the essential problem to solve
    Node i holds resources R_i,1, R_i,2, ...
    1. Query: a node asks "Who has X?"
    2. Answers: nodes that have X reply
    3. Request resource: the querying node asks one of them for X
    4. Provide resource: that node sends X

Distributed Lookup
  Have <k, v> pairs; each of n nodes holds some pairs
  Given key k, find matching values {v_1, v_2, ...}
  Example pairs (k, v) spread over the nodes:
    (1, a) (1, b) (4, a) (7, c) (3, a) (1, a) (4, d)
  lookup(4) = {a, d}
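
The lookup problem above can be sketched in a few lines of Python. This is only an illustration: the slide does not say which node holds which pair, so the placement in `nodes` and the node names are assumptions, and the lookup simply asks every node.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical placement of the slide's <k, v> pairs on three nodes.
nodes: Dict[str, List[Tuple[int, str]]] = {
    "node1": [(1, "a"), (1, "b"), (4, "a")],
    "node2": [(7, "c"), (3, "a")],
    "node3": [(1, "a"), (4, "d")],
}

def lookup(key: int) -> Set[str]:
    """Ask every node for its local matches and union the results."""
    values: Set[str] = set()
    for pairs in nodes.values():
        values.update(v for k, v in pairs if k == key)
    return values

print(sorted(lookup(4)))  # ['a', 'd']
```

Asking every node costs n messages per lookup; the overlay networks discussed next are ways to avoid that.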

Distributed Lookup
  Communication overlay network
    Structured
      (+) Efficient distributed lookup
    Unstructured
      (+) Easy, robust
      (+) Can handle high churn rates
      (-) Flood queries
    Hybrid
      E.g., centralized search, decentralized exchange

Distributed Lookup
  Distributed hashing
    Most common way to create a structured overlay network
    H(k) is an m-bit number (k is a key)
    H(X) is an m-bit number (X is a node identifier)
    Hash function H is good (spreads keys and nodes evenly over the range)
    [Figure: identifier range 0 to 2^m - 1 with H(X), H(k), H(Y) marked; pair <k, v> is stored at node Y]
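
A minimal sketch of such a hash function, assuming SHA-1 reduced to m bits. The slides do not name a concrete hash, and the example strings are made up.

```python
import hashlib

M = 6  # identifier bits; m = 6 matches the later Chord examples

def H(name: str, m: int = M) -> int:
    """Hash a key or a node identifier to an m-bit number in [0, 2^m - 1]."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

# Keys and node identifiers share one identifier space.
print(H("node-10.0.0.7:4000"), H("song.mp3"))
```

With m = 6 there are only 64 identifiers, so collisions are likely; a real deployment would use a much larger m.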

Distributed Lookup
  Distributed hashing: two approaches
    Chord
    Replicated hash table (RHT)

Chord
  The Chord circle (m = 6)
    Nodes: N1, N8, N14, N21, N32, N38, N42, N48, N51, N56
    Keys: K10, K24, K30, K38, K54
    Using hashed values, e.g., N56 is the node whose id hashes to 56

Chord
  Ownership rule
    Consider nodes X, Y such that Y follows X clockwise
    Node Y owns keys k such that H(k) in (H(X), H(Y)]
    E.g., if N3 follows N54, then N3 stores K55, K56, ..., K3
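
The ownership rule can be checked with a short sketch that assumes global knowledge of all node ids. A real Chord node has no such view; the function name `owner` is only illustrative.

```python
import bisect
from typing import List

def owner(node_ids: List[int], key_id: int) -> int:
    """Return the hashed id of the node that owns key_id.

    The owner is the first node at or after key_id going clockwise,
    i.e. the node Y with key_id in (H(X), H(Y)] for its predecessor X.
    """
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, key_id)
    return ring[i % len(ring)]  # wrap past 2^m - 1 back to the smallest id

ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]  # node ids from the Chord circle
for k in (10, 24, 30, 38, 54, 60):
    print(f"K{k} -> N{owner(ring, k)}")
```

On the circle from the slides this places K10 at N14, K24 and K30 at N32, K38 at N38, K54 at N56, and a key such as K60 wraps around to N1.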

Chord
  Notation
    X.function() (remotely) calls function at X
    X.A returns value A at X
    If X is omitted, refers to the current node

Chord
  Successor/predecessor links
    [Figure: Chord circle (m = 6) with N1.pred pointing to N56 and N1.succ pointing to N8]

Chord
  Search for owner using successor links
    X.find_succ(k):
      if k in (pred, X]
        return X
      else if k in (X, succ]
        return succ
      else
        return succ.find_succ(k)

Chord
  Value lookup
    X.lookup(k):
      Y := X.find_succ(k)
      return Y.get_value(k)
    X.get_value(k):
      // Return local value v for k, if it exists
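
The successor-link search and value lookup can be simulated in one process. This is a sketch under simplifying assumptions: remote calls become ordinary method calls, the ring is built with correct links up front by a hypothetical `build_ring` helper, and data is keyed directly by hashed key id.

```python
def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps past 0; a == b means the whole circle

class Node:
    def __init__(self, ident):
        self.id = ident
        self.pred = None
        self.succ = None
        self.data = {}  # hashed key id -> value

    def find_succ(self, k):
        if in_half_open(k, self.pred.id, self.id):
            return self
        if in_half_open(k, self.id, self.succ.id):
            return self.succ
        return self.succ.find_succ(k)  # walk clockwise, one node per hop

    def get_value(self, k):
        return self.data.get(k)  # None if no local value exists

    def lookup(self, k):
        return self.find_succ(k).get_value(k)

def build_ring(ids):
    """Hypothetical helper: link the nodes into a correct ring."""
    nodes = [Node(i) for i in sorted(ids)]
    for pos, n in enumerate(nodes):
        n.succ = nodes[(pos + 1) % len(nodes)]
        n.pred = nodes[pos - 1]
    return {n.id: n for n in nodes}

ring = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
ring[56].data[52] = "v52"
print(ring[14].find_succ(52).id)  # 56
print(ring[14].lookup(52))        # v52
```

Starting at N14, the search passes through every node up to N51 before N56 is returned, which is the linear cost that finger tables are meant to reduce.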

Chord
  Example: searching for K52 (m = 6)
    N14.find_succ(K52) is forwarded along successor links: N14, N21, N32, N38, N42, N48, N51
    N51.find_succ(K52) = N56

Chord
  Finger table (m = 6)
    Finger table for N8
      Target     Finger
      N8 + 1     N14
      N8 + 2     N14
      N8 + 4     N14
      N8 + 8     N21
      N8 + 16    N32
      N8 + 32    N42
    Each finger is the node that owns the target key, e.g., N42 owns key 8 + 32 = 40

Chord
  Search using the finger table
    X.find_succ(k):
      if k in (pred, X]
        return X
      else if k in (X, succ]
        return succ
      else
        Y := closest_preceding(k)
        return Y.find_succ(k)
    X.closest_preceding(k):
      for i := m downto 1
        if finger[i] in (X, k]
          return finger[i]
      return nil
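
A single-process sketch of the finger-table search follows. Assumptions: `build_ring` is a hypothetical helper that computes correct links and fingers with global knowledge, the `path` argument is added only to record which nodes are contacted, and where the slide returns nil from closest_preceding this sketch falls back to succ so that the call always returns a node.

```python
import bisect

M = 6  # identifier bits, as on the slides

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)

class Node:
    def __init__(self, ident):
        self.id = ident
        self.pred = None
        self.succ = None
        self.finger = []  # finger[j] owns key id + 2^j (0-based here)

    def closest_preceding(self, k):
        for f in reversed(self.finger):  # farthest finger first
            if in_half_open(f.id, self.id, k):
                return f
        return self.succ  # assumption: the slide returns nil here

    def find_succ(self, k, path=None):
        if path is not None:
            path.append(self.id)  # record the nodes that were contacted
        if in_half_open(k, self.pred.id, self.id):
            return self
        if in_half_open(k, self.id, self.succ.id):
            return self.succ
        return self.closest_preceding(k).find_succ(k, path)

def build_ring(ids):
    """Build a ring with correct links and finger tables (no joins or failures)."""
    ordered = sorted(ids)
    nodes = {i: Node(i) for i in ordered}
    for pos, i in enumerate(ordered):
        n = nodes[i]
        n.succ = nodes[ordered[(pos + 1) % len(ordered)]]
        n.pred = nodes[ordered[pos - 1]]
        for j in range(M):
            target = (i + 2 ** j) % (2 ** M)
            own = ordered[bisect.bisect_left(ordered, target) % len(ordered)]
            n.finger.append(nodes[own])
    return nodes

ring = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
path = []
result = ring[8].find_succ(54, path)
print([f.id for f in ring[8].finger])  # [14, 14, 14, 21, 32, 42]
print(path, result.id)                 # [8, 42, 51] 56
```

The lookup for K54 from N8 contacts three nodes, whereas following successor links alone would pass through eight.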

Chord
  Example: looking up K54 (m = 6)
    N8.find_succ(K54) forwards to N42, the closest preceding finger of N8
    N42.find_succ(K54) forwards to N51
    N51.find_succ(K54) returns N56, the owner of K54

Chord
  Adding nodes (m = 6)
    Joining node: N26
    Need to
      1. Update links
      2. Move data
    For now, assume nodes never die

Chord
  Adding nodes
    X.join(Y):
      // Node Y is known to belong to the circle
      pred := nil
      succ := Y.find_succ(X)

Chord
  Periodic stabilization
    X.stabilize():
      Y := succ.pred
      if Y in (X, succ)
        succ := Y
      succ.notify(X)
    X.notify(Y):
      if pred = nil or Y in (pred, X)
        pred := Y
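
Join and stabilization can be traced with a small simulation. This sketch makes several assumptions: calls are local rather than remote, stabilize() is invoked by hand instead of periodically, a nil check on succ.pred is added, and find_succ uses only successor links so that it still works while pred is nil. The ids 21, 26 and 32 are illustrative.

```python
def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)

class Node:
    def __init__(self, ident):
        self.id = ident
        self.pred = None
        self.succ = self  # a lone node is its own successor

    def find_succ(self, k):
        node = self
        while not in_half_open(k, node.id, node.succ.id):
            node = node.succ
        return node.succ

    def join(self, known):
        # known is any node that already belongs to the circle
        self.pred = None
        self.succ = known.find_succ(self.id)

    def stabilize(self):
        y = self.succ.pred
        if y is not None and in_open(y.id, self.id, self.succ.id):
            self.succ = y  # a node has slipped in between us and succ
        self.succ.notify(self)

    def notify(self, y):
        if self.pred is None or in_open(y.id, self.pred.id, self.id):
            self.pred = y

# Two-node circle N21 -> N32 -> N21, then N26 joins via N21.
p, s = Node(21), Node(32)
p.succ, p.pred = s, s
s.succ, s.pred = p, p
x = Node(26)
x.join(p)      # x.succ = N32, x.pred = nil
x.stabilize()  # N32.pred becomes N26
p.stabilize()  # N21.succ becomes N26, N26.pred becomes N21
print(p.succ.id, x.pred.id, x.succ.id, s.pred.id)  # 26 21 32 26
```

Finger tables are not repaired here; the sketch covers only the pred and succ links.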

Chord
  Join example (N_x joins between N_p and N_s)
    Before updating links: N_p.succ = N_s, N_s.pred = N_p
    After N_x.join(): N_x.pred = nil, N_x.succ = N_s
    After N_x.stabilize(): N_s.pred = N_x
    After N_p.stabilize(): N_p.succ = N_x, N_x.pred = N_p
    Exercise: fix finger table

Chord
  Moving data
    When?
    N_s currently holds keys in (N_p, N_s]
    N_x should hold keys in (N_p, N_x]

Chord
  Moving data
    After N_s.notify(N_x): N_s.pred = N_x, N_x.pred is still nil
    Send all keys in (N_p, N_x] when N_s.pred gets updated

Chord
  Moving data
    Revised notify()
      X.notify(Y):
        if pred = nil or Y in (pred, X)
          Y.add(data in (pred, Y])
          temp := pred
          pred := Y
          X.remove(data in (temp, Y])
    Glossing over concurrency issues
      E.g., what happens to lookups while moving data?

Chord
  Moving data
    Revised notify()
      Exercise: when pred = nil, what data gets moved?

Chord
  Moving data
    Lookup at wrong node
      N_x holds keys in (N_p, N_x]; N_s holds keys in (N_x, N_s]
      N_x.pred is still nil and N_p.succ still points to N_s
      Lookups for all k in (N_p, N_x] are directed to N_s

Chord
  Moving data
    Revised lookup()
      X.lookup(k):
        ok := false
        while not ok
          Y := X.find_succ(k)
          [ok, v] := Y.get_value(k)
        return v
      X.get_value(k):
        if k in (pred, X]
          return [true, value for k]
        else
          return [false, nil]
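
The revised notify() and lookup() can be exercised together in a single-process sketch. Assumptions, none of which come from the slides: when pred = nil, notify hands over every key outside (Y, X]; lookup takes a hypothetical `on_retry` callback that stands in for waiting while periodic stabilization runs; and `max_tries` bounds the loop so the sketch cannot spin forever.

```python
def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)

def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)

class Node:
    def __init__(self, ident):
        self.id, self.pred, self.succ, self.data = ident, None, self, {}

    def find_succ(self, k):
        node = self
        while not in_half_open(k, node.id, node.succ.id):
            node = node.succ
        return node.succ

    def join(self, known):
        self.pred, self.succ = None, known.find_succ(self.id)

    def stabilize(self):
        y = self.succ.pred
        if y is not None and in_open(y.id, self.id, self.succ.id):
            self.succ = y
        self.succ.notify(self)

    def notify(self, y):
        if self.pred is None or in_open(y.id, self.pred.id, self.id):
            # Assumption: with pred = nil, hand over everything outside (Y, X].
            lower = self.pred.id if self.pred is not None else self.id
            for k in [k for k in self.data if in_half_open(k, lower, y.id)]:
                y.data[k] = self.data.pop(k)  # add at Y, remove at X
            self.pred = y

    def get_value(self, k):
        if self.pred is not None and in_half_open(k, self.pred.id, self.id):
            return True, self.data.get(k)
        return False, None  # not the owner (or not yet sure): caller retries

    def lookup(self, k, on_retry=lambda: None, max_tries=10):
        for _ in range(max_tries):
            ok, v = self.find_succ(k).get_value(k)
            if ok:
                return v
            on_retry()  # stand-in for periodic stabilization between tries
        raise RuntimeError("links did not converge")

p, s = Node(21), Node(32)
p.succ, p.pred, s.succ, s.pred = s, s, p, p
s.data = {24: "v24", 30: "v30"}
x = Node(26)
x.join(p)
x.stabilize()  # N32.notify(N26) moves K24 to N26; N21.succ still points at N32
value = p.lookup(24, on_retry=p.stabilize)
print(value, sorted(x.data), sorted(s.data))  # v24 [24] [30]
```

The first attempt reaches N32, which no longer owns K24 and answers [false, nil]; after N21 stabilizes, the second attempt reaches N26 and succeeds.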

Chord
  Moving data
    Revised lookup() works
      pred, succ links are eventually correct
      Data ends up at the correct node
      Finger pointers speed up searches, but do not cause problems

Chord
  Performance
    n = number of live nodes
    With high probability, the number of nodes that must be contacted to find a successor is O(log n)
    Although the finger table has room for m entries, only O(log n) need to be stored
    Experimental results show the average lookup path length is about (log n)/2 hops

Chord
  Node failures
    N_x, between N_p and N_s, stores <k7, v7>, <k8, v8>, <k9, v9>
    Assume N_x dies
      Links become invalid
      Data gets lost

Chord
  Node failures
    Fixing links
      X.check_pred():
        if pred failed
          pred := nil
      Also, keep backup links to s > 1 successors
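
A sketch of link repair after a failure, under stated assumptions: an `alive` flag stands in for failure detection (e.g., a timed-out ping), each node keeps a list `succs` whose first entry is succ and whose remaining entries are backup links, and refilling the backup list after a failure is left out.

```python
def in_open(x, a, b):
    """True if x lies in the circular interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)

class Node:
    def __init__(self, ident):
        self.id = ident
        self.alive = True  # stand-in for "responds to a ping"
        self.pred = None
        self.succs = []    # succs[0] is succ, the rest are backup links

    def check_pred(self):
        if self.pred is not None and not self.pred.alive:
            self.pred = None

    def stabilize(self):
        # Skip failed successors; the first live backup takes over as succ.
        self.succs = [n for n in self.succs if n.alive]
        succ = self.succs[0]  # fails if all s successors died at once
        y = succ.pred
        if y is not None and y.alive and in_open(y.id, self.id, succ.id):
            self.succs.insert(0, y)
            succ = y
        succ.notify(self)

    def notify(self, y):
        if self.pred is None or in_open(y.id, self.pred.id, self.id):
            self.pred = y

# Three-node circle N21 -> N26 -> N32 with s = 2 successor links each.
p, x, s = Node(21), Node(26), Node(32)
p.pred, p.succs = s, [x, s]
x.pred, x.succs = p, [s, p]
s.pred, s.succs = x, [p, x]

x.alive = False  # N26 dies
s.check_pred()   # N32.pred becomes nil
p.stabilize()    # N21 falls back to its backup link N32 and notifies it
print(p.succs[0].id, s.pred.id)  # 32 21
```

This repairs the links only; the data that N26 held is still lost unless it was replicated elsewhere.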

Chord
  Node failure example (backup succ links, s = 2)
    State before failure: N_p.succ = N_x, with a backup succ link from N_p to N_s
    After failure and N_s.check_pred(): N_s.pred = nil
    After N_p discovers that N_x is down: N_p.succ = N_s
    After stabilization: N_s.pred = N_p

Chord
  Node failure
    Protection from data loss: e.g., robust nodes (see replication notes)
    Robust node X is made up of nodes X1, X2, X3
      Each of X1, X2, X3 stores <k7, v7>, <k8, v8>, <k9, v9>
      Takeover protocols
      Backup protocols

Chord
  Summary
    Finding owner nodes
      Successor and predecessor links
      Finger table
    Adding nodes
      Updating links
      Moving data
    Coping with node failures
      Fixing links
      Protecting from data loss
    Performance

Replicated Hash Table
  Every node (N0, N1, N2, N3) stores the same table
    Hash   Node
    0      N0
    1      N1
    2      N2
    3      N3
  Node Ni stores the data for keys that hash to i

Replicated Hash Table
  Adding nodes
    E.g., N0 overloaded
    Table at N0 and N2
      Hash   Node
      0      N0
      1      N0
      2      N2
      3      N2
    N0 stores data for keys that hash to 0, 1
    N2 stores data for keys that hash to 2, 3

Replicated Hash Table
  Adding nodes
    First, set up N1
      N1 gets a copy of the table (0: N0, 1: N0, 2: N2, 3: N2) but holds no data yet
    Next, copy data to N1
      Data for keys that hash to 1 is copied from N0 to N1; tables are unchanged
    Next, change control
      N0 and N1 change entry 1 from N0 to N1; N2 still has 1: N0
    Next, remove data from N0
      N0 keeps data for keys that hash to 0; N1 holds data for keys that hash to 1
    Finally, update other nodes
      N2 changes entry 1 from N0 to N1
      Eagerly by N0 or N1?
      Lazily during future lookups?

Replicated Hash Table
  Adding nodes
    What about inserts while adding a node (e.g., during copy)?
      Apply at N0 then copy?
      Apply at both?
      Redirect to N1?
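
The steps for adding a node can be sketched as follows. Assumptions: `key % 4` stands in for the hash function, the other nodes are updated lazily (one of the two options the slides raise), and inserts that arrive during the copy are not handled. The names `RHTNode`, `add_node` and `lookup` are illustrative.

```python
NUM_BUCKETS = 4

def bucket(key: int) -> int:
    return key % NUM_BUCKETS  # stand-in for a real hash function

class RHTNode:
    def __init__(self, name, table):
        self.name = name
        self.table = dict(table)  # this node's replica: bucket -> owner name
        self.data = {}            # bucket -> {key: value}

def lookup(cluster, start, key):
    """One hop when the table entry is current, one extra hop when it is stale."""
    b = bucket(key)
    asker = cluster[start]
    node = cluster[asker.table[b]]
    if node.table[b] != node.name:      # the asker's entry was stale
        asker.table[b] = node.table[b]  # lazy update during the lookup
        node = cluster[asker.table[b]]
    return node.data.get(b, {}).get(key)

def add_node(cluster, new, old, b):
    src = cluster[old]
    dst = cluster[new] = RHTNode(new, src.table)  # 1. set up the new node
    dst.data[b] = dict(src.data.get(b, {}))       # 2. copy data
    src.table[b] = dst.table[b] = new             # 3. change control
    src.data.pop(b, None)                         # 4. remove data from old owner
    # 5. other nodes are updated lazily inside lookup()

table = {0: "N0", 1: "N0", 2: "N2", 3: "N2"}
cluster = {"N0": RHTNode("N0", table), "N2": RHTNode("N2", table)}
cluster["N0"].data = {0: {4: "a"}, 1: {5: "b"}}
cluster["N2"].data = {2: {6: "c"}, 3: {7: "d"}}

add_node(cluster, "N1", "N0", 1)
stale = cluster["N2"].table[1]    # still "N0"
value = lookup(cluster, "N2", 5)  # N0 redirects to N1; N2 repairs its entry
print(stale, value, cluster["N2"].table[1])  # N0 b N1
```

An eager variant would instead have N0 or N1 push the new entry to every node right after the change of control.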

Chord vs. Replicated Hash Table
  Which is simpler to implement?
  Cost of operations
    Looking up values: O(log n) vs. O(1)
    Adding nodes
    Recovering from node failures
  Storage cost
    Routing table size: log n vs. n

Distributed Lookup
  Communication overlay network
    Structured
      Chord
      Replicated hash table
    Unstructured
      Neighborhood search

Neighborhood Search
  Each node stores its own data
  Searches nearby nodes
  E.g., Gnutella
  [Figure: three neighboring nodes, each with its own key-value table; entries shown: (41, g) (99, c) (14, d) (12, a) (7, b) (13, c) (25, a) (47, f) (12, d) (51, x) (9, y)]

Neighborhood Search
  Example: N1.lookup(13), TTL = 4
    [Figure: network of nodes with local key-value tables; the query spreads from N1 to nodes reached with TTL = 3, 2, 1]
    Answer so far = {c, f, x}
    Answer so far = {c, d, f, x}
    Result may be incorrect/incomplete

Neighborhood Search
  Optimization
    Queries have unique identifiers
    Nodes keep a cache of recent queries (query identifier and TTL)

Neighborhood Search
  Example: N1.lookup(13), TTL = 4, id = 77
    [Figure: each node caches [query id, TTL] as the query spreads: [77, 4] at N1, then [77, 3], [77, 2], [77, 1]]
    A node that already has 77 in its cache does not reprocess the query
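
A sketch of TTL-limited flooding with a per-node query cache, simulated in one process. The five-node topology and its data are made up for illustration and are not the network drawn on the slides. A node reached with TTL = 1 answers but does not forward, and queries are forwarded to all neighbors, including the one they came from.

```python
from collections import deque

def flood_lookup(graph, data, start, key, ttl, query_id):
    """Return (answers, number of messages sent) for a flooded query."""
    cache = {}  # node -> {query_id: TTL at first arrival}
    answers = set()
    messages = 0
    queue = deque([(start, ttl)])
    while queue:
        node, t = queue.popleft()
        seen = cache.setdefault(node, {})
        if query_id in seen:
            continue  # do not reprocess
        seen[query_id] = t
        answers.update(v for k, v in data.get(node, []) if k == key)
        if t > 1:  # TTL = 1 nodes answer but do not forward
            for neighbor in graph[node]:
                queue.append((neighbor, t - 1))
                messages += 1
    return answers, messages

# Hypothetical topology and local data.
graph = {
    "N1": ["N2", "N3"],
    "N2": ["N1", "N4"],
    "N3": ["N1", "N4"],
    "N4": ["N2", "N3", "N5"],
    "N5": ["N4"],
}
data = {
    "N1": [(13, "t")],
    "N2": [(13, "f")],
    "N3": [(12, "c")],
    "N4": [(13, "c")],
    "N5": [(13, "x")],
}

print(flood_lookup(graph, data, "N1", 13, ttl=3, query_id=77))
print(flood_lookup(graph, data, "N1", 13, ttl=4, query_id=78))
```

With TTL = 3 the value x at N5 is out of reach, so the answer is incomplete; TTL = 4 finds it at the cost of more messages.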

Neighborhood Search
  Bootstrapping
    Bootstrap server keeps a set of known nodes S = {N1, N2, ...}
    A joining node is added to S and gets its neighbors from S
    [Figure: nodes N1, N2, N5, N7]

Neighborhood Search
  Problems
    Unnecessary messages
    High load and traffic
      E.g., if nodes have p neighbors, each search sends ~p^TTL messages
    Low-capacity nodes are a bottleneck
    May not find all answers
      [Figure: the area searched within TTL hops covers only part of the network]

Neighborhood Search
  Advantages
    Can handle complex queries
    Simple, robust algorithm
    Works well if data is highly replicated
      [Figure: the area searched within TTL hops contains nodes with data]

Open Problems
  Performance, availability, efficiency, load balancing, authenticity, DoS prevention, incentives, anonymity, correctness, participation

Peer-to-Peer Systems
  Summary
    Structured networks
      Chord
      Replicated hash table
    Unstructured networks
      Neighborhood search
    Open problems