Distributed Hash Tables (DHT)
Jukka K. Nurminen
*Adapted from slides provided by Stefan Götz and Klaus Wehrle (University of Tübingen)
The Architectures of 1st and 2nd Gen. P2P

Client-Server (Example: WWW)
1. The server is the central entity and only provider of service and content; the network is managed by the server.
2. The server is the higher-performance system.
3. The clients are the lower-performance systems.

Peer-to-Peer
1. Resources are shared between the peers.
2. Resources can be accessed directly from other peers.
3. A peer is both provider and requestor ("servent" concept).

Unstructured P2P

- Centralized P2P (1st Gen.; Example: Napster)
  1. All features of peer-to-peer included.
  2. A central entity is necessary to provide the service.
  3. The central entity is some kind of index/group database.

- Pure P2P (1st Gen.; Examples: Gnutella 0.4, Freenet)
  1. All features of peer-to-peer included.
  2. Any terminal entity can be removed without loss of functionality.
  3. No central entities.

- Hybrid P2P (2nd Gen.; Examples: Gnutella 0.6, JXTA)
  1. All features of peer-to-peer included.
  2. Any terminal entity can be removed without loss of functionality.
  3. Dynamic central entities.

Structured P2P

- DHT-based (2nd Gen.; Examples: Chord, CAN)
  1. All features of peer-to-peer included.
  2. Any terminal entity can be removed without loss of functionality.
  3. No central entities.
  4. Connections in the overlay are fixed.
Hash Function & Hash Tables

A hash function maps keys (e.g., names) to hashes (table indices) under which the associated data (e.g., phone numbers) is stored:

  put("Lisa Smith", 45-5734)
  get("Lisa Smith") -> 45-5734

(Figure: keys John Smith, Lisa Smith, Sandra Dee, and Sam Doe hashed to slots of a table holding their phone numbers.)
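The put/get interface above can be sketched as a tiny hash table with separate chaining. This is a minimal illustration, not the slide's figure: the hash function and table size are arbitrary choices made here for the example.

```python
def h(key, size=16):
    # Toy string hash: sum of character codes modulo the table size.
    return sum(ord(c) for c in key) % size

table = [[] for _ in range(16)]  # one bucket (list) per slot

def put(key, value):
    bucket = table[h(key)]
    for i, (k, _) in enumerate(bucket):
        if k == key:
            bucket[i] = (key, value)  # overwrite an existing entry
            return
    bucket.append((key, value))

def get(key):
    for k, v in table[h(key)]:
        if k == key:
            return v
    return None

put("Lisa Smith", "45-5734")
print(get("Lisa Smith"))  # -> 45-5734
```

Collisions (two keys hashing to the same slot) land in the same bucket and are resolved by the linear scan inside put/get.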
Addressing in Distributed Hash Tables

Step 1: Mapping of content/nodes into a linear address space
- Usually 0, ..., 2^m - 1; the address space is much larger than the number of objects to be stored.
- Data and nodes are mapped into the address space with a hash function, e.g., Hash(String) mod 2^m.
- Parts of the address space are associated with DHT nodes, e.g., H(Node Y) = 3485; data item D with H(D) = 3107 falls into the range managed by node Y.
- Often, the address space is viewed as a circle.

(Figure: ring with nodes X and Y and the per-node ranges of the address space.)
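The mapping step can be sketched as follows. SHA-1 truncated to m bits and the clockwise-successor rule are assumptions in the spirit of the slide, not a specific DHT implementation; the node names are made up.

```python
import hashlib

M = 12  # address space 0 .. 2^M - 1 (illustrative; real systems use e.g. M = 160)

def H(s):
    # Hash a string into the ring: Hash(String) mod 2^m.
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** M)

def responsible_node(key_id, node_ids):
    # Clockwise association: the first node ID >= key_id is responsible,
    # wrapping around to the smallest node ID past the top of the ring.
    nodes = sorted(node_ids)
    for n in nodes:
        if n >= key_id:
            return n
    return nodes[0]

node_ids = {H("node-%d" % i) for i in range(4)}
print(responsible_node(H("my data"), node_ids))
```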
Step 2: Routing to a Data Item

- Interface: put(key, value) and value = get(key).
- A lookup starts at an arbitrary node of the DHT.
- The query is routed to the node responsible for the requested key, e.g., Key = H("my data") = 3107.
- Node 3485 manages keys 2907-3485, so it stores the pair (3107, (ip, port)).
- The value is a pointer to the location of the data.
Step 2: Routing to a Data Item (getting the content)

- The K/V-pair is delivered to the requester: node 3485 sends (3107, (ip, port)) to the requester.
- The requester analyzes the K/V-tuple.
- In case of indirect storage: after learning the actual location, the data is requested directly from it: Get_Data(ip, port).
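Indirect storage can be sketched like this: the DHT maps a key to an (ip, port) pointer, and the requester fetches the data from that location in a second step. The dictionary stands in for the routed K/V store, and the address values are made up for illustration.

```python
dht = {}  # stands in for the distributed, routed K/V store

def put(key_id, location):
    # Publisher announces where the actual data lives.
    dht[key_id] = location

def get(key_id):
    # Requester resolves key -> (ip, port); None if the key is unknown.
    return dht.get(key_id)

put(3107, ("10.0.0.5", 4711))
location = get(3107)
# ...the requester would now contact `location` directly (the Get_Data step)
print(location)  # -> ('10.0.0.5', 4711)
```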
Chord
Chord: Topology

- Keys and node IDs live on a ring, i.e., all arithmetic is modulo 2^m.
- Each (key, value) pair is managed by the clockwise next node on the ring: its successor.
- Example (m = 3; nodes 0, 1, 3; keys 1, 2, 6): successor(1) = 1, successor(2) = 3, successor(6) = 0.
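The example successors can be checked in a few lines; the node set {0, 1, 3} and m = 3 follow the classic Chord illustration.

```python
M = 3
nodes = sorted([0, 1, 3])  # node IDs on the example ring

def successor(key):
    # First node whose ID >= key (mod 2^M), wrapping around the ring.
    key %= 2 ** M
    for n in nodes:
        if n >= key:
            return n
    return nodes[0]

print(successor(1), successor(2), successor(6))  # -> 1 3 0
```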
Chord: Primitive Routing

- Forward the query for key x from node to node until successor(x) is found; return the result to the source of the query.
- Pros: simple; little node state.
- Cons: poor lookup efficiency, O(1/2 * N) hops on average (with N nodes); a single node failure breaks the circle.
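Primitive routing on the same example ring can be sketched as repeated successor hops. The hard-coded successor map and the interval test are a minimal local model, not Chord's actual RPC interface.

```python
succ = {0: 1, 1: 3, 3: 0}  # successor pointers on the example ring (nodes 0, 1, 3)

def in_interval(x, a, b):
    # True if x lies in the half-open ring interval (a, b], with wraparound.
    if a < b:
        return a < x <= b
    return x > a or x <= b

def primitive_lookup(start, key):
    # Follow successor pointers one hop at a time until the current node's
    # successor is responsible for the key: O(1/2 * N) hops on average.
    n, hops = start, 0
    while not in_interval(key, n, succ[n]):
        n = succ[n]
        hops += 1
    return succ[n], hops

print(primitive_lookup(0, 6))  # -> (0, 2)
```

Starting at node 0, key 6 takes two hops (via nodes 1 and 3) before node 3 sees that its successor 0 is responsible.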
Chord: Routing

Chord's routing table: the finger table
- Stores O(log N) links per node.
- Covers exponentially increasing distances: at node n, entry i points to successor(n + 2^i) (the i-th finger).
- Example (m = 3; nodes 0, 1, 3): node 0 has fingers successor(1) = 1, successor(2) = 3, successor(4) = 0; node 1 has successor(2) = 3, successor(3) = 3, successor(5) = 0; node 3 has successor(4) = 0, successor(5) = 0, successor(7) = 0.
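The finger tables of the example nodes can be computed directly from the rule "entry i points to successor(n + 2^i)":

```python
M = 3
RING = 2 ** M
nodes = sorted([0, 1, 3])

def successor(key):
    # First node whose ID >= key, wrapping around the ring.
    for n in nodes:
        if n >= key % RING:
            return n
    return nodes[0]

def finger_table(n):
    # Entry i points to successor(n + 2^i), for i = 0 .. M-1.
    return [successor((n + 2 ** i) % RING) for i in range(M)]

print(finger_table(0))  # -> [1, 3, 0]
print(finger_table(1))  # -> [3, 3, 0]
print(finger_table(3))  # -> [0, 0, 0]
```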
Chord: Routing

Chord's routing algorithm: each node n forwards a query for key k clockwise
- to the farthest finger preceding k,
- until n = predecessor(k) and successor(n) = successor(k);
- successor(n) is then returned to the source of the query.

(Figure: worked example of lookup(44) on a larger ring, resolved to node 45 via successive fingers.)
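The forwarding rule can be sketched on the small example ring: scan the fingers from farthest to nearest, take the last one that still precedes the key, and stop once the current node's successor is responsible. This is a simplified local simulation of the distributed algorithm, not Chord's actual remote-procedure interface.

```python
M = 3
RING = 2 ** M
nodes = sorted([0, 1, 3])

def successor(key):
    for n in nodes:
        if n >= key % RING:
            return n
    return nodes[0]

def between(x, a, b):
    # True if x lies strictly inside the ring interval (a, b), with wraparound.
    if a < b:
        return a < x < b
    return x > a or x < b

# Finger tables and successor pointers, built from the rule on the slide.
fingers = {n: [successor((n + 2 ** i) % RING) for i in range(M)] for n in nodes}
succ = {n: fingers[n][0] for n in nodes}

def lookup(n, key):
    # Forward to the farthest finger preceding key until the current node's
    # successor is responsible for it; then return that successor.
    while not (between(key, n, succ[n]) or key == succ[n]):
        nxt = n
        for f in reversed(fingers[n]):
            if between(f, n, key):
                nxt = f
                break
        n = succ[n] if nxt == n else nxt  # plain successor hop as fallback
    return succ[n]

print(lookup(3, 6), lookup(1, 6), lookup(0, 2))  # -> 0 0 3
```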
Comparison of Lookup Concepts

System                    Per-Node State   Communication Overhead   Fuzzy Queries (e.g., "j*")   No False Negatives   Robustness
Central Server            O(N)             O(1)                     yes                          yes                  no
Flooding Search           O(1)             O(N^2)                   yes                          no                   yes
Distributed Hash Tables   O(log N)         O(log N)                 no                           yes                  yes