CSE 486/586 Distributed Systems: Distributed Hash Tables


Last Time

Evolution of peer-to-peer:
- Central directory (Napster)
- Query flooding (Gnutella)
- Hierarchical overlay (Kazaa, modern Gnutella)
- BitTorrent: focuses on parallel download, prevents free-riding

Steve Ko, Computer Sciences and Engineering, University at Buffalo

Today's Question

How do we organize the nodes in a distributed system?
- Up to the 90s, the prevalent architecture was client-server (or master-slave): unequal responsibilities
- Now the emerged architecture is peer-to-peer: equal responsibilities
- Today: studying peer-to-peer as a paradigm
- Functionality: lookup-response, e.g., Gnutella

What We Don't Want

Cost (scalability) and no guarantee for lookup. Compare Napster and Gnutella on memory (at the server), lookup latency, and number of messages per lookup:
- Napster: cost is not balanced; too much falls on the server side
- Gnutella: cost is still not balanced, just too much overall, and there is no guarantee for lookup

What We Want

What data structure provides lookup-response? A hash table: a data structure that associates keys with values, i.e., name-value pairs (or key-value pairs).
- E.g., http://www.cnn.com/foo.html and the Web page
- E.g., BritneyHitMe.mp3 and 12.78.183.2
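The lookup-response functionality boils down to a plain hash table. A minimal Python sketch, reusing the slide's example key-value pairs (the file name, URL, and address are illustrative values from the slide, not real data):

```python
# A (non-distributed) hash table already provides lookup-response:
# it associates keys (names) with values.
index = {
    "BritneyHitMe.mp3": "12.78.183.2",   # file name -> address of a host storing it
    "http://www.cnn.com/foo.html": "<the Web page>",  # URL -> page contents
}

def lookup(key):
    """Return the value associated with key, or None if there is none."""
    return index.get(key)

print(lookup("BritneyHitMe.mp3"))  # -> 12.78.183.2
```

The rest of the lecture is about providing this same interface when the table is partitioned across many nodes.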

Hashing Basics

Hash function:
- A function that maps a large, possibly variable-sized datum into a small datum, often a single integer that serves to index an associative array
- In short: maps an n-bit datum into one of k buckets (k << 2^n)
- Provides a time- and space-saving data structure for lookup

Main goals:
- Low cost
- Deterministic
- Uniformity (load balanced)

E.g., mod-k bucketing: for an n-bit datum d and k buckets (k << 2^n), b = d mod k. This distributes load uniformly only when the data itself is distributed uniformly.

DHT: Goal

Let's build a distributed system with a hash table abstraction: lookup(key) returns the value.

Where to Keep the Hash Table

- Server-side -> Napster
- Client-local -> Gnutella

What are the requirements (think Napster and Gnutella)?
- Deterministic lookup
- Low lookup time (shouldn't grow linearly with the system size)
- Should balance load even with node join/leave

What we'll do: partition the hash table and distribute the partitions among the nodes in the system.
- We need to choose the right hash function
- We also need to partition the table and distribute the partitions with minimal relocation of partitions in the presence of join/leave

Consider the problem of data partitioning: given document X, choose one of k servers to use. This is a two-level mapping:
- Hashing: map one (or more) data item(s) to a hash value (the distribution should be balanced)
- Partitioning: map a hash value to a server (each server's load should be balanced even with node join/leave)

Let's look at a simple approach and think about its pros and cons: hashing with mod and partitioning with buckets.

Using Basic Hashing and Bucket Partitioning?

- Hashing: suppose we use modulo hashing; number the servers 1..k
- Partitioning: place X on server i = X mod k
- Problem? The data may not be uniformly distributed

- Alternative: place X on server i = hash(X) mod k
- Problem? What happens if a server fails or joins (k -> k±1)? Answer: (almost) all entries get remapped to new nodes!

(Figure: keys mapped to Server 0 through Server 15 under mod vs. hash partitioning.)
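The remapping problem can be quantified with a short simulation; the server count k = 16 and the synthetic key set below are arbitrary choices for illustration (Python's built-in hash() stands in for whatever hash function the system uses):

```python
# How many keys move when a bucket-partitioned table grows from k to k+1 servers?
def assign(key, k):
    # Place key on server hash(key) mod k.
    return hash(key) % k

keys = [f"file-{i}.mp3" for i in range(10_000)]
k = 16
moved = sum(1 for key in keys if assign(key, k) != assign(key, k + 1))
print(f"{moved / len(keys):.0%} of keys remapped")  # almost all of them
```

This is exactly the failure mode consistent hashing is designed to avoid.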

Administrivia

- A2-B due on Friday, 3/11
- (In-class) Midterm on Wednesday (3/9)

Chord DHT

- A distributed hash table system using consistent hashing
- Organizes nodes in a ring
- Maintains neighbors for correctness and shortcuts for performance

DHTs in general:
- DHT systems are structured peer-to-peer, as opposed to unstructured peer-to-peer such as Napster, Gnutella, etc.
- Used as a base for other systems, e.g., many trackerless BitTorrent clients, Amazon Dynamo, distributed repositories, distributed file systems, etc.
- Chord is an example of principled design.

Chord Ring: Global Hash Table

- Represent the hash key space as a virtual ring: a ring representation instead of a table representation
- Use a hash function that evenly distributes items over the hash space, e.g., SHA-1
- Map the nodes (buckets) onto the same ring

(Figure: the id space represented as a ring: 0, 1, 2, ..., 2^128 - 1.)

Chord: Consistent Hashing

- hash(IP_address) -> node_id
- Partitioning: map each data item to its successor node
- Used in DHTs, memcached, etc.
- Advantages: even distribution, and few changes as nodes come and go

Chord: When Nodes Come and Go

- Small changes when nodes come and go
- Only the mapping of keys assigned to the node that comes or goes is affected

Chord: Node Organization

- Maintain a circularly linked list around the ring: every node has a predecessor (pred) and a successor (succ)
- Separate join and leave protocols
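A minimal consistent-hashing ring in the style the slides describe (SHA-1-derived ids, keys mapped to their successor node) might look as follows; the node addresses, key names, and the reduced ring size m = 16 are made up for the example:

```python
import hashlib
from bisect import bisect_right

def chord_id(name, m=16):
    # Hash a name (node address or key) onto a 2^m-position ring.
    # The slides use SHA-1; m = 16 keeps the ids readable.
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

class Ring:
    def __init__(self, nodes):
        self.ids = sorted(chord_id(n) for n in nodes)

    def successor(self, key):
        # A key is stored on its successor: the first node id clockwise from it.
        i = bisect_right(self.ids, chord_id(key))
        return self.ids[i % len(self.ids)]  # wrap around past the top of the ring

ring = Ring([f"10.0.0.{i}" for i in range(8)])
before = {k: ring.successor(k) for k in (f"key{j}" for j in range(1000))}
ring = Ring([f"10.0.0.{i}" for i in range(9)])  # one node joins
moved = sum(1 for k, n in before.items() if ring.successor(k) != n)
print(f"{moved} of 1000 keys moved")  # only keys now owned by the new node
```

Contrast this with mod-k partitioning: here a join relocates only the keys in the arc taken over by the new node, not (almost) the whole table.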

Chord: Basic Lookup

lookup(id):
    if (id > pred.id && id <= my.id)
        return my.id;
    else
        return succ.lookup(id);

- Route hop by hop via successors
- O(n) hops to find the destination id

Chord: Efficient Lookup --- Fingers

The ith finger-table entry at the peer with id n is the first peer with id >= n + 2^i (mod 2^m).

Finger table at N80 (example ring with N20, N80, N96, N114; m = 7):

    i    ft[i]
    0    96
    1    96
    2    96
    3    96
    4    96
    5    114
    6    20

(Figure: finding a <key, value> pair using fingers on a ring with N20, N86, N102: N20's finger at 20 + 2^6 reaches N86, whose finger at 86 + 2^4 reaches N102.)

lookup(id):
    if (id > pred.id && id <= my.id)
        return my.id;
    else
        // try fingers() in order of decreasing distance
        for finger in fingers():
            if (id >= finger.id)
                return finger.lookup(id);
        return succ.lookup(id);

- Route greedily via distant finger nodes
- O(log n) hops to find the destination id

Chord: Node Joins and Leaves

When a node joins:
- The node does a lookup on its own id and learns the node responsible for that id
- That node becomes the new node's successor
- The new node can also learn that node's predecessor (which will become the new node's predecessor)

Monitoring:
- If a neighbor doesn't respond for some time, find a new one

Leaving:
- Clean (planned) leave: notify the neighbors
- Unclean leave (failure): needs an extra mechanism to handle lost (key, value) pairs, e.g., as Dynamo does

Summary

DHT:
- Gives a hash table as an abstraction
- Partitions the hash table and distributes the partitions over the nodes
- Structured peer-to-peer

Chord DHT:
- Based on consistent hashing
- Balances hash table partitions over the nodes
- Basic lookup based on successors
- Efficient lookup through fingers
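The greedy finger routing above can be simulated end to end. This sketch runs everything in one process with hop counting (the real protocol runs as messages between machines); the node ids come from the lecture's figures, omitting N86 so that N80's finger table matches the one shown, and the arc test is written to handle ring wraparound, which the slide pseudocode glosses over:

```python
M = 7                                   # id space is 0 .. 2^M - 1 (128 ids)
NODES = sorted([20, 80, 96, 102, 114])  # example node ids from the lecture figures

def successor(i):
    # First node clockwise from id i (the node responsible for i).
    for n in NODES:
        if n >= i:
            return n
    return NODES[0]                     # wrap around past the top of the ring

def fingers(n):
    # ft[i] = first node with id >= n + 2^i (mod 2^M), largest stride first.
    return [successor((n + 2 ** i) % 2 ** M) for i in reversed(range(M))]

def in_arc(x, lo, hi):
    # Is x in the half-open arc (lo, hi] on the ring (handles wraparound)?
    return (lo < x <= hi) if lo < hi else (x > lo or x <= hi)

def lookup(node, key, hops=0):
    succ = successor((node + 1) % 2 ** M)
    if in_arc(key, node, succ):
        return succ, hops + 1           # key belongs to our successor
    for f in fingers(node):             # greedy: farthest finger not past key
        if f != node and in_arc(f, node, key):
            return lookup(f, key, hops + 1)
    return lookup(succ, key, hops + 1)

owner, hops = lookup(20, 100)
print(owner, hops)  # -> 102 2 (N102 is responsible for id 100, reached in 2 hops)
```

Starting at N20, the lookup for id 100 jumps via N20's farthest useful finger (N96) and then to N96's successor N102, matching the O(log n) routing the slides describe.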

Acknowledgements

These slides contain material developed and copyrighted by Indranil Gupta (UIUC), Michael Freedman (Princeton), and Jennifer Rexford (Princeton).