Peer-to-Peer Systems and Distributed Hash Tables

COS 418: Distributed Systems, Lecture 7
Kyle Jamieson
[Credit: Selected content adapted from B. Karp, R. Morris]

Today
1. Peer-to-Peer Systems: Napster, Gnutella, BitTorrent, challenges
2. Distributed Hash Tables
3. The Chord Lookup Service

What is a Peer-to-Peer (P2P) system?
A distributed system architecture in which nodes communicate with one another over the Internet:
- No centralized control
- Nodes are roughly symmetric in function

Why might P2P be a win?
High capacity for services through parallelism:
- Many disks
- Many network connections
- Many CPUs
Absence of a centralized server or servers may mean:
- Less chance of service overload as load increases
- Easier deployment
- A single failure won't wreck the whole system
- The system as a whole is harder to attack
But: a large number of unreliable nodes.

P2P adoption
Successful adoption in some niche areas:
1. Client-to-client (legal and illegal) file sharing: the data are popular, but the owning organization has no money
2. Digital currency: no natural single owner (Bitcoin)
3. Voice/video telephony: traffic is user-to-user anyway; the issues are privacy and control

Example: Classic BitTorrent
1. The user clicks on a download link and gets a torrent file with the content hash and the IP address of a tracker
2. The user's BitTorrent (BT) client talks to the tracker, which returns a list of peers that have the file
3. The user's BT client downloads the file from one or more peers
4. The user's BT client tells the tracker it now has a copy, too
5. The user's BT client serves the file to others for a while
This provides huge download bandwidth without expensive servers or network links.

The lookup problem
N peers (N1..N6) are connected over the Internet. A publisher N4 has done put("Star Wars.mov", [content]), and a client wants to do get("Star Wars.mov"). How does the client find which node stores the data?

Centralized lookup (Napster)
The publisher N4 registers its content with a central database: SetLoc("Star Wars.mov", IP address of N4). The client then asks that database: Lookup("Star Wars.mov"). Simple, but the database holds O(N) state and is a single point of failure.
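To make the trade-off concrete, here is a minimal, hypothetical sketch (in Go, the course's language) of a Napster-style central directory; the type and method names are invented for illustration and are not Napster's actual protocol:

```go
// Minimal sketch of a Napster-style central directory (hypothetical types).
// The server keeps O(N) state: one entry per published file and publisher,
// and the whole system stops working if this one server fails.
package main

import "fmt"

type Directory struct {
	locations map[string][]string // file name -> IP addresses of publishers
}

func NewDirectory() *Directory {
	return &Directory{locations: make(map[string][]string)}
}

// SetLoc is called by a publisher to register that it serves a file.
func (d *Directory) SetLoc(name, addr string) {
	d.locations[name] = append(d.locations[name], addr)
}

// Lookup is called by a client; it returns every known publisher of the file.
func (d *Directory) Lookup(name string) []string {
	return d.locations[name]
}

func main() {
	dir := NewDirectory()
	dir.SetLoc("Star Wars.mov", "10.0.0.4")  // publisher N4 registers
	fmt.Println(dir.Lookup("Star Wars.mov")) // client asks the central DB
}
```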

Flooded queries (original Gnutella)
The client floods Lookup("Star Wars.mov") to its neighbors, which forward the query until it reaches the publisher N4 (key="Star Wars.mov", value=[content]). Robust, but each lookup costs O(N) messages, where N is the number of peers.

Routed DHT queries (Chord)
The client instead routes Lookup(H(audio data)) through a small number of peers toward the publisher N4 (key=H(audio data), value=[content]). Can we make lookup robust, with reasonable state and a reasonable number of hops?

Today
1. Peer-to-Peer Systems
2. Distributed Hash Tables
3. The Chord Lookup Service

What is a DHT (and why)?
A local hash table provides:
  key = Hash(name)
  put(key, value)
  get(key) → value
Service: constant-time insertion and lookup.
How can I do (roughly) this across millions of hosts on the Internet? With a Distributed Hash Table (DHT).
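As a point of reference, the local version really is just a hash table. The sketch below (illustrative names, not a real implementation) shows the local table next to the interface we would like a DHT to provide across many hosts:

```go
// A local hash table gives constant-time put/get; a DHT aims to offer the
// same interface across millions of hosts. This is only an interface sketch.
package main

import (
	"crypto/sha1"
	"fmt"
)

// Local version: key = Hash(name), stored in an in-process map.
type LocalTable struct{ m map[[20]byte][]byte }

func (t *LocalTable) Put(key [20]byte, value []byte) { t.m[key] = value }
func (t *LocalTable) Get(key [20]byte) []byte        { return t.m[key] }

// Desired distributed version: the same operations, but the data lives on
// whichever remote node is responsible for the key.
type DHT interface {
	Put(key [20]byte, value []byte) error
	Get(key [20]byte) ([]byte, error)
}

func main() {
	t := &LocalTable{m: make(map[[20]byte][]byte)}
	key := sha1.Sum([]byte("some name")) // key = Hash(name)
	t.Put(key, []byte("some value"))
	fmt.Printf("%s\n", t.Get(key))
}
```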

What is a DHT (and why)?
A Distributed Hash Table:
  key = hash(data)
  lookup(key) → IP address                      (Chord lookup service)
  send-rpc(IP address, put, key, data)
  send-rpc(IP address, get, key) → data
It partitions data in truly large-scale distributed systems: tuples in a global database engine, data blocks in a global file system, files in a P2P file-sharing system.

Cooperative storage with a DHT
A distributed application calls put(key, data) and get(key) on the distributed hash table (DHash), which in turn calls lookup(key) on the lookup service (Chord) to find the IP address of the node responsible for the key. The application may be distributed over many nodes, and the DHT distributes data storage over many nodes.

BitTorrent over DHT
BitTorrent can use a DHT instead of (or along with) a tracker. BT clients use the DHT with:
- Key = file content hash ("infohash")
- Value = IP address of a peer willing to serve the file
The DHT can store multiple values (i.e., IP addresses) for a key. A client does get(infohash) to find other clients willing to serve, and put(infohash, my-ipaddr) to identify itself as willing.

Why might a DHT be a win for BitTorrent?
- The DHT comprises a single giant tracker, less fragmented than many trackers, so peers are more likely to find each other
- Maybe a classic BitTorrent tracker is too exposed to legal and other attacks
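A hedged sketch of how a BitTorrent client might use such a put/get DHT in place of a tracker; the PeerDHT interface and helper names are assumptions for illustration, not the real Mainline DHT protocol:

```go
// Sketch of a BitTorrent client using a put/get DHT as its tracker.
package main

import "fmt"

// PeerDHT maps an infohash to the addresses of peers serving that torrent,
// allowing multiple values per key.
type PeerDHT interface {
	Put(infohash [20]byte, peerAddr string)
	Get(infohash [20]byte) []string
}

// announceAndFindPeers registers this client as a willing peer and returns
// other peers it can download from.
func announceAndFindPeers(dht PeerDHT, infohash [20]byte, myAddr string) []string {
	dht.Put(infohash, myAddr) // identify ourselves as willing to serve
	return dht.Get(infohash)  // find other clients willing to serve
}

// fakeDHT is a local stand-in so the sketch runs without a network.
type fakeDHT struct{ m map[[20]byte][]string }

func (d *fakeDHT) Put(h [20]byte, a string) { d.m[h] = append(d.m[h], a) }
func (d *fakeDHT) Get(h [20]byte) []string  { return d.m[h] }

func main() {
	dht := &fakeDHT{m: make(map[[20]byte][]string)}
	var infohash [20]byte // would be the SHA-1 hash of the torrent contents
	dht.Put(infohash, "192.0.2.7:6881")
	fmt.Println(announceAndFindPeers(dht, infohash, "192.0.2.9:6881"))
}
```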

Why the put/get DHT interface?
- The API supports a wide range of applications: the DHT imposes no structure or meaning on keys
- Key-value pairs are persistent and global
- You can store keys inside other DHT values, and thus build complex data structures (a small sketch follows the next list)

Why might DHT design be hard?
- Decentralized: no central authority
- Scalable: low network traffic overhead
- Efficient: find items quickly (low latency)
- Dynamic: nodes fail, new nodes join
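Here is the sketch promised above: because values are just bytes, a value can contain other DHT keys, which is enough to build linked structures such as a small directory tree. The encoding and helper names are illustrative assumptions, not any particular system's format.

```go
// Storing keys inside values: a directory block holds the key of a child
// block, so following keys from a root reaches arbitrarily complex structures.
package main

import (
	"crypto/sha1"
	"fmt"
	"strings"
)

type store map[[20]byte][]byte // local stand-in for a real DHT

// put stores data under its content hash and returns the key.
func put(s store, data []byte) [20]byte {
	k := sha1.Sum(data)
	s[k] = data
	return k
}

func main() {
	s := store{}

	// A leaf block is stored under the hash of its contents.
	fileKey := put(s, []byte("file contents"))

	// A directory block stores the keys of its children.
	dirKey := put(s, []byte(fmt.Sprintf("notes.txt=%x", fileKey)))

	entry := string(s[dirKey])
	fmt.Println("directory entry:", entry)
	childKey := strings.TrimPrefix(entry, "notes.txt=")
	fmt.Println("child key length in hex chars:", len(childKey)) // 40
}
```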

Chord lookup: identifiers
- Key identifier = SHA-1(key)
- Node identifier = SHA-1(IP address)
This is consistent hashing [Karger '97]: SHA-1 distributes both keys and nodes uniformly over the same circular ID space (a 7-bit space in the figures, with keys such as K5, K20, K80 and nodes such as N32, N90, N105).

How does Chord partition data?
That is, how does it map key IDs to node IDs? A key is stored at its successor: the node with the next-higher (or equal) ID. In the example ring, K5 and K20 are stored at N32, and K80 at N90.

Chord: successor pointers and basic lookup
Each node keeps a pointer to its successor on the ring (N10 → N32 → N60 → N90 → N105 → N120 → N10). To answer "Where is K80?", a node forwards the query along successor pointers until it reaches K80's successor, N90.
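A minimal sketch of consistent hashing as Chord uses it, assuming a global view of the ring (a real node only knows a few neighbors) and truncating SHA-1 IDs to 8 bits purely for readability; Chord itself uses full 160-bit IDs:

```go
// Consistent hashing sketch: keys and nodes hash onto the same circular ID
// space, and a key is stored at its successor (the first node at or after it).
package main

import (
	"crypto/sha1"
	"fmt"
	"sort"
)

func id(s string) uint8 { return sha1.Sum([]byte(s))[0] } // toy 8-bit ID

// successor returns the node responsible for a key ID, wrapping around the ring.
func successor(nodeIDs []uint8, key uint8) uint8 {
	sort.Slice(nodeIDs, func(i, j int) bool { return nodeIDs[i] < nodeIDs[j] })
	for _, n := range nodeIDs {
		if n >= key {
			return n
		}
	}
	return nodeIDs[0] // wrapped past the highest node ID
}

func main() {
	nodes := []uint8{id("10.0.0.1:80"), id("10.0.0.2:80"), id("10.0.0.3:80")}
	k := id("Star Wars.mov")
	fmt.Printf("key %d is stored at node %d\n", k, successor(nodes, k))
}
```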

Simple lookup algorithm

  Lookup(key-id)
    succ ← my successor
    if my-id < succ < key-id      // next hop
      call Lookup(key-id) on succ
    else                          // done
      return succ

(Comparisons are interpreted on the circular ID space.)

Improving performance
- Problem: forwarding through the successor is slow; the data structure is a linked list, so lookup is O(N)
- Idea: can we make it more like binary search? We need to be able to halve the distance to the key at each step
- Correctness depends only on successors

Finger table for O(log N)-time lookups
Finger i of node n points to the successor of n + 2^i. In the figure, N80's fingers point to the nodes 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, ... of the way around the ring, so a lookup such as K112 can cover about half of the remaining distance at each step.
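A hedged sketch of how a node might fill its finger table, again assuming a global view of the ring for simplicity (a real node fills its fingers by issuing lookups); the node IDs are the ones from the lecture figures:

```go
// Finger-table construction sketch: finger i of node n is successor(n + 2^i)
// in an m-bit ID space.
package main

import "fmt"

const m = 7 // 7-bit ID space, as in the lecture's ring figures

func successor(nodes []int, id int) int {
	best, wrap := -1, nodes[0]
	for _, n := range nodes {
		if n < wrap {
			wrap = n // smallest ID, used when we wrap around the ring
		}
		if n >= id && (best == -1 || n < best) {
			best = n
		}
	}
	if best == -1 {
		return wrap
	}
	return best
}

func fingerTable(nodes []int, n int) []int {
	table := make([]int, m)
	for i := 0; i < m; i++ {
		table[i] = successor(nodes, (n+(1<<i))%(1<<m))
	}
	return table
}

func main() {
	nodes := []int{10, 32, 60, 80, 90, 105, 120}
	// Fingers of N80: successors of 81, 82, 84, 88, 96, 112, 16.
	fmt.Println(fingerTable(nodes, 80)) // [90 90 90 90 105 120 32]
}
```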

Implication of finger tables
There is effectively a binary lookup tree rooted at every node, threaded through other nodes' finger tables. This is better than simply arranging the nodes in a single tree:
- Every node acts as a root, so there is no root hotspot and no single point of failure
- But there is a lot more state in total

Lookup with finger table

  Lookup(key-id)
    look in local finger table for highest n: my-id < n < key-id
    if n exists
      call Lookup(key-id) on node n   // next hop
    else
      return my successor             // done

Lookups take O(log N) hops
In the figure's ring (N5, N10, N20, N32, N60, N80, N99, N110), a Lookup(K19) reaches K19's successor N20 after a few finger hops.

An aside: is O(log N) fast or slow?
- For a million nodes, it's 20 hops
- If each hop takes 50 milliseconds, lookups take one second
- If each hop has a 10% chance of failure, it's a couple of timeouts
So in practice O(log N) is better than O(N), but not great.
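A hedged, locally simulated sketch of finger-based lookup: each hop jumps to the highest finger that precedes the key, so the remaining distance roughly halves and lookups take O(log N) hops. The ring contents and setup are illustrative, not Chord's actual RPC protocol:

```go
// Finger-based lookup, simulated locally on a 7-bit ring.
package main

import "fmt"

const m = 7 // 7-bit ID space (0..127)

var ids = []int{10, 32, 60, 80, 90, 105, 120} // node IDs on the ring, sorted

func successor(id int) int {
	for _, n := range ids {
		if n >= id {
			return n
		}
	}
	return ids[0] // wrap around the ring
}

// between reports whether x lies strictly between a and b going clockwise.
func between(x, a, b int) bool {
	if a < b {
		return a < x && x < b
	}
	return x > a || x < b
}

func lookup(start, key int) (owner, hops int) {
	cur := start
	for {
		succ := successor((cur + 1) % (1 << m))
		if key == succ || between(key, cur, succ) {
			return succ, hops // done: the key is owned by cur's successor
		}
		next := succ
		for i := 0; i < m; i++ { // highest finger preceding the key
			f := successor((cur + (1 << i)) % (1 << m))
			if between(f, cur, key) {
				next = f
			}
		}
		cur, hops = next, hops+1
	}
}

func main() {
	owner, hops := lookup(80, 19) // N80 looks up K19 on this example ring
	fmt.Printf("K19 is stored at N%d after %d hops\n", owner, hops)
}
```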

Joining: linked list insert
A new node N36 joins a ring containing N25 and N40, where N40 currently stores K30 and K38:
1. N36 does a Lookup(36) to find its successor, N40
2. N36 sets its own successor pointer to N40
3. Keys in (25, 36], here K30, are copied from N40 to N36

Notify messages maintain predecessors
- N36 sends a notify message to its successor N40, so N40 sets its predecessor pointer to N36
- Later, once N25 learns about N36 and adopts it as its successor, N25 notifies N36, which sets its predecessor pointer to N25
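A hedged sketch of these join steps in Go; the types and helpers are illustrative. Note that N25 still points at N40 after the join, which is precisely what the stabilize protocol on the next slide repairs:

```go
// Chord join in its simplest "linked-list insert" form.
package main

import "fmt"

type Node struct {
	ID          int
	Successor   *Node
	Predecessor *Node
	Keys        map[int]string
}

// between reports whether x lies in the half-open ring interval (a, b].
func between(x, a, b int) bool {
	if a < b {
		return a < x && x <= b
	}
	return x > a || x <= b
}

// FindSuccessor walks successor pointers until it passes id (the basic O(N) lookup).
func (n *Node) FindSuccessor(id int) *Node {
	cur := n
	for !between(id, cur.ID, cur.Successor.ID) {
		cur = cur.Successor
	}
	return cur.Successor
}

// Notify tells n that candidate may be its predecessor.
func (n *Node) Notify(candidate *Node) {
	if n.Predecessor == nil || between(candidate.ID, n.Predecessor.ID, n.ID) {
		n.Predecessor = candidate
	}
}

// Join inserts n into the ring via any existing node.
func (n *Node) Join(existing *Node) {
	succ := existing.FindSuccessor(n.ID) // 1. Lookup(n.ID) finds the successor
	n.Successor = succ                   // 2. n sets its own successor pointer
	for k, v := range succ.Keys {        // 3. copy keys that now belong to n
		if !between(k, n.ID, succ.ID) { // everything not in (n, succ] moves
			n.Keys[k] = v
			delete(succ.Keys, k)
		}
	}
	succ.Notify(n) // notify maintains succ's predecessor pointer
}

func main() {
	n25 := &Node{ID: 25, Keys: map[int]string{}}
	n40 := &Node{ID: 40, Keys: map[int]string{30: "K30", 38: "K38"}}
	n25.Successor, n40.Successor = n40, n25
	n25.Predecessor, n40.Predecessor = n40, n25

	n36 := &Node{ID: 36, Keys: map[int]string{}}
	n36.Join(n25)
	fmt.Println("N36 keys:", n36.Keys) // K30 moved to N36
	fmt.Println("N40 keys:", n40.Keys) // K38 stays; N40's predecessor is now N36
}
```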

Stabilize messages fix successors
N25 periodically sends a stabilize message to its successor N40, which replies "my predecessor is N36." N25 then updates its successor pointer to N36 (and notifies N36).

Joining: summary
- The predecessor pointer allows linking in the new node
- Finger pointers are updated in the background
- Correct successors generally produce correct lookups

Failures may cause incorrect lookups
If the nodes between N80 and N120 on the ring (N85, N102, N113) have failed, N80 does not know its correct successor, so a Lookup(K90) issued at N80 returns an incorrect result.

Successor lists
- Each node stores a list of its r immediate successors
- After a failure, it will know the first live successor
- Correct successors generally produce correct lookups
- The guarantee is probabilistic
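A small sketch of the successor-list idea, with liveness checks faked locally; the structure and names are assumptions rather than the paper's exact layout:

```go
// Successor-list sketch: each node remembers its r immediate successors so
// that, when its first successor dies, it can fall back to the next live one
// instead of breaking the ring.
package main

import "fmt"

const r = 3 // successor-list length; Chord suggests r = O(log N)

type Node struct {
	ID         int
	Successors [r]int // IDs of the r immediate successors, nearest first
}

// firstLiveSuccessor returns the nearest successor that is still alive.
func firstLiveSuccessor(n Node, alive func(int) bool) (int, bool) {
	for _, s := range n.Successors {
		if alive(s) {
			return s, true
		}
	}
	return 0, false // all r successors dead: this node's view of the ring is broken
}

func main() {
	n80 := Node{ID: 80, Successors: [r]int{85, 99, 110}}
	dead := map[int]bool{85: true} // N85 has failed
	s, ok := firstLiveSuccessor(n80, func(id int) bool { return !dead[id] })
	fmt.Println(s, ok) // 99 true: N80 routes around the failed N85
}
```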

Choosing successor list length r
Assume one half of the nodes fail, independently. Then P(successor list all dead) = (1/2)^r, i.e., the probability that this node breaks the Chord ring. A successor list of size r = O(log N) makes this probability 1/N: for example, with r = log2 N, (1/2)^(log2 N) = 1/N, which is low for large N.

Lookup with fault tolerance

  Lookup(key-id)
    look in local finger table and successor list
      for highest n: my-id < n < key-id
    if n exists
      call Lookup(key-id) on node n   // next hop
      if call failed,
        remove n from finger table and/or successor list
        return Lookup(key-id)
    else
      return my successor             // done

Today
1. Peer-to-Peer Systems
2. Distributed Hash Tables
3. The Chord Lookup Service: basic design; integration with the DHash DHT; performance

The DHash DHT
- Builds key/value storage on Chord
- Replicates blocks for availability: stores k replicas at the k successors after the block on the Chord ring
- Caches blocks for load balancing: a client sends a copy of a block to each of the servers it contacted along the lookup path
- Authenticates block contents
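A hedged sketch of DHash-style replica placement: the replicas of a block live at the first k nodes that succeed its key on the ring. The node IDs and the value of k are illustrative:

```go
// Replica placement sketch: a block is stored at the k successors of its key,
// so replicas are easy to find by continuing around the ring after a failure.
package main

import (
	"fmt"
	"sort"
)

const k = 3 // replication factor (illustrative)

// replicaNodes returns the IDs of the k successors of key, wrapping around.
func replicaNodes(nodeIDs []int, key int) []int {
	sort.Ints(nodeIDs)
	start := sort.SearchInts(nodeIDs, key) // first node with ID >= key
	replicas := make([]int, 0, k)
	for i := 0; i < k && i < len(nodeIDs); i++ {
		replicas = append(replicas, nodeIDs[(start+i)%len(nodeIDs)])
	}
	return replicas
}

func main() {
	nodes := []int{5, 10, 20, 40, 50, 60, 68, 80, 99, 110}
	fmt.Println(replicaNodes(nodes, 17)) // Block 17 is stored at N20, N40, N50
}
```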

DHash data authentication
Two types of DHash blocks:
- Content-hash blocks: key = SHA-1(data), so a fetched block can be verified against its key (a small check is sketched after the experimental overview below)
- Public-key blocks: the key is a cryptographic public key, and the data are signed by the corresponding private key

Chord File System example: the root block is named by a public key and carries a signature; it points to a directory block D by H(D); the directory block points to an inode block F by H(F); and the inode block points to data blocks B1, B2 by H(B1), H(B2).

DHash replicates blocks at r successors
Example: on a ring of N5, N10, N20, N40, N50, N60, N68, N80, N99, N110, Block 17 is stored at its successor N20 and replicated at the nodes that follow. Replicas are easy to find if the successor fails, and hashed node IDs ensure independent failure.

Experimental overview
- Quick lookup in large systems
- Low variation in lookup costs
- Robust despite massive failure
Goal: experimentally confirm the theoretical results.
[Figure: average messages per lookup vs. number of nodes, confirming that Chord lookup cost is O(log N) with a constant of 1/2.]
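As promised above, a small sketch of verifying a content-hash block: re-hash the fetched data and compare it with the key it was fetched under. Public-key blocks would instead verify a signature, which is not shown here.

```go
// Content-hash authentication sketch: any client can check that a fetched
// block really is the block named by its key, so a malicious server cannot
// substitute bad data.
package main

import (
	"bytes"
	"crypto/sha1"
	"fmt"
)

// verifyContentHashBlock checks that data is the block named by key.
func verifyContentHashBlock(key [20]byte, data []byte) bool {
	sum := sha1.Sum(data)
	return bytes.Equal(sum[:], key[:])
}

func main() {
	data := []byte("block contents")
	key := sha1.Sum(data) // key = SHA-1(data), as in DHash content-hash blocks

	fmt.Println(verifyContentHashBlock(key, data))               // true
	fmt.Println(verifyContentHashBlock(key, []byte("tampered"))) // false
}
```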

Failure experiment setup
- Start 1,000 Chord servers; each server's successor list has 20 entries
- Wait until they stabilize
- Insert 1,000 key/value pairs, with five replicas of each
- Stop X% of the servers, then immediately make 1,000 lookups

Massive failures have little impact
[Figure: failed lookups (percent) vs. failed nodes (percent, 5 to 50); even with half the nodes failed, only about (1/2)^6 ≈ 1.6% of lookups fail.]

Today
1. Peer-to-Peer Systems
2. Distributed Hash Tables
3. The Chord Lookup Service: basic design; integration with the DHash DHT; performance
Concluding thoughts on DHTs and P2P

DHTs: impact
The original DHTs (CAN, Chord, Kademlia, Pastry, Tapestry) were proposed in 2001-02. The following 5-6 years saw a proliferation of DHT-based applications:
- File systems (e.g., CFS, Ivy, OceanStore, Pond, PAST)
- Naming systems (e.g., SFR, Beehive)
- Database query processing and distributed databases (e.g., PIER)
- Content distribution systems (e.g., Coral)

Why don't all services use P2P?
1. High latency and limited bandwidth between peers (compared with a server cluster in a datacenter)
2. User computers are less reliable than managed servers
3. Lack of trust in peers' correct behavior: securing DHT routing is hard, and unsolved in practice
P2P's unique trait, though, is that there is no single server to shut down or monitor.

DHTs in retrospect
- They seemed promising for finding data in large P2P systems; decentralization seems good for load and fault tolerance
- But: the security problems are difficult
- But: churn is a problem, particularly if log(N) is big
So DHTs have not had the impact that many hoped for.

What DHTs got right
- Consistent hashing: an elegant way to divide a workload across machines; very useful in clusters, and actively used today in Amazon Dynamo and other systems
- Replication for high availability and efficient recovery after node failure
- Incremental scalability: add nodes, and capacity increases
- Self-management: minimal configuration

Friday precept: Chandy-Lamport examples; Go and channels; Assignment 2 introduction and API
Monday topic: Scaling Out Key-Value Storage (Amazon Dynamo)