CS 268: Lecture 22 DHT Applications

CS 268: Lecture 22 DHT Applications
Ion Stoica
Computer Science Division
Department of Electrical Engineering and Computer Sciences
University of California, Berkeley
Berkeley, CA 94720-1776
(Presentation based on slides from Robert Morris and Sean Rhea)

Outline
Cooperative File System (CFS)
OpenDHT

Target CFS Uses
Serving data with inexpensive hosts:
- open-source distributions
- off-site backups
- tech report archive
- efficient sharing of music

How to Mirror Open-Source Distributions?
Multiple independent distributions
- Each has high peak load, low average
Individual servers are wasteful
Solution: aggregate
- Option 1: single powerful server
- Option 2: distributed service
But how do you find the data?

Design Challenges
Avoid hot spots
Spread storage burden evenly
Tolerate unreliable participants
Fetch speed comparable to whole-file TCP
Avoid O(#participants) algorithms
- Centralized mechanisms [Napster], broadcasts [Gnutella]
CFS solves these challenges

CFS Architecture
Each node is a client and a server
Clients can support different interfaces
- File system interface
- Music keyword search

Client-Server Interface
Files have unique names
Files are read-only (single writer, many readers)
Publishers split files into blocks
Clients check files for authenticity

Server Structure
[Figure: each CFS server runs a DHash storage layer over a Chord lookup layer; DHash stores, balances, replicates, and caches blocks, and uses Chord to locate them]

Chord Hashes a Block ID to Its Successor
[Figure: Chord identifier ring with nodes N10, N32, N60, N80, N100; nodes and blocks have IDs in the same space, and each block is stored at its successor, the node with the next-highest ID]

DHash/Chord Interface
[Figure: layering on a CFS server: the application asks DHash for blocks, and DHash calls Chord's lookup(blockID) to find the nodes responsible for a block]
lookup() returns a list of node IDs closer in ID space to the block ID
- Sorted, closest first
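A minimal sketch (assumed names and structure, not the actual CFS code) of the DHash-over-Chord split: Chord's lookup() maps a block ID to the nodes closest to it on the ring, and DHash stores and fetches blocks at those nodes.

```python
import hashlib

def sha1_id(data: bytes) -> int:
    """Map arbitrary bytes (block contents, IP addresses) into the Chord ID space."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

class ChordNode:
    """Toy stand-in for the Chord lookup layer (names are hypothetical)."""
    def __init__(self, all_node_ids):
        self.ring = sorted(all_node_ids)

    def lookup(self, block_id: int, count: int = 3):
        """Return node IDs responsible for block_id: the successor first,
        then the next nodes on the ring (closest first, as on the slide)."""
        succ_idx = next((i for i, n in enumerate(self.ring) if n >= block_id), 0)
        return [self.ring[(succ_idx + k) % len(self.ring)] for k in range(count)]

class DHash:
    """Toy stand-in for the DHash block-storage layer built on Chord."""
    def __init__(self, chord: ChordNode):
        self.chord = chord
        self.store = {}  # node_id -> {block_id: block}

    def put_block(self, block: bytes, replicas: int = 3):
        block_id = sha1_id(block)                             # content-hashed block ID
        for node in self.chord.lookup(block_id, replicas):    # successor plus next nodes
            self.store.setdefault(node, {})[block_id] = block
        return block_id

    def get_block(self, block_id: int):
        for node in self.chord.lookup(block_id):              # try closest nodes first
            block = self.store.get(node, {}).get(block_id)
            if block is not None:
                return block
        return None
```

In this sketch put_block returns a content-derived block ID that clients later pass to get_block; re-hashing the fetched bytes is one way the "clients check files for authenticity" bullet can be realized.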

DHash Uses Other Nodes to Locate Blocks
[Figure: a request for a block issued at one node is routed across the ring (N5, N10, N20, N40, N50, N60, N68, N80, N99, N110) to the node responsible for the block]

Storing Blocks
Long-term blocks are stored for a fixed time
- Publishers need to refresh periodically
Cache uses LRU
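A small sketch (assumed structure, not CFS's actual implementation) of a per-node block store matching the slide: long-term blocks carry an expiry that the publisher must refresh, and cached copies live in a bounded LRU cache.

```python
import time
from collections import OrderedDict

class BlockStore:
    """Hypothetical per-node store: long-term blocks expire unless refreshed;
    cached blocks are evicted least-recently-used when the cache is full."""
    def __init__(self, cache_capacity=1024, lifetime=3600.0):
        self.lifetime = lifetime          # fixed storage period for long-term blocks
        self.long_term = {}               # block_id -> (block, expiry_time)
        self.cache = OrderedDict()        # block_id -> block, ordered by recency
        self.cache_capacity = cache_capacity

    def publish(self, block_id, block):
        """Store (or refresh) a long-term block for another `lifetime` seconds."""
        self.long_term[block_id] = (block, time.time() + self.lifetime)

    def cache_block(self, block_id, block):
        """Cache a copy fetched on behalf of a lookup; evict LRU if over capacity."""
        self.cache[block_id] = block
        self.cache.move_to_end(block_id)
        if len(self.cache) > self.cache_capacity:
            self.cache.popitem(last=False)        # drop the least recently used block

    def get(self, block_id):
        entry = self.long_term.get(block_id)
        if entry and entry[1] > time.time():
            return entry[0]
        if block_id in self.cache:
            self.cache.move_to_end(block_id)      # mark as recently used
            return self.cache[block_id]
        return None
```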

Replicate Blocks at r Successors
[Figure: a block stored at its successor on the ring (N5, N10, N20, N40, N50, N60, N68, N80, N99, N110) is also replicated at the next r nodes; node IDs come from hashed IP addresses, so adjacent nodes are unrelated hosts and replicas fail independently]

Lookups Find Replicas
[Figure: a lookup that reaches any of the block's first live successors can return the block from one of the r replicas]

First Live Successor Manages Replicas
[Figure: when the block's immediate successor fails, the next live successor takes over maintaining the block's replicas; a node can locally determine that it is the first live successor]

DHash Copies to Caches Along Lookup Path
[Figure: as a lookup for a block proceeds around the ring, the block is copied into the caches of the nodes along the lookup path]

Caching at Fingers Limits Load
[Figure: only the nodes whose fingers point at N32 route lookups toward it, so caching at those nodes bounds the lookup load for any single block]

Virtual Nodes Allow Heterogeneity
[Figure: one physical host running several virtual nodes, e.g. N5, N10, N60, N101]
Hosts may differ in disk/net capacity
Hosts may advertise multiple IDs
- Chosen as SHA-1(IP Address, index)
- Each ID represents a virtual node
Host load proportional to the number of virtual nodes
Manually controlled
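A quick illustration of the virtual-node idea; the exact encoding of SHA-1(IP address, index) used here is an assumption, not the scheme CFS specifies.

```python
import hashlib

def virtual_node_ids(ip_address: str, num_virtual_nodes: int):
    """Derive one Chord ID per virtual node from SHA-1 over (IP address, index),
    so a host with more capacity can claim proportionally more of the ring."""
    ids = []
    for index in range(num_virtual_nodes):
        digest = hashlib.sha1(f"{ip_address},{index}".encode()).digest()
        ids.append(int.from_bytes(digest, "big"))
    return ids

# A better-provisioned host simply advertises more virtual nodes:
small_host = virtual_node_ids("10.0.0.1", 1)
big_host = virtual_node_ids("10.0.0.2", 8)   # roughly 8x the storage/load share
```

Because SHA-1 output is effectively uniform, each virtual node owns a roughly equal arc of the ring, so a host's load is proportional to the number of virtual IDs it advertises, as the slide states.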

Why Blocks Instead of Files?
Cost: one lookup per block
- Can tailor cost by choosing a good block size
Benefit: load balance is simple
- For large files
- Storage cost of large files is spread out
- Popular files are served in parallel

Outline
Cooperative File System (CFS)
OpenDHT

Questions:
How many DHTs will there be?
Can all applications share one DHT?

Benefits of Sharing a DHT
Amortizes costs across applications
- Maintenance bandwidth, connection state, etc.
Facilitates bootstrapping of new applications
- Working infrastructure already in place
Allows for statistical multiplexing of resources
- Takes advantage of spare storage and bandwidth
Facilitates upgrading existing applications
- Share DHT between application versions

The DHT as a Service
[Figure: OpenDHT, a single shared DHT deployment]

The DHT as a Service
[Figure: application clients storing and retrieving data through the shared OpenDHT service]

The DHT as a Service
What is this interface?
[Figure: clients and OpenDHT; what interface should clients use to talk to the service?]

It's Not lookup()
[Figure: a client hands lookup(k) to an OpenDHT node; what does this node do with it?]
Challenges:
1. Distribution
2. Security

How are DHTs Used?
1. Storage
- CFS, UsenetDHT, PKI, etc.
2. Rendezvous
- Simple: Chat, Instant Messenger
- Load balanced: i3
- Multicast: RSS Aggregation, White Board
- Anycast: Tapestry, Coral

What About put/get?
Works easily for storage applications
Easy to share
- No upcalls, so no code distribution or security complications
But does it work for rendezvous?
- Chat? Sure: put(my-name, my-ip) (see the sketch below)
- What about the others?
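A toy illustration (hypothetical client API, not OpenDHT's actual interface) of how a bare put/get interface supports chat-style rendezvous; the in-memory table stands in for the remote service, and the TTL argument anticipates the next slides.

```python
import time

class DHTClient:
    """Hypothetical put/get client for a shared DHT service such as OpenDHT."""
    def __init__(self):
        self.table = {}  # stand-in for the remote DHT: key -> list of (value, expiry)

    def put(self, key: str, value: str, ttl: int) -> None:
        """Store a value under key for ttl seconds (TTLs are discussed below)."""
        self.table.setdefault(key, []).append((value, time.time() + ttl))

    def get(self, key: str):
        """Return all unexpired values stored under key."""
        now = time.time()
        return [v for v, exp in self.table.get(key, []) if exp > now]

# Chat-style rendezvous: each user advertises a contact address under their name.
dht = DHTClient()
dht.put("alice", "128.32.0.7:5222", ttl=3600)   # Alice publishes her IP:port
peers = dht.get("alice")                        # Bob looks Alice up by name
```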

Protecting Against Overuse
Must protect system resources against overuse
- Resources include network, CPU, and disk
- Network and CPU are straightforward
- Disk is harder: usage persists long after requests
Hard to distinguish malice from eager usage
- Don't want to hurt eager users if utilization is low
Number of active users changes over time
- Quotas are inappropriate

Fair Storage Allocation
Our solution: give each client a fair share
- Will define fairness in a few slides
Limits strength of malicious clients
- Only as powerful as they are numerous
Protect storage on each DHT node separately
- Must protect each subrange of the key space
- Rewards clients that balance their key choices

The Problem of Starvation
Fair shares change over time
- Decrease as system load increases
Starvation!
[Figure: timeline: Client 1 arrives and fills 50% of the disk; Client 2 arrives and fills 40%; when Client 3 arrives, its max share is only 10%]

Preventing Starvation
Simple fix: add a time-to-live (TTL) to puts
- put(key, value) becomes put(key, value, ttl)
Prevents long-term starvation
- Eventually all puts will expire

Preventing Starvation
TTLs prevent long-term starvation, but can still get short-term starvation
[Figure: timeline: Client A arrives and fills the entire disk; Client B arrives and asks for space; B starves until Client A's values start expiring]

Preventing Starvation
Stronger condition: be able to accept r_min bytes/sec of new data at all times
This is non-trivial to arrange!
[Figure: space vs. time plot: accepted puts occupy space until their TTLs expire; on top of that, space filling at slope r_min is reserved for future puts, and this sum plus a candidate put must stay below the maximum capacity]

Preventing Starvation
Stronger condition: be able to accept r_min bytes/sec of new data at all times
This is non-trivial to arrange!
[Figure: two space vs. time plots: in one the candidate put fits, in the other accepting it would exceed the capacity at some future time: a violation!]

Preventing Starvation
Formalize the graphical intuition:
  f(τ) = B(t_now) − D(t_now, t_now + τ) + r_min·τ
where B(t_now) is the number of bytes currently stored and D(t_now, t_now + τ) is the aggregate size of puts expiring in the interval (t_now, t_now + τ)
To accept a put of size x and TTL ℓ:
  f(τ) + x < C for all 0 ≤ τ < ℓ
(a simple scan-based version of this check is sketched below)
Can track the value of f efficiently with a tree
- Leaves represent inflection points of f
- Add put and shift time are O(log n), where n = number of puts
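A simple, unoptimized sketch of the admission test just formalized; it scans the pending expirations instead of using the O(log n) tree mentioned on the slide, and all names are assumptions.

```python
def can_accept_put(x, ttl, stored_bytes, expirations, r_min, capacity):
    """Admission test: accept a put of size x and TTL `ttl` only if
    f(tau) + x < C for all 0 <= tau < ttl, where
    f(tau) = B(t_now) - D(t_now, t_now + tau) + r_min * tau.

    stored_bytes: B(t_now), bytes currently stored on this node
    expirations:  list of (seconds_until_expiry, size) for already-accepted puts
    """
    # f rises with slope r_min between expirations and drops when a put expires,
    # so its supremum on [0, ttl) is approached just before each expiration time
    # and just before ttl itself; it suffices to test those points.
    taus = sorted(t for t, _ in expirations if t < ttl) + [ttl]
    for tau in taus:
        expired = sum(size for t, size in expirations if t < tau)  # D over (now, now+tau)
        f_sup = stored_bytes - expired + r_min * tau
        if f_sup + x >= capacity:
            return False   # would violate the r_min guarantee at some future time
    return True
```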

Fair Storage Allocation
[Figure: arriving puts enter per-client put queues (queue full: reject the put; not full: enqueue it); the node selects the most under-represented client, waits until its put can be accepted without violating r_min, then stores the value and sends an accept message to the client]
The Big Decision: the definition of "most under-represented"

Defining Most Under-Represented
Not just sharing disk, but disk over time
- A 1-byte put for 100 s is the same as a 100-byte put for 1 s
- So units are byte-seconds; call them commitments
Equalize total commitments granted?
- No: leads to starvation
- A fills the disk, B starts putting, A starves for up to the max TTL
[Figure: timeline: Client A arrives and fills the entire disk; Client B arrives and asks for space; B catches up with A; now A starves!]

Defining Most Under-Represented
Instead, equalize the rate of commitments granted
- Service granted to one client depends only on others putting at the same time
[Figure: timeline: Client A arrives and fills the entire disk; Client B arrives and asks for space; B catches up with A, after which A and B share the available rate]

Defining Most Under-Represented
Mechanism inspired by Start-time Fair Queuing
- Have a virtual time, v(t)
- Each put p_c,i (the i-th put by client c) gets a start time S(p_c,i) and a finish time F(p_c,i):
  F(p_c,i) = S(p_c,i) + size(p_c,i) × ttl(p_c,i)
  S(p_c,i) = max(v(A(p_c,i)) − ε, F(p_c,i−1)), where A(p_c,i) is the put's arrival time
- v(t) = maximum start time of all accepted puts
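A compact sketch (illustrative only, not the OpenDHT implementation) of how these tags drive the per-client queues from the Fair Storage Allocation slide: the next put accepted is the one whose client has the smallest start tag, i.e. the most under-represented client. The epsilon value, class name, and data layout are assumptions.

```python
import collections

class FairSpaceTime:
    """Toy version of the slide's virtual-time mechanism (names assumed)."""
    def __init__(self, epsilon=1000.0):
        self.epsilon = epsilon
        self.v = 0.0                 # virtual time: max start tag of accepted puts
        self.last_finish = {}        # client -> finish tag F of its previous accepted put
        self.queues = collections.defaultdict(collections.deque)  # per-client put queues

    def enqueue(self, client, size, ttl):
        """Queue a put, remembering the virtual time at its arrival, v(A(p))."""
        self.queues[client].append((size, ttl, self.v))

    def start_tag(self, client):
        """S = max(v(A(p)) - epsilon, F of the client's previous put)."""
        _size, _ttl, arrival_v = self.queues[client][0]
        return max(arrival_v - self.epsilon, self.last_finish.get(client, 0.0))

    def accept_next(self):
        """Accept the head put of the most under-represented client
        (smallest start tag) and advance virtual time."""
        candidates = [c for c, q in self.queues.items() if q]
        if not candidates:
            return None
        client = min(candidates, key=self.start_tag)
        s = self.start_tag(client)
        size, ttl, _ = self.queues[client].popleft()
        self.v = max(self.v, s)                    # v(t) = max start tag accepted so far
        self.last_finish[client] = s + size * ttl  # F = S + size * ttl (byte-seconds)
        return client, size, ttl
```

The commitments charged here are byte-seconds (size × TTL), matching the "disk over time" definition two slides back.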

FST Performance
[Figure: experimental results for the Fair Space-Time (FST) allocation algorithm]