Content Overlays (continued) Nick Feamster CS 7260 March 26, 2007


Administrivia Quiz date Remaining lectures Interim report PS 3 Out Friday, 1-2 problems 2

Structured vs. Unstructured Overlays Structured overlays have provable properties Guarantees on storage, lookup, performance Maintaining structure under churn has proven to be difficult Lots of state that needs to be maintained when conditions change Deployed overlays are typically unstructured 3

Structured [Content] Overlays 4

Chord: Overview What is Chord? A scalable, distributed lookup service Lookup service: A service that maps keys to values (e.g., DNS, directory services, etc.) Key technology: Consistent hashing Major benefits of Chord over other lookup services Simplicity Provable correctness Provable performance 5

Chord: Primary Motivation Scalable location of data in a large distributed system [Figure: a publisher stores (Key="LetItBe", Value=MP3 data) somewhere among nodes N1 through N5; a client issues Lookup("LetItBe")] Key Problem: Lookup 6

Chord: Design Goals Load balance: Chord acts as a distributed hash function, spreading keys evenly over the nodes. Decentralization: Chord is fully distributed: no node is more important than any other. Scalability: The cost of a Chord lookup grows as the log of the number of nodes, so even very large systems are feasible. Availability: Chord automatically adjusts its internal tables to reflect newly joined nodes as well as node failures, ensuring that the node responsible for a key can always be found. Flexible naming: Chord places no constraints on the structure of the keys it looks up. 7

Consistent Hashing Uniform Hash: assigns values to buckets e.g., H(key) = f(key) mod k, where k is the number of nodes Achieves load balance if keys are randomly distributed Problems with uniform hashing: How to perform consistent hashing in a distributed fashion? What happens when nodes join and leave? Consistent hashing addresses these problems 8
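
To make the rehashing problem concrete, here is a minimal Python sketch (the key names and node counts are made up for illustration) showing how H(key) = hash(key) mod k reshuffles almost every key when k changes by one, which is exactly the problem consistent hashing avoids.

```python
import hashlib

def bucket(key: str, k: int) -> int:
    """Uniform hash: map a key to one of k buckets via H(key) mod k."""
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return h % k

keys = [f"file-{i}" for i in range(10_000)]        # hypothetical key set

before = {key: bucket(key, 10) for key in keys}    # 10 nodes
after = {key: bucket(key, 11) for key in keys}     # one node joins

moved = sum(1 for key in keys if before[key] != after[key])
print(f"{moved / len(keys):.0%} of keys change buckets when k goes from 10 to 11")
# Roughly 90% of the keys move; consistent hashing (next slides) would move
# only about 1/11 of them.
```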

Consistent Hashing Main idea: map both keys and nodes (node IPs) to the same (metric) ID space Ring is one option. Any metric space will do Initially proposed for relieving Web cache hotspots [Karger97, STOC] 9

Consistent Hashing The consistent hash function assigns each node and key an m-bit identifier using SHA-1 as a base hash function Node identifier: SHA-1 hash of IP address Key identifier: SHA-1 hash of key 10

Chord Identifiers m-bit identifier space for both keys and nodes Key identifier: SHA-1(key) Key="LetItBe" SHA-1 ID=60 Node identifier: SHA-1(IP address) IP="198.10.10.1" SHA-1 ID=123 Both are uniformly distributed How to map key IDs to node IDs? 11

Consistent Hashing in Chord A key is stored at its successor: the node with the next-higher ID [Figure: circular 7-bit ID space with nodes N32, N90, and N123 (IP 198.10.10.1) and keys K5, K20, K60 ("LetItBe"), K101; K60 is stored at N90] 12
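
The successor rule can be stated in a few lines of Python. This is a minimal sketch, not Chord itself: the node and key IDs below are the ones shown in the figure, and chord_id only illustrates how SHA-1 would produce such identifiers in practice.

```python
import hashlib

M = 7                                    # bits in the ID space, as in the figure

def chord_id(name: str) -> int:
    """Identifier: SHA-1 hash of a node's IP address or a key, truncated to m bits."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** M)
    # e.g., chord_id("198.10.10.1") yields a node's m-bit identifier

def successor(key_id: int, node_ids: list[int]) -> int:
    """A key is stored at its successor: the first node ID >= the key ID,
    wrapping around the circular ID space."""
    for nid in sorted(node_ids):
        if nid >= key_id:
            return nid
    return min(node_ids)                 # wrap around past the highest node

nodes = [32, 90, 123]                    # node IDs from the figure
for key in [5, 20, 60, 101]:             # key IDs from the figure
    print(f"K{key} -> N{successor(key, nodes)}")
# K5 -> N32, K20 -> N32, K60 -> N90, K101 -> N123
```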

Consistent Hashing Properties Load balance: all nodes receive roughly the same number of keys Flexibility: when a node joins (or leaves) the network, only an O(1/N) fraction of the keys are moved to a different location. This solution is optimal (i.e., the minimum necessary to maintain a balanced load) 13

Consistent Hashing Every node knows of every other node: requires global information Routing tables are large: O(N) Lookups are fast: O(1) [Figure: ring with nodes N10, N32, N55, N90, N123; the query "Where is LetItBe?" hashes to K60, and N90 has K60] 14

Load Balance Results (Theory) For N nodes and K keys, with high probability: each node holds at most (1+ε)K/N keys; when node N+1 joins or leaves, O(K/N) keys change hands, and only to/from node N+1 15

Lookups in Chord Every node knows its successor in the ring A lookup may require O(N) hops [Figure: ring with nodes N10, N32, N55, N90, N123; "Where is LetItBe?" Hash("LetItBe") = K60; the query is forwarded successor by successor until it reaches N90, which has K60] 16

Reducing Lookups: Finger Tables Every node knows m other nodes in the ring Increase distance exponentially [Figure: node N80's fingers point at offsets 80+2^0, 80+2^1, ..., 80+2^5, 80+2^6 around the ring, reaching nodes such as N96, N112, and N16] 17

Reducing Lookups: Finger Tables Finger i points to the successor of n+2^i [Figure: same ring as above, now also showing N120; N80's finger offsets 80+2^0 through 80+2^6 map to their successor nodes] 18

Finger Table Lookups Each node knows its immediate successor. Find the predecessor of id and ask for its successor. Move forward around the ring looking for the node whose successor's ID is > id 19

Faster Lookups Lookups are O(log N) hops [Figure: ring with nodes N5, N10, N20, N32, N60, N80, N99, N110; a Lookup(K19) hops across fingers and terminates at N20, which holds K19] 20
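
The following sketch pulls the last few slides together: it builds finger tables with the slide's convention finger[i] = successor(n + 2^i) and runs a greedy lookup on the ring from the figure. The starting node and the hop count in the demo are illustrative, not taken from the slides.

```python
M = 7                                      # 7-bit identifier space, as in the figures
RING = 2 ** M

def successor(nodes, ident):
    """First node ID clockwise from ident (inclusive)."""
    for n in sorted(nodes):
        if n >= ident:
            return n
    return min(nodes)

def finger_table(nodes, n):
    """finger[i] = successor(n + 2^i), following the slide's convention."""
    return [successor(nodes, (n + 2 ** i) % RING) for i in range(M)]

def in_half_open(x, a, b):
    """x lies in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)

def in_open(x, a, b):
    """x lies strictly inside the circular interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)

def lookup(nodes, start, key_id):
    """Greedy finger-table lookup: from 'start', repeatedly jump to the farthest
    finger that still precedes key_id; stop once the key falls between the
    current node and its immediate successor (finger[0])."""
    fingers = {n: finger_table(nodes, n) for n in nodes}
    node, hops = start, 0
    while not in_half_open(key_id, node, fingers[node][0]):
        nxt = next((f for f in reversed(fingers[node]) if in_open(f, node, key_id)),
                   fingers[node][0])       # no closer finger: fall back to successor
        node, hops = nxt, hops + 1
    return fingers[node][0], hops + 1      # the successor stores the key

ring = [5, 10, 20, 32, 60, 80, 99, 110]    # nodes from the figure
print(lookup(ring, start=80, key_id=19))   # -> (20, 3): K19 found at N20 in 3 hops
```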

Summary of Performance Results Efficient: O(log N) messages per lookup Scalable: O(log N) state per node Robust: survives massive membership changes 21

Possible Applications Distributed indexes Cooperative storage Distributed, flat lookup services 22

Joining the Chord Ring Nodes can join and leave at any time Challenge: Maintaining correct information about every key Three-step process: Initialize all fingers of new node Update fingers of existing nodes Transfer keys from successor to new node Two invariants: Each node's successor is maintained successor(k) is responsible for k (finger tables must also be correct for fast lookups) 23

Join: Initialize New Node's Finger Table Locate any node p in the ring Ask node p to look up the fingers of the new node [Figure: new node N36 joins a ring containing N5, N20, N40, N60, N80, N99; it asks for Lookup(37, 38, 40, ..., 100, 164) to fill its finger table] 24

Join: Update Fingers of Existing Nodes New node n calls an update function on existing nodes: n becomes the ith finger of node p if (1) p precedes n by at least 2^(i-1) and (2) the current ith finger of p succeeds n Existing nodes recursively update the fingers of their predecessors [Figure: N36 joins the ring N5, N20, N40, N60, N80, N99 and existing nodes update their fingers] 25

Join: Transfer Keys Only keys in the range (predecessor, new node] are transferred [Figure: when N36 joins between N20 and N40, keys 21..36 (here K30) are copied from N40 to N36; K38 remains at N40] 26
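
A small sketch of the transfer rule (node and key IDs taken from the figure): on a join, the new node takes over exactly the keys in (predecessor, new node], and nothing else moves.

```python
def in_interval(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)

def keys_to_transfer(pred_id: int, new_id: int, successor_keys: list[int]) -> list[int]:
    """Keys the successor hands to a newly joined node: those in (pred, new]."""
    return [k for k in successor_keys if in_interval(k, pred_id, new_id)]

# Figure's example: N36 joins between N20 and N40; N40 currently stores K30 and K38.
print(keys_to_transfer(20, 36, [30, 38]))   # -> [30]; K38 stays at N40
```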

Handling Failures Problem: Failures could cause incorrect lookups Solution: Fallback: keep track of successor fingers [Figure: Lookup(90) on a ring with nodes N10, N80, N85, N102, N113, N120; the failure of an intermediate node can break the lookup path] 27

Handling Failures Use successor list Each node knows r immediate successors After failure, will know first live successor Correct successors guarantee correct lookups Guarantee is with some probability Can choose r to make probability of lookup failure arbitrarily small 28
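
A sketch of the fallback itself (node IDs are illustrative): each node keeps a list of its r immediate successors and, when the first one fails, routes to the next live entry.

```python
def first_live_successor(successor_list: list[int], alive: set[int]) -> int | None:
    """Return the first successor that is still alive; None means all r entries
    failed at once, which r can be chosen to make arbitrarily unlikely."""
    for s in successor_list:
        if s in alive:
            return s
    return None

succ_of_n80 = [85, 99, 110]                 # r = 3 immediate successors of N80
alive = {10, 80, 99, 110, 120}              # N85 has failed
print(first_live_successor(succ_of_n80, alive))   # -> 99
```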

Chord: Questions Comparison to other DHTs Security concerns Workload imbalance Locality Search 29

Unstructured Overlays

BitTorrent Steps for publishing Peer creates torrent: contains metadata about the tracker and about the pieces of the file (checksum of each piece of the file). Peers that create the initial copy of the file are called seeders Steps for downloading Peer contacts tracker Peer downloads from the seeder, eventually from other peers Uses basic ideas from game theory to largely eliminate the free-rider problem Previous systems could not deal with this problem 31

Basic Idea Chop file into many pieces Replicate different pieces on different peers as soon as possible As soon as a peer has a complete piece, it can trade it with other peers Hopefully, assemble the entire file at the end 32

Basic Components Seed Peer that has the entire file Typically fragmented into 256KB pieces Leecher Peer that has an incomplete copy of the file Torrent file Passive component The torrent file lists SHA1 hashes of all the pieces to allow peers to verify integrity Typically hosted on a web server Tracker Allows peers to find each other Returns a random list of peers 33
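
The integrity mechanism is easy to sketch: the metadata records a SHA-1 hash per 256 KB piece, and a downloader checks each piece against it before trading the piece onward. This is a toy illustration, not the real .torrent (bencoded) format.

```python
import hashlib

PIECE_LEN = 256 * 1024                     # 256 KB pieces, as on the slide

def piece_hashes(data: bytes, piece_len: int = PIECE_LEN) -> list[bytes]:
    """Split the content into pieces and record the SHA-1 of each piece,
    as the torrent metadata does."""
    return [hashlib.sha1(data[i:i + piece_len]).digest()
            for i in range(0, len(data), piece_len)]

def verify_piece(index: int, piece: bytes, hashes: list[bytes]) -> bool:
    """A leecher checks a downloaded piece against the published hash
    before offering it to other peers."""
    return hashlib.sha1(piece).digest() == hashes[index]

content = b"x" * (3 * PIECE_LEN + 1000)    # made-up content: 3 full pieces + 1 partial
hashes = piece_hashes(content)
print(len(hashes), verify_piece(0, content[:PIECE_LEN], hashes))   # 4 True
```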

Pieces and Sub-Pieces A piece is broken into sub-pieces... Typically from 64kB to 1MB Policy: Until a piece is assembled, only download sub-pieces for that piece This policy lets complete pieces assemble quickly 34

Classic Prisoner's Dilemma [Payoff-matrix figure: mutual cooperation is the Pareto efficient outcome; mutual defection is the Nash equilibrium (and defecting is the dominant strategy for both players)] 35

Repeated Games Repeated game: play single-shot game repeatedly Subgame Perfect Equilibrium: Analog to NE for repeated games The strategy is an NE for every subgame of the repeated game Problem: a repeated game has many SPEs Single Period Deviation Principle (SPDP) can be used to test SPEs 36

Repeated Prisoner's Dilemma Example SPE: Tit-for-Tat (TFT) strategy Each player mimics the strategy of the other player in the last round Question: Use the SPDP to argue that TFT is an SPE. 37

Tit-for-Tat in BitTorrent: Choking Choking is a temporary refusal to upload; downloading occurs as normal If a node is unable to download from a peer, it does not upload to it Ensures that nodes cooperate and eliminates the free-rider problem Cooperation involves uploading sub-pieces that you have to your peers The connection is kept open 38

Choking Algorithm Goal is to have several bidirectional connections running continuously Upload to peers who have uploaded to you recently Unutilized connections are uploaded to on a trial basis to see if better transfer rates could be found using them 39

Choking Specifics A peer always unchokes a fixed number of its peers (default of 4) The decision to choke/unchoke is based on current download rates, evaluated on a rolling 20-second average The evaluation of whom to choke/unchoke is performed every 10 seconds This prevents wasting resources by rapidly choking/unchoking peers Supposedly enough for TCP to ramp up transfers to their full capacity Which peer is the optimistic unchoke is rotated every 30 seconds 40
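
A sketch of the selection step only, with made-up peer names and rates; maintaining the rolling 20-second rate estimates and the actual connection handling is assumed to happen elsewhere.

```python
import random

UNCHOKE_SLOTS = 4            # default number of regular unchokes
OPTIMISTIC_INTERVAL = 30     # seconds between optimistic-unchoke rotations

def choose_unchoked(download_rate: dict[str, float],
                    interested: set[str],
                    optimistic: str | None,
                    now: float,
                    last_rotation: float) -> tuple[set[str], str | None, float]:
    """Every 10 seconds: unchoke the top UNCHOKE_SLOTS peers by recent download
    rate, plus one optimistic unchoke rotated every OPTIMISTIC_INTERVAL seconds."""
    ranked = sorted(interested, key=lambda p: download_rate.get(p, 0.0), reverse=True)
    unchoked = set(ranked[:UNCHOKE_SLOTS])

    if optimistic is None or now - last_rotation >= OPTIMISTIC_INTERVAL:
        candidates = list(interested - unchoked)
        if candidates:
            optimistic, last_rotation = random.choice(candidates), now
    if optimistic is not None:
        unchoked.add(optimistic)
    return unchoked, optimistic, last_rotation

# Rates (KB/s) over the last 20 seconds; all peers are interested.
rates = {"A": 120.0, "B": 95.0, "C": 80.0, "D": 20.0, "E": 5.0, "F": 0.0}
print(choose_unchoked(rates, set(rates), None, now=0.0, last_rotation=0.0))
# -> top 4 (A, B, C, D) plus a randomly chosen optimistic unchoke (E or F)
```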

Rarest Piece First Policy: Determine the pieces that are most rare among your peers and download those first This ensures that the most common pieces are left till the end to download Rarest first also ensures that a large variety of pieces are downloaded from the seed (Question: Why is this important?) 41
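
A minimal sketch of the selection rule: count how many connected peers advertise each piece and download the rarest piece we are still missing (peer names and bitfields are made up).

```python
from collections import Counter

def rarest_first(have: set[int], peer_bitfields: dict[str, set[int]]) -> int | None:
    """Among pieces we are missing, pick the one held by the fewest peers
    (ties broken arbitrarily); None if no peer has anything we need."""
    availability = Counter()
    for pieces in peer_bitfields.values():
        availability.update(pieces)
    candidates = [(count, piece) for piece, count in availability.items()
                  if piece not in have]
    return min(candidates)[1] if candidates else None

peers = {"A": {0, 1, 2}, "B": {0, 2, 3}, "C": {0, 1, 2}}
print(rarest_first(have={0}, peer_bitfields=peers))   # -> 3, the piece only B holds
```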

Piece Selection The order in which pieces are selected by different peers is critical for good performance If a bad algorithm is used, we could end up in a situation where every peer has all the pieces that are currently available and none of the missing ones If the original seed is taken down, the file cannot be completely downloaded! 42

Random First Piece Initially, a peer has nothing to trade Important to get a complete piece ASAP Rare pieces are typically available at fewer peers, so downloading a rare piece initially is not a good idea Policy: Select a random piece of the file and download it 43

Endgame Mode When all the sub-pieces that a peer doesn't have are actively being requested, these are requested from every peer Redundant requests are cancelled when the sub-piece arrives Ensures that a single peer with a slow transfer rate doesn't prevent the download from completing 44
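
A sketch of the endgame bookkeeping (the data structures are hypothetical): once every missing sub-piece is already requested from someone, fan the requests out to all peers, and cancel the redundant copies as soon as one arrives.

```python
def enter_endgame(missing: set[tuple[int, int]],
                  outstanding: dict[tuple[int, int], set[str]],
                  peers: set[str]) -> bool:
    """Endgame trigger: every (piece, sub-piece) we still need is already
    requested from at least one peer; if so, request each from every peer.
    (This toy version ignores which peers actually have which pieces.)"""
    if not all(outstanding.get(sp) for sp in missing):
        return False
    for sp in missing:
        outstanding[sp] = set(peers)           # redundant requests
    return True

def on_subpiece_received(sp: tuple[int, int], sender: str,
                         outstanding: dict[tuple[int, int], set[str]]) -> set[str]:
    """When a copy arrives, return the peers whose redundant requests to cancel."""
    return outstanding.pop(sp, set()) - {sender}

pending = {(7, 0): {"A"}, (7, 1): {"B"}}       # two sub-pieces left, both requested
print(enter_endgame({(7, 0), (7, 1)}, pending, {"A", "B", "C"}))   # True
print(on_subpiece_received((7, 0), "C", pending))                  # cancel at A and B
```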

Questions Peers going offline when download completes Integrity of downloads 45

Distributing Content: Coding

Digital Fountains Analogy: water fountain Doesn't matter which bits of water you get Hold the glass out until it is full Ideal: Infinite stream Practice: Approximate, using erasure codes Reed-Solomon Tornado codes (faster, slightly less efficient) 47
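
Real digital fountains use Reed-Solomon or Tornado/LT codes; the toy sketch below shows the erasure-coding idea in its simplest possible form, a single XOR parity block that lets a receiver rebuild any one lost block from the survivors.

```python
def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

data = [b"spam", b"eggs", b"ham!"]          # k = 3 equal-length source blocks
parity = xor_blocks(data)                   # one redundant block (a trivial code)

# "Hold out the glass": any k of the k+1 blocks suffice to rebuild the data.
lost = 1
survivors = [blk for i, blk in enumerate(data) if i != lost] + [parity]
print(xor_blocks(survivors))                # -> b'eggs', the lost block
```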

Applications Reliable multicast Parallel downloads Long-distance transmission (avoiding TCP) One-to-many TCP Content distribution on overlay networks Streaming video 48

Point-to-Point Data Transmission TCP has problems over long-distance connections. Packets must be acknowledged to increase the sending window (packets in flight). Long round-trip times lead to slow acks, bounding the transmission window. Any loss compounds the problem. Using a digital fountain + TCP-friendly congestion control can greatly speed up connections. Separates "what you send" from "how much you send". No need to buffer for retransmission. 49

Other Applications Other possible applications outside of networking Storage systems Digital fountain codes for errors?? 50