Peer-to-Peer Applications Reading: 9.4


Acknowledgments: Lecture slides are from the Computer Networks course taught by Jennifer Rexford at Princeton University. When slides are obtained from other sources, a reference is noted at the bottom of that slide, with full reference details on the last slide.

Goals of Today's Lecture
- Scalability in distributing a large file
  - Single server and N clients
  - Peer-to-peer system with N peers
- Searching for the right peer
  - Central directory (Napster)
  - Query flooding (Gnutella)
  - Hierarchical overlay (Kazaa)
- BitTorrent
  - Transferring large files
  - Preventing free-riding

Clients and Servers
- Client program
  - Running on end host
  - Requests service
  - E.g., Web browser (sends "GET /index.html")
- Server program
  - Running on end host
  - Provides service
  - E.g., Web server (replies "Site under construction")

Client-Server Communication
- Client is sometimes on
  - Initiates a request to the server when interested
  - E.g., Web browser on your laptop or cell phone
  - Doesn't communicate directly with other clients
  - Needs to know the server's address
- Server is always on
  - Services requests from many client hosts
  - E.g., Web server for the www.cnn.com Web site
  - Doesn't initiate contact with the clients
  - Needs a fixed, well-known address

[Figure: Server distributing a large file of F bits over the Internet. The server has upload rate $u_s$; receivers 1 to 4 have download rates $d_1$ to $d_4$.]

Server Distributing a Large File
- Server sending a large file to N receivers
  - Large file with $F$ bits
  - Single server with upload rate $u_s$
  - Download rate $d_i$ for receiver $i$
- Server transmission to N receivers
  - Server needs to transmit $NF$ bits
  - Takes at least $NF/u_s$ time
- Receiving the data
  - Slowest receiver receives at rate $d_{\min} = \min_i \{d_i\}$
  - Takes at least $F/d_{\min}$ time
- Download time: $\max\{NF/u_s,\ F/d_{\min}\}$

Speeding Up the File Distribution
- Increase the upload rate from the server
  - Higher link bandwidth at the one server
  - Multiple servers, each with their own link
  - Requires deploying more infrastructure
- Alternative: have the receivers help
  - Receivers get a copy of the data
  - And then redistribute the data to other receivers
  - To reduce the burden on the server

[Figure: Peers helping to distribute a large file of F bits over the Internet. The server has upload rate $u_s$; peers 1 to 4 have upload rates $u_1$ to $u_4$ and download rates $d_1$ to $d_4$.]

Peers Help Distributing a Large File
- Start with a single copy of a large file
  - Large file with $F$ bits and server upload rate $u_s$
  - Peer $i$ with download rate $d_i$ and upload rate $u_i$
- Two components of distribution latency
  - Server must send each bit: min time $F/u_s$
  - Slowest peer receives each bit: min time $F/d_{\min}$
- Total upload time using all upload resources
  - Total number of bits: $NF$
  - Total upload bandwidth: $u_s + \sum_i u_i$
- Total: $\max\{F/u_s,\ F/d_{\min},\ NF/(u_s + \sum_i u_i)\}$

Comparing the Two Models
- Download time
  - Client-server: $\max\{NF/u_s,\ F/d_{\min}\}$
  - Peer-to-peer: $\max\{F/u_s,\ F/d_{\min},\ NF/(u_s + \sum_i u_i)\}$
- Peer-to-peer is self-scaling
  - Much lower demands on server bandwidth
  - Distribution time grows only slowly with N
- But
  - Peers may come and go
  - Peers need to find each other
  - Peers need to be willing to help each other
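The two lower bounds can be evaluated numerically to see the self-scaling effect. The sketch below is only an illustration: the function names and the sample rates are assumptions, not part of the lecture.

```python
def client_server_time(F, u_s, d):
    """Lower bound on distribution time with a single server.

    F: file size in bits, u_s: server upload rate (bits/s),
    d: list of receiver download rates (bits/s).
    """
    N = len(d)
    return max(N * F / u_s, F / min(d))


def p2p_time(F, u_s, d, u):
    """Lower bound on distribution time when peers also upload.

    u: list of peer upload rates (bits/s), same length as d.
    """
    N = len(d)
    return max(F / u_s, F / min(d), N * F / (u_s + sum(u)))


if __name__ == "__main__":
    # Hypothetical numbers: the client-server bound grows linearly in N,
    # the peer-to-peer bound grows much more slowly.
    for n in (10, 100, 1000):
        cs = client_server_time(1000, 100, [50] * n)
        pp = p2p_time(1000, 100, [50] * n, [20] * n)
        print(n, cs, round(pp, 2))
```

With equal peer upload rates, the peer-to-peer bound approaches F divided by the per-peer upload rate as N grows, instead of growing without limit.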

Challenges of Peer-to-Peer
- Peers come and go
  - Peers are intermittently connected
  - May come and go at any time
  - Or come back with a different IP address
- How to locate the relevant peers?
  - Peers that are online right now
  - Peers that have the content you want
- How to motivate peers to stay in system?
  - Why not leave as soon as download ends?
  - Why bother uploading content to anyone else?

Locating the Relevant Peers
- Three main approaches
  - Central directory (Napster)
  - Query flooding (Gnutella)
  - Hierarchical overlay (Kazaa, modern Gnutella)
- Design goals
  - Scalability
  - Simplicity
  - Robustness
  - Plausible deniability

Peer-to-Peer Networks: Napster
- Napster history: the rise
  - Created by Shawn Fanning, a Northeastern freshman
  - January 1999: Napster version 1.0
  - May 1999: company founded
  - December 1999: first lawsuits
  - 2000: 80 million users
- Napster history: the fall
  - Mid 2001: out of business due to lawsuits
  - Mid 2001: dozens of P2P alternatives that were harder to touch, though these have gradually been constrained
  - 2003: growth of pay services like iTunes
- Napster history: the resurrection
  - 2003: Napster name/logo reconstituted as a pay service

Napster Technology: Directory Service
- User installing the software
  - Download the client program
  - Register name, password, local directory, etc.
- Client contacts Napster (via TCP)
  - Provides a list of music files it will share
  - Napster's central server updates the directory
- Client searches on a title or performer
  - Napster identifies online clients with the file
  - And provides their IP addresses
- Client requests the file from the chosen supplier
  - Supplier transmits the file to the client
  - Both client and supplier report status to Napster
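The central-directory idea can be sketched as an in-memory index from titles to the peers that currently share them. This is a toy model under stated assumptions: the class name, method names, and address strings are hypothetical, and the real Napster protocol was proprietary and is not reproduced here.

```python
from collections import defaultdict


class CentralDirectory:
    """Toy central index: which online peers share which titles."""

    def __init__(self):
        self._titles_by_peer = {}                 # peer address -> set of titles
        self._peers_by_title = defaultdict(set)   # title -> set of peer addresses

    def register(self, peer, titles):
        """Called when a client connects and lists the files it will share."""
        self.unregister(peer)  # replace any earlier listing for this peer
        normalized = {t.lower() for t in titles}
        self._titles_by_peer[peer] = normalized
        for title in normalized:
            self._peers_by_title[title].add(peer)

    def unregister(self, peer):
        """Called when a client disconnects."""
        for title in self._titles_by_peer.pop(peer, set()):
            self._peers_by_title[title].discard(peer)

    def search(self, title):
        """Return addresses of online peers sharing the title."""
        return sorted(self._peers_by_title.get(title.lower(), set()))
```

The file transfer itself would then happen directly between the searching client and one of the returned addresses, without passing through the directory.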

Napster Technology: Properties
- Server's directory continually updated
  - Always knows what music is currently available
  - Point of vulnerability for legal action
- Peer-to-peer file transfer
  - No load on the server
  - Plausible deniability for legal action (but not enough)
- Proprietary protocol
  - Login, search, upload, download, and status operations
  - No security: cleartext passwords and other vulnerabilities
- Bandwidth issues
  - Suppliers ranked by apparent bandwidth and response time

Napster: Limitations of Central Directory
- Single point of failure
- Performance bottleneck
- Copyright infringement
- File transfer is decentralized, but locating content is highly centralized
- So, later P2P systems were more distributed
  - Gnutella went to the other extreme

Peer-to-Peer Networks: Gnutella
- Gnutella history
  - 2000: J. Frankel and T. Pepper released Gnutella
  - Soon after: many other clients (e.g., Morpheus, Limewire, Bearshare)
  - 2001: protocol enhancements, e.g., ultrapeers
- Query flooding
  - Join: contact a few nodes to become neighbors
  - Publish: no need!
  - Search: ask neighbors, who ask their neighbors
  - Fetch: get file directly from another node

Gnutella: Query Flooding
- Fully distributed
  - No central server
- Public domain protocol
  - Many Gnutella clients implementing the protocol
- Overlay network: graph
  - Edge between peers X and Y if there's a TCP connection between them
  - All active peers and edges together form the overlay network
  - A given peer will typically be connected to fewer than 10 overlay neighbors

Gnutella: Protocol
- Query message sent over existing TCP connections
- Peers forward Query message
- QueryHit sent over reverse path
- File transfer: HTTP
- Scalability: limited scope flooding
[Figure: Query messages propagating hop by hop through the overlay, with QueryHit replies returning along the reverse path.]
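Limited-scope flooding can be sketched as a breadth-first traversal of the overlay graph that stops after a fixed number of hops. The function below is a simplified model, not the Gnutella wire protocol: the names `flood_query`, `overlay`, `files`, and `ttl` are assumptions, duplicate suppression is done with a shared `seen` set rather than per-message IDs, and hits are returned directly instead of travelling back along the reverse path.

```python
from collections import deque


def flood_query(overlay, files, origin, wanted, ttl):
    """Flood a query from `origin` up to `ttl` overlay hops away.

    overlay: dict mapping each peer to a list of its neighbors.
    files:   dict mapping a peer to the set of file names it holds.
    Returns the peers (other than origin) that hold `wanted`,
    in the order the query reaches them.
    """
    hits = []
    seen = {origin}                    # peers that already got the query
    frontier = deque([(origin, ttl)])  # (peer, remaining hops)
    while frontier:
        peer, hops_left = frontier.popleft()
        if peer != origin and wanted in files.get(peer, ()):
            hits.append(peer)          # this peer would send a QueryHit
        if hops_left == 0:
            continue                   # scope limit reached, do not forward
        for neighbor in overlay.get(peer, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops_left - 1))
    return hits
```

A small hop limit keeps the message count bounded, at the cost of missing files that sit just outside the search horizon.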

Gnutella: Peer Joining
- Joining peer X must find some other peers
  - Start with a list of candidate peers
  - X sequentially attempts TCP connections with peers on the list until a connection is set up with some peer Y
- X sends Ping message to Y
  - Y forwards Ping message
  - All peers receiving Ping message respond with Pong message
- X receives many Pong messages
  - X can then set up additional TCP connections

Gnutella: Pros and Cons
- Advantages
  - Fully decentralized
  - Search cost distributed
  - Processing per node permits powerful search semantics
- Disadvantages
  - Search scope may be quite large
  - Search time may be quite long
  - High overhead, and nodes come and go often

Peer-to-Peer Networks: KaZaA
- KaZaA history
  - 2001: created by Dutch company (Kazaa BV)
  - Single network called FastTrack used by other clients as well
  - Eventually the protocol changed so other clients could no longer talk to it
- Smart query flooding
  - Join: on start, the client contacts a super-node (and may later become one)
  - Publish: client sends list of files to its super-node
  - Search: send query to super-node, and the super-nodes flood queries among themselves
  - Fetch: get file directly from peer(s); can fetch from multiple peers at once

KaZaA: Exploiting Heterogeneity
- Each peer is either a group leader or assigned to a group leader
  - TCP connection between peer and its group leader
  - TCP connections between some pairs of group leaders
- Group leader tracks the content in all its children
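The two-level structure can be sketched with a group leader (super-node) that indexes its children's files and forwards queries only to other group leaders. This is an illustrative model with hypothetical names (`SuperNode`, `publish`, `search`); the actual FastTrack protocol was proprietary and is not represented here.

```python
class SuperNode:
    """Toy group leader: indexes its children's files and queries its peers."""

    def __init__(self, name):
        self.name = name
        self.index = {}       # file name -> set of child peers holding it
        self.neighbors = []   # other SuperNode objects it is connected to

    def publish(self, peer, file_names):
        """A child peer reports the files it shares."""
        for file_name in file_names:
            self.index.setdefault(file_name, set()).add(peer)

    def search(self, file_name, ttl=1, seen=None):
        """Answer from the local index, then flood among super-nodes only."""
        seen = set() if seen is None else seen
        seen.add(self.name)
        hits = set(self.index.get(file_name, ()))
        if ttl > 0:
            for super_node in self.neighbors:
                if super_node.name not in seen:
                    hits |= super_node.search(file_name, ttl - 1, seen)
        return hits
```

Because ordinary peers never receive queries in this model, the flood touches far fewer nodes than in a flat overlay of the same size.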

KaZaA: Motivation for Super-Nodes
- Query consolidation
  - Many connected nodes may have only a few files
  - Propagating a query to a sub-node may take more time than for the super-node to answer itself
- Stability
  - Super-node selection favors nodes with high up-time
  - How long you've been on is a good predictor of how long you'll be around in the future

Peer-to-Peer Networks: BitTorrent
- BitTorrent history and motivation
  - 2002: B. Cohen debuted BitTorrent
  - Key motivation: popular content
    - Popularity exhibits temporal locality (flash crowds)
    - E.g., Slashdot/Digg effect, CNN Web site on 9/11, release of a new movie or game
  - Focused on efficient fetching, not searching
    - Distribute same file to many peers
    - Single publisher, many downloaders
  - Preventing free-loading

BitTorrent: Simultaneous Downloading
- Divide large file into many pieces
  - Replicate different pieces on different peers
  - A peer with a complete piece can trade with other peers
  - Peer can (hopefully) assemble the entire file
- Allows simultaneous downloading
  - Retrieving different parts of the file from different peers at the same time
  - And uploading parts of the file to peers
  - Important for very large files

BitTorrent: Tracker
- Infrastructure node
  - Keeps track of peers participating in the torrent
- Peers register with the tracker
  - Peer registers when it arrives
  - Peer periodically informs tracker it is still there
- Tracker selects peers for downloading
  - Returns a random set of peers
  - Including their IP addresses
  - So the new peer knows who to contact for data
- Can have trackerless system using DHT
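The tracker's role can be sketched as a table of recently seen peers from which a random subset is returned on each announce. This is an in-memory toy, not the HTTP tracker protocol: the class name, the `timeout` and `max_peers` defaults, and the explicit `now` parameter (used to make the behavior easy to test) are all assumptions.

```python
import random
import time


class Tracker:
    """Toy tracker for a single torrent."""

    def __init__(self, timeout=1800.0, max_peers=50):
        self.timeout = timeout      # seconds before a silent peer is dropped (assumed value)
        self.max_peers = max_peers  # size of the random peer set returned (assumed value)
        self.last_seen = {}         # (ip, port) -> time of last announce

    def announce(self, peer, now=None):
        """Register or refresh `peer` and return a random set of other peers."""
        now = time.time() if now is None else now
        # Forget peers that have not announced within the timeout.
        self.last_seen = {
            p: t for p, t in self.last_seen.items() if now - t <= self.timeout
        }
        others = [p for p in self.last_seen if p != peer]
        self.last_seen[peer] = now
        return random.sample(others, min(self.max_peers, len(others)))
```

Returning a random subset spreads new arrivals across the swarm instead of pointing everyone at the same few peers.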

BitTorrent: Chunks
- Large file divided into smaller pieces
  - Fixed-sized chunks
  - Typical chunk size of 256 Kbytes
- Allows simultaneous transfers
  - Downloading chunks from different neighbors
  - Uploading chunks to other neighbors
- Learning what chunks your neighbors have
  - Periodically asking them for a list
- File done when all chunks are downloaded
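The chunk bookkeeping can be sketched with byte ranges and a set of chunk indices already held. The helper names below are hypothetical; only the 256 Kbyte chunk size comes from the slide.

```python
CHUNK_SIZE = 256 * 1024  # typical chunk size from the slide, in bytes


def chunk_ranges(file_size, chunk_size=CHUNK_SIZE):
    """Return (start, end) byte offsets for each fixed-size chunk.

    The last chunk may be shorter than chunk_size.
    """
    return [
        (offset, min(offset + chunk_size, file_size))
        for offset in range(0, file_size, chunk_size)
    ]


def missing_chunks(have, total):
    """Indices of chunks not yet downloaded, given the set `have`."""
    return [index for index in range(total) if index not in have]


def is_complete(have, total):
    """The file is done when every chunk index is present."""
    return len(missing_chunks(have, total)) == 0
```

For example, a 600 Kbyte file splits into two full chunks and one shorter final chunk.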

[Figure sequence: BitTorrent overall architecture, shown as a series of animation frames. Components: a Web server hosting a Web page with a link to the .torrent file, a tracker, and three nodes labeled A, B, and C (node labels: Peer [Leech], Downloader "US"; Peer [Leech]; Peer [Seed]). Steps shown in order: the downloader obtains the .torrent file via the Web page; the downloader sends a get-announce to the tracker; the tracker responds with a peer list; the downloader shakes hands with the other peers; pieces are exchanged among the downloader, the leech, and the seed; the downloader repeats the get-announce and response-peer-list exchange with the tracker while pieces continue to flow.]

BitTorrent: Chunk Request Order
- Which chunks to request?
  - Could download in order
  - Like an HTTP client does
- Problem: many peers have the early chunks
  - Peers have little to share with each other
  - Limiting the scalability of the system
- Problem: eventually nobody has rare chunks
  - E.g., the chunks near the end of the file
  - Limiting the ability to complete a download
- Solutions: random selection and rarest first

BitTorrent: Rarest Chunk First
- Which chunks to request first?
  - The chunk with the fewest available copies
  - I.e., the rarest chunk first
- Benefits to the peer
  - Avoid starvation when some peers depart
- Benefits to the system
  - Avoid starvation across all peers wanting a file
  - Balance load by equalizing the number of copies of chunks
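Rarest-first selection can be sketched by counting, for each chunk still needed, how many neighbors advertise it and picking one with the lowest count. The function name and the random tie-break are assumptions for illustration; real clients add further rules that are not modeled here.

```python
import random
from collections import Counter


def pick_rarest(needed, neighbor_chunks, rng=random):
    """Pick a needed chunk held by the fewest neighbors.

    needed:          set of chunk indices this peer still lacks.
    neighbor_chunks: dict mapping neighbor -> set of chunk indices it has.
    Returns None if no neighbor has any needed chunk.
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks & needed)   # count only chunks we still need
    if not counts:
        return None
    fewest = min(counts.values())
    candidates = sorted(c for c, n in counts.items() if n == fewest)
    return rng.choice(candidates)        # break ties at random
```

Breaking ties at random keeps peers in the same neighborhood from all requesting the same rare chunk at once.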

Free-Riding Problem in P2P Networks
- Vast majority of users are free-riders
  - Most share no files and answer no queries
  - Others limit the number of connections or upload speed
- A few peers essentially act as servers
  - A few individuals contributing to the public good
  - Making them hubs that basically act as a server
- BitTorrent prevents free-riding
  - Allow the fastest peers to download from you
  - Occasionally let some free-loaders download

BitTorrent: Preventing Free-Riding
- Peer has limited upload bandwidth
  - And must share it among multiple peers
- Prioritizing the upload bandwidth: tit for tat
  - Favor neighbors that are uploading at the highest rate
- Rewarding the top four neighbors
  - Measure download bit rates from each neighbor
  - Reciprocate by sending to the top four peers
  - Recompute and reallocate every 10 seconds
- Optimistic unchoking
  - Randomly try a new neighbor every 30 seconds
  - So the new neighbor has a chance to be a better partner
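The tit-for-tat selection can be sketched as two small functions: one ranks neighbors by measured download rate and keeps the top four, the other picks a random extra neighbor for the optimistic slot. Names are hypothetical, and the timers are left to the caller, who would invoke the first function every 10 seconds and the second every 30 seconds.

```python
import random


def choose_unchoked(download_rates, top_n=4, optimistic=None):
    """Return the neighbors to upload to.

    download_rates: dict mapping neighbor -> measured download rate from it.
    optimistic:     optional extra neighbor to unchoke regardless of rate.
    """
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    unchoked = ranked[:top_n]            # reciprocate with the best uploaders
    if optimistic is not None and optimistic not in unchoked:
        unchoked.append(optimistic)      # give one other neighbor a chance
    return unchoked


def pick_optimistic(download_rates, regular, rng=random):
    """Randomly pick a neighbor outside the regular top-N set, if any."""
    others = [peer for peer in download_rates if peer not in regular]
    return rng.choice(others) if others else None
```

The optimistic slot is what lets a newcomer with nothing to offer yet receive its first chunks and start reciprocating.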

BitTyrant: Gaming BitTorrent
- BitTorrent can be gamed, too
  - Peer uploads to its top N peers, giving each 1/N of its upload rate
  - E.g., if N=4 and peers upload at 15, 12, 10, 9, 8, 3, then the peer uploading at rate 9 gets treated quite well
- Best to be the Nth peer in the list, rather than the 1st
  - Offer just a bit more bandwidth than the low-rate peers
  - But not as much as the higher-rate peers
  - And you'll still be treated well by others
- BitTyrant software
  - http://bittyrant.cs.washington.edu/
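The strategic observation can be sketched as a one-line calculation: given the rates the other peers offer to a target, the smallest rate that still lands in the target's top N is just above the Nth-highest competitor. This is only an illustration of the slide's example, not the BitTyrant algorithm itself; the function name and the `margin` parameter are assumptions.

```python
def min_rate_to_be_unchoked(competitor_rates, top_n=4, margin=1):
    """Smallest upload rate that places us in a target peer's top N.

    competitor_rates: rates the OTHER peers upload to the target.
    margin:           how much to exceed the rate we must beat (assumed unit).
    """
    ranked = sorted(competitor_rates, reverse=True)
    if len(ranked) < top_n:
        return 0                         # fewer than N rivals: any rate gets a slot
    return ranked[top_n - 1] + margin    # just beat the Nth-highest rival
```

In the slide's example, the peer at rate 9 competes against 15, 12, 10, 8, and 3, so anything just above 8 is enough; uploading at 15 would earn the same 1/N share.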

BitTorrent Today
- Significant fraction of Internet traffic
  - Estimated at 30%
  - Though this is hard to measure
- Problem of incomplete downloads
  - Peers leave the system when done
  - Many file downloads never complete
  - Especially a problem for less popular content
- Still lots of legal questions remain
- Further need for incentives

Conclusions
- Peer-to-peer networks
  - Nodes are end hosts
  - Primarily for file sharing, and recently telephony
- Finding the appropriate peers
  - Centralized directory (Napster)
  - Query flooding (Gnutella)
  - Super-nodes (KaZaA)
- BitTorrent
  - Distributed download of large files
  - Anti-free-riding techniques
- Great example of how change can happen so quickly in application-level protocols