Welcome to CS 3516: Advanced Computer Networks Prof. Yanhua Li Time: 9:00am 9:50am M, T, R, and F Location: Fuller 320 Fall 2017 A-term 1 Some slides are originally from the course materials of the textbook Computer Networking: A Top Down Approach, 7th edition, by Jim Kurose, Keith Ross, Addison-Wesley March 2016. Copyright 1996-2017 J.F Kurose and K.W. Ross, All Rights Reserved.
Chapter 2: outline 2.6 P2P applications 1. P2P vs Client&Server 2. Unstructured -to- Networks BitTorrent 3. Structured -to- Networks Distributed Hash Table (DHT) Application Layer 2-2
Client-server vs P2P architecture peer-to-peer client/server Application Layer 2-3
-to- Networks: v Unstructured -to- Networks Napster Gnutella BitTorrent 4
Chapter 2: outline 2.6 P2P applications 1. P2P vs Client&Server 2. Unstructured -to- Networks Napster Gnutella BitTorrent Application Layer 2-5
-to- Networks: How Did it Start? v A killer application: Napster Free music over the Internet v Key idea: share the content, storage and bandwidth of individual (home) users Internet 6
Model v Each user stores a subset of files v Each user has access (can download) files from all users in the system Challenges v Scale: up to hundred of thousands or millions of machines v Dynamicity: machines can come and go any time 7
Main Challenge v Find where a particular file is stored F E D E? A B C 8
Napster: Example m5 m6 E F E? E E? m5 m1 A m2 B m3 C m4 D m5 E m6 F m4 D m1 A m2 B m3 C 9
-to- Networks: Napster v Napster history: the rise January 1999: Napster version 1.0 May 1999: company founded September 1999: first lawsuits 2000: 80 million users v Napster history: the fall Mid 2001: out of business due to lawsuits Mid 2001: dozens of P2P alternatives that were harder to touch, though these have gradually been constrained 2003: growth of pay services like itunes v Napster history: the resurrection 2003: Napster reconstituted as a pay service 2006: still lots of file sharing going on Shawn Fanning, Northeastern freshman 10
Napster Technology: Directory Service v User installing the software Download the client program Register name, password, local directory, etc. v Client contacts Napster (via TCP) Provides a list of music files it will share and Napster s central server updates the directory v Client searches on a title or performer Napster identifies online clients with the file and provides IP addresses v Client requests the file from the chosen supplier Supplier transmits the file to the client Both client and supplier report status to Napster 11
Napster v Assume a centralized index system that maps files (songs) to machines that are alive v How to find a file (song) Query the index system à return a machine that stores the required file Ideally this is the closest/least-loaded machine ftp the file v Advantages: Simplicity, easy to implement sophisticated search engines on top of the index system v Disadvantages: Robustness, scalability 12
Napster Technology: Properties v Server s directory continually updated Always know what music is currently available v -to-peer file transfer No load on the server v As a protocol Login, search, upload, download, and status operations No security: cleartext passwords and other vulnerability v Bandwidth issues Suppliers ranked by apparent bandwidth & response time 13
Napster: Example m5 m6 E F E? E m1 A m2 B m3 C m4 D m1 A E? m5 m4 D m5 E m6 F m2 B m3 C 14
Napster: Limitations of Central Directory v Single point of failure v Performance bottleneck v Copyright infringement File transfer is decentralized, but locating content is highly centralized v So, later P2P systems were more distributed 15
-to- Networks: Gnutella v Gnutella history 2000: J. Frankel & T. Pepper released Gnutella Soon after: many other clients (e.g., Morpheus, Limewire, Bearshare) 2001: protocol enhancements, e.g., ultrapeers v Query flooding Join: contact a few nodes to become neighbors Publish: no need! Search: ask neighbors, who ask their neighbors Fetch: get file directly from another node 16
Gnutella v Distribute file location v Idea: flood the request v How to find a file: Send request to all neighbors Neighbors recursively multicast the request Eventually a machine that has the file receives the request, and it sends back the answer v Advantages: Totally decentralized, highly robust v Disadvantages: Not scalable; the entire network can be swamped with request (to alleviate this problem, each request has a TTL) 17
Gnutella v Ad-hoc topology v Queries are flooded for bounded number of hops v No guarantees on recall xyz xyz Query: xyz 18
Gnutella: Query Flooding v Fully distributed No central server v Public domain protocol v Many Gnutella clients implementing protocol Overlay network: graph v Edge between peer X and Y if there s a TCP connection v All active peers and edges is overlay net 19
Gnutella: Protocol Query message sent over existing TCP connections s forward Query message QueryHit sent over reverse path Scalability: limited scope flooding Query QueryHit Query QueryHit File transfer: HTTP 20
Gnutella: Pros and Cons v Advantages Fully decentralized, v Disadvantages Search scope may be quite large Search time may be quite long High overhead and nodes come and go often 21
Chapter 2: outline 2.6 P2P applications 1. P2P vs Client&Server 2. Unstructured -to- Networks BitTorrent Application Layer 2-22
BitTorrent: Simultaneous Downloading v Divide large file into many pieces Replicate different pieces on different peers A peer with a complete piece can trade with other peers can (hopefully) assemble the entire file v Allows simultaneous downloading Retrieving different parts of the file from different peers at the same time 23
BitTorrent Components v Seed with entire file Fragmented in pieces v Leech with an incomplete copy of the file v Torrent file Passive component Stores summaries of the pieces to allow peers to verify their integrity v Tracker Allows peers to find each other Returns a list of random peers 24
BitTorrent: Overall Architecture Web page with link to.torrent Web Server Tracker.torrent A [Leech] Downloader US B [Leech] C [Seed] 25
BitTorrent: Overall Architecture Web page with link to.torrent Web Server Tracker A [Leech] Downloader US B [Leech] C [Seed] 26
BitTorrent: Overall Architecture Web page with link to.torrent Web Server Tracker A [Leech] Downloader US B [Leech] C [Seed] 27
BitTorrent: Overall Architecture Web page with link to.torrent Web Server Tracker A [Leech] Downloader US Shake-hand B [Leech] C [Seed] 28
BitTorrent: Overall Architecture Web page with link to.torrent Web Server Tracker A [Leech] Downloader US B [Leech] pieces C [Seed] 29
BitTorrent: Overall Architecture Web page with link to.torrent Web Server Tracker A [Leech] Downloader US B [Leech] pieces C [Seed] 30
BitTorrent: Overall Architecture Web page with link to.torrent Web Server Tracker A [Leech] Downloader US B [Leech] pieces C [Seed] 31
Free-Riding Problem in P2P Networks v Vast majority of users are free-riders Most share no files and answer no queries Others limit # of connections or upload speed v A few peers essentially act as servers A few individuals contributing to the public good Making them hubs that basically act as a server v BitTorrent prevent free riding Allow the fastest peers to download from you Occasionally let some free loaders download 32
BitTorrent: requesting, sending file chunks requesting chunks: v at any given time, different peers have different subsets of file chunks v periodically, Alice asks each peer for list of chunks that they have v Alice requests missing chunks from peers, rarest first sending chunks: tit-for-tat v Alice sends chunks to those four peers currently sending her chunks at highest rate other peers are choked by Alice (do not receive chunks from her) re-evaluate top 4 every10 secs v every 30 secs: randomly select another peer, starts sending chunks optimistically unchoke this peer newly chosen peer may join top 4 Application Layer 2-33
BitTorrent: tit-for-tat (1) Alice optimistically unchokes Bob (2) Alice becomes one of Bob s top-four providers; Bob reciprocates (3) Bob becomes one of Alice s top-four providers higher upload rate: find better trading partners, get file faster! Application Layer 2-34
Chapter 2: summary our study of network apps now complete! v application architectures client-server P2P v application service requirements: reliability, throughput, delay, security v Internet transport service model connection-oriented, reliable: TCP unreliable, datagrams: UDP v specific protocols: HTTP SMTP, POP, IMAP DNS P2P: BitTorrent Application Layer 2-35
Quiz 5 on Friday P2P networks Napster Gnutella BitTorrent Mid-term next Friday (Tentative) Application Layer 2-36
Questions? Application Layer 2-37