Peer-to-Peer Networks


14-740: Fundamentals of Computer Networks
Bill Nace
Material from Computer Networking: A Top-Down Approach, 6th edition. J.F. Kurose and K.W. Ross

Administrivia
- Quiz #1 is next week (25 Sep)
  - Covers all material up to and including Queueing Theory
  - Web site: Study Guide, Equation Sheet
- HW #1 is posted (due 2 Oct)
- Lab #1 is posted (due 4 Oct)
- TAs are here to help! Ask them questions!

traceroute
- P2P Overview
- Architecture components
- Napster (Centralized)
- Gnutella (Distributed)
- Skype and KaZaA (Hybrid, Hierarchical)
- KaZaA Reverse Engineering Study

What is P2P?
- Client / Server interaction
  - Client: any end-host
  - Server: specific end-host
- P2P: Peer-to-peer
  - Any end-host

Aim to leverage resources available on clients (peers)
- Hard drive space
- Bandwidth (especially upload)
- Computational power
- Anonymity (e.g. zombie botnets)
- Edge-ness (i.e. being distributed at network edges)

Clients are particularly fickle
- Users have not agreed to provide any particular level of service
- Users are not altruistic; the algorithm must force participation without allowing cheating
- Clients are not trusted
  - Client code may be modified
- And yet, availability of resources must be assured

P2P History
- Proto-P2P systems exist
  - DNS, Netnews/Usenet
  - Xerox Grapevine (~1982): name, mail delivery service
- Kicked into high gear in 1999
  - Many users had always-on broadband net connections
- 1st Generation: Napster (music exchange)
- 2nd Generation: Freenet, Gnutella, Kazaa, BitTorrent
  - More scalable, designed for anonymity, fault-tolerant
- 3rd Generation: Middleware (Pastry, Chord)
  - Provide for overlay routing to place/find resources

P2P Architecture
- Content Directory
  - Database of content
  - Structured? Unstructured?
  - Which peer has what files?
  - Metadata: other info about files
- Signaling protocol
  - How do peers exchange coordination messages?
  - Proprietary? Encrypted?

Architecture (2)
- File transfer
  - How does a peer retrieve a file from another peer?
  - HTTP or HTTP-like
  - Any peer must be able to send reply messages

Overlay network is not the network
- Overlay networks are formed on top of the network graph
- Connect peers via abstract links in the overlay
- Transport accomplished on network edges
- Overlay algorithms abstract particulars of the network
[Figure: protocol stack. The P2P Application and its Overlay Network sit in the Application layer, above Transport, Network, Data Link, and Physical. Annotations: "one edge"; "perhaps even built on HTTP for transport!"]


Napster
Original centralized design
1. When a peer connects, it informs the central server of its IP address and content
2. Marcia queries for "I Like It"
   - Server looks through its index
   - Reply: Daichi has "I Like It"
3. Marcia requests the file from Daichi
[Figure: peers (including Daichi and Marcia) around a centralized directory server, with arrows labeled by the step numbers above]
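The three steps above can be sketched as a toy in-memory index. This is only an illustration under stated assumptions: the class name `DirectoryServer`, its method names, and the IP addresses are hypothetical, and nothing here reflects Napster's actual protocol or message formats.

```python
from collections import defaultdict


class DirectoryServer:
    """Toy centralized directory: maps a title to the peers that announced it."""

    def __init__(self):
        # title -> set of (peer_name, ip_address); names and layout are assumptions
        self.index = defaultdict(set)

    def register(self, peer, ip, titles):
        # Step 1: a connecting peer reports its IP address and its content.
        for title in titles:
            self.index[title].add((peer, ip))

    def query(self, title):
        # Step 2: the server looks through its index and replies with the holders.
        return sorted(self.index.get(title, ()))


server = DirectoryServer()
server.register("Daichi", "10.0.0.2", ["I Like It"])
server.register("Marcia", "10.0.0.3", [])

# Step 3 (not modeled): Marcia would now fetch the file directly from a holder.
holders = server.query("I Like It")
missing = server.query("Unknown Song")
```

Only the lookup goes through the server; the transfer itself would happen peer to peer, which is why every query depends on this one index being reachable.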

Problems?
- File transfer is decentralized, but locating content is highly centralized
  - Single point of failure
  - Performance bottleneck
  - Single point of lawsuit
- Result: Napster was owned by Best Buy
  - Now it's a rebranded Rhapsody music streaming service


Gnutella
- Created in response to Napster problems
- Fully decentralized
  - Does not depend on a central directory
  - Participants arrange themselves in an overlay
  - Queries flood the network to find a file
- Fully anonymous
- Public domain protocol
  - Various Gnutella clients

Bootstrapping
1. New peer X must find some member of the Gnutella network
   - Use a list of candidate peers
2. X sequentially attempts to make a TCP connection with peers on the list until successful with peer Y
3. X sends a ping message to Y; Y forwards the ping message
4. All peers receiving a ping message respond to X with a pong message
5. X receives many pong messages and can set up additional TCP connections
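The five bootstrapping steps can be sketched as a small in-memory simulation. The function name `bootstrap`, its parameters, and the `max_links` limit are assumptions made for illustration; no sockets are opened, and the ping is forwarded without any hop limit, which is a simplification.

```python
def bootstrap(candidates, online, neighbors, max_links=3):
    """Return the set of peers that new peer X ends up connected to.

    candidates: ordered list of peers X knows about (step 1)
    online:     set of peers currently reachable
    neighbors:  dict mapping each peer to its overlay neighbors
    max_links:  hypothetical cap on X's connections
    """
    # Steps 1-2: try candidates in order until one (peer Y) accepts.
    first = next((p for p in candidates if p in online), None)
    if first is None:
        return set()

    # Steps 3-4: Y forwards the ping; every peer it reaches answers with a pong.
    ponged, frontier = {first}, [first]
    while frontier:
        peer = frontier.pop()
        for n in neighbors.get(peer, ()):
            if n in online and n not in ponged:
                ponged.add(n)
                frontier.append(n)

    # Step 5: X opens extra connections to some of the peers that ponged.
    extra = sorted(ponged - {first})[: max_links - 1]
    return {first, *extra}


links = bootstrap(
    candidates=["A", "B"],
    online={"B", "C", "D"},
    neighbors={"B": ["C"], "C": ["D"]},
)
```

Here candidate A is offline, so X connects to B first and then learns about C and D through the pongs.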

Query Flooding
- Query messages sent over existing TCP connections
- Peers forward the query message
- QueryHit messages sent over the reverse path
- File transfer arranged over HTTP
[Figure: overlay of peers; Query messages fan out along overlay links, QueryHit messages return along the reverse path, and the file transfer runs over HTTP]

Limited Scope Query Flooding
- Original design not scalable
  - Exponential increase in signaling traffic
- Solution is to limit the scope of the query
  - Include a peer-count field in the query message, e.g. peer-count = 4
  - This field gets decremented by 1 at each hop
  - Message stops propagating when peer-count hits zero
[Figure: Query (peer-count = 3) forwarded as Query (peer-count = 2)]
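A limited-scope flood can be sketched as a breadth-first walk over the overlay that carries the peer-count along. The function name `flood_query` and the adjacency-dict representation are assumptions; the sketch also assumes that a peer drops duplicate copies of a query and never sends the query back to the peer it came from.

```python
from collections import deque


def flood_query(overlay, source, holders, peer_count=4):
    """Flood a query from `source`; return (peers_reached, messages_sent, hits).

    overlay: dict mapping each peer to a list of its overlay neighbors
    holders: set of peers that have the requested file
    """
    seen = {source}
    queue = deque([(source, peer_count, None)])  # (peer, remaining count, sender)
    messages = 0
    hits = set()
    while queue:
        peer, count, sender = queue.popleft()
        if count == 0:
            continue  # peer-count hit zero: the message stops propagating
        for neighbor in overlay[peer]:
            if neighbor == sender:
                continue  # do not echo the query back
            messages += 1  # every forwarded copy costs one message
            if neighbor in seen:
                continue  # duplicate copy, dropped by the receiver
            seen.add(neighbor)
            if neighbor in holders:
                hits.add(neighbor)
            queue.append((neighbor, count - 1, peer))  # decrement at each hop
    return seen - {source}, messages, hits


# A chain of six peers: A - B - C - D - E - F
chain = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["D", "F"], "F": ["E"],
}
reached, sent, found = flood_query(chain, "A", holders={"F"}, peer_count=4)
```

In this chain the query stops at E, one hop short of F, so the file is not found even though it exists in the overlay.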

Question
If peer-count = 4 at the start, how many peers would the query message eventually reach?
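The answer depends on the overlay topology, which the slide does not specify. One way to get an upper bound is to assume, hypothetically, that every peer has the same number of neighbors and that no two paths lead to the same peer (a tree). Under that assumption, hop k adds degree × (degree − 1)^(k − 1) new peers. The function name `max_peers_reached` is illustrative.

```python
def max_peers_reached(degree, peer_count):
    """Upper bound on peers reached, assuming a tree-shaped overlay.

    Each peer has `degree` neighbors; a forwarding peer passes the query to
    all neighbors except the one it came from, and no peer is reached twice.
    """
    return sum(
        degree * (degree - 1) ** (hop - 1)
        for hop in range(1, peer_count + 1)
    )


bound_d3 = max_peers_reached(degree=3, peer_count=4)  # 3 + 6 + 12 + 24
bound_d4 = max_peers_reached(degree=4, peer_count=4)  # 4 + 12 + 36 + 108
```

Real overlays contain cycles, so the actual number reached is at most this bound and usually smaller.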

More Questions
Is limited scope query flooding scalable? (That is, how does the number of nodes affect message counts?)

Even more questions
Are we guaranteed to find an object? (Assume the object exists somewhere in the overlay network.)


KaZaA: Exploiting Heterogeneity
- Each peer is either a Super Node (SN) or an Ordinary Node (ON) assigned to an SN
  - TCP connection between an ON and its SN
  - TCP connections between some pairs of SNs
- An SN tracks the content in all its children

KaZaA Queries
- Each file has a hash and a descriptor
- Client sends a keyword query to its SN
- SN responds with matches
  - For each match: metadata, hash, IP address
- If the SN forwards the query to other SNs, they respond with matches
- Client then selects files for downloading
  - HTTP requests using the hash as identifier are sent to peers holding the desired file
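The query path above can be sketched with a toy super node that indexes what its children report. KaZaA's real protocol is proprietary, so everything here is an assumption made for illustration: the class name `SuperNode`, its methods, the substring keyword match, and the sample hash and addresses.

```python
class SuperNode:
    """Toy super node: indexes its children's files and answers keyword queries."""

    def __init__(self, name):
        self.name = name
        self.files = {}      # content hash -> (metadata, set of holder IPs)
        self.neighbors = []  # other SuperNode objects this SN is connected to

    def report(self, ip, content_hash, metadata):
        # A child ON tells its SN which files it holds.
        entry = self.files.setdefault(content_hash, (metadata, set()))
        entry[1].add(ip)

    def local_matches(self, keyword):
        # One match per (file, holder): metadata, hash, IP address.
        kw = keyword.lower()
        return [
            (meta, h, ip)
            for h, (meta, ips) in self.files.items()
            if kw in meta.lower()
            for ip in sorted(ips)
        ]

    def query(self, keyword, forward=True):
        matches = self.local_matches(keyword)
        if forward:
            # Optionally forward to neighboring SNs and merge their matches.
            for sn in self.neighbors:
                matches += sn.local_matches(keyword)
        return matches


sn1, sn2 = SuperNode("SN1"), SuperNode("SN2")
sn1.neighbors.append(sn2)
sn1.report("10.0.0.5", "abc123", "I Like It (mp3)")
sn2.report("10.0.0.7", "abc123", "I Like It (mp3)")

all_matches = sn1.query("like")
local_only = sn1.query("like", forward=False)
```

The client would then pick a match and send an HTTP request, identified by the hash, directly to the listed IP address; that transfer is not modeled here.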

Measurement Study
- Developed tools to reverse engineer KaZaA
- Attempt to answer the following questions:
  - What is the ratio of SNs to ONs?
  - What is the fraction of SNs overall?
  - How are SNs connected, sparsely or densely?
  - How does an ON pick the best SN?
  - Random port numbers and NATs?

Structural Properties
- Deployed apparatus on the Polytechnic campus and in a broadband residential network
- An SN connects to 40-50 other SNs (dynamic)
- An SN has 100-160 ONs at Polytechnic, 55-70 at the access network
- Given 3 million peers, 25,000-40,000 SNs
- An SN is connected to ~0.1% of other SNs
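The figures on this slide can be cross-checked with a few lines of arithmetic. The variable names are illustrative; the inputs are the numbers quoted above, and the calculation only tests whether they are mutually consistent.

```python
peers = 3_000_000                   # total peers quoted on the slide
sn_low, sn_high = 25_000, 40_000    # estimated range for the number of SNs
links_low, links_high = 40, 50      # SN-SN connections per SN

# Fraction of all SNs that a single SN is connected to.
frac_low = links_low / sn_high      # fewest links, most SNs
frac_high = links_high / sn_low     # most links, fewest SNs

# Average ONs per SN implied by the peer and SN counts.
ons_per_sn_low = peers / sn_high
ons_per_sn_high = peers / sn_low

print(f"SN connectivity: {frac_low:.2%} to {frac_high:.2%}")
print(f"Implied ONs per SN: {ons_per_sn_low:.0f} to {ons_per_sn_high:.0f}")
```

The connectivity works out to roughly 0.1% to 0.2%, in line with the "~0.1%" figure, and the implied 75 to 120 ONs per SN falls between the residential and campus measurements.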

Unanswered Questions...
- Details about the residential access network?
  - Where is it? What is it?
  - What is the uplink/downlink bandwidth?
- How long was the measurement study?
  - 6 hours on 2 days? Aug 22 '03, Oct 24 '03
  - How are these time periods representative samples?
- Where did the 3 million peers number come from?
  - From KaZaA?

Overlay Dynamics
- Connection lifetimes are short
  - Average for ON-SN is 34 mins, SN-SN is 11 mins
  - 38% of ON-SN and 32% of SN-SN connections lasted < 30 secs
- Why so short?
  - SN searching for other SNs with small workload
  - Long-term connection shuffling, so a larger set of SNs can be explored
  - Exchange of SN lists

Unanswered Questions...
- Big jump from overlay dynamics numbers to conjectures of what SNs are doing
- How can we interpret these numbers better?
  - Staircases in the cumulative distribution?
  - Different distinct groups of connection times
  - Compare these times to conjectures

Parent Selection
- Workload
  - Exact algorithm to calculate workload is unknown
  - Tied to the number of connections an SN is currently supporting
- Locality
  - RTT measurements
    - 60% of SN-SN connections < 50 msec
    - 40% of ON-SN connections < 5 msec
    - Transatlantic traffic ~ 100 msec
    - Transpacific traffic ~ 180 msec
  - Topological closeness (prefix matching)
    - SNs in the SN list are close to the ON
- Issues with this methodology?

Skype
- P2P Voice-over-IP (VoIP)
  - pc-to-pc, pc-to-phone, phone-to-pc
  - also IM, video
- Proprietary application-layer protocol (inferred via reverse engineering)
- Skype login server
- Hierarchical overlay

Making a Call
- User starts Skype
- Client registers with an SN
  - Uses a list of bootstrap SNs
- Client logs in (authenticates) with the Skype login server
- Call: client queries its SN with the callee ID
  - SN contacts other SNs (how? unknown) to find the address of the callee
  - SN returns the address to the client
- Client directly contacts the callee (TCP)

Lesson Objectives
Now, you should be able to:
- list reasons that led to the creation of P2P networks
- describe what an overlay network is and how it is different from the internet
- use historical P2P networks to describe centralized P2P networks, fully distributed P2P networks, and hierarchical P2P networks
- describe search techniques in the various P2P forms, and analyze search efficiencies