Peer-to-Peer Systems


Peer-to-Peer Systems. Brian Nielsen, bnielsen@cs.aau.dk

Client-Server
- Centralized bottleneck
- Functional specialization
- Single point of failure
- Central administration
(Diagram: four NFS clients connected to a single NFS server.)

Peer-to-Peer
- Peer = equal partner
- Peers are equal computers located at the border of the network
- Logical overlay network on top of the IP network
(Diagram: five NFS hosts connected through an IP network.)

Aim of peer-to-peer systems
- Sharing of data and resources at very large scale
- No centralized, separately managed servers or infrastructure
- Share load by using computer resources (memory and CPU) contributed by end hosts located at the edges of the Internet
- Security
- Anonymity

Background
- Pioneers: Napster, Gnutella, Freenet
- Hot/new research topic:
  - Infrastructure: Pastry, Tapestry, Chord, Kademlia, ...
  - Applications:
    - File sharing: CFS, PAST [SOSP 01]
    - Network storage: FarSite [Sigmetrics 00], OceanStore [ASPLOS 00], PAST [SOSP 01]
    - Multicast: Herald [HotOS 01], Bayeux [NOSSDAV 01], CAN-multicast [NGC 01], SCRIBE [NGC 01]

Napster: centralized index, replicated across Napster servers; peers hold the files
1. File location request
2. List of peers offering the file
3. File request
4. File delivered
5. Index update

Gnutella: flooding
- Unstructured overlay network: each node knows a set of other nodes
- Flood the overlay network with the query: Who has file X?

Characteristics
- Distributed: participants spread across the Internet; all contribute resources
- Decentralized control: no central decision point, no single point of failure; dynamic, unpredictable set of participants
- Self-organizing: no permanent infrastructure, no centralized administration
- Symmetric communication/roles: all peers have the same functional capabilities

Common issues
- Organizing and maintaining the overlay network (node arrivals, node failures)
- Resource allocation / load balancing
- Efficient resource localization
- Locality (network proximity)
Idea: generic P2P middleware (aka a "substrate")

Architecture (layers, top to bottom):
- P2P application layer (e.g. OceanStore, IVY)
- DHT API
- P2P substrate (self-organizing overlay network)
- TCP/IP Internet

Basic interface for a distributed hash table (DHT), a peer-to-peer object location and routing substrate. A DHT maps an object key to a live node.
- put(GUID, data): the data is stored in replicas at all nodes responsible for the object identified by GUID.
- remove(GUID): deletes all references to GUID and the associated data.
- value = get(GUID): the data associated with GUID is retrieved from one of the nodes responsible for it.
Pastry (developed at Microsoft Research Cambridge/Rice) is an example of such an infrastructure.
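
A minimal sketch of this interface, assuming a toy in-memory node set (ToyDHT and _responsible are hypothetical names, not Pastry's actual API):

    class ToyDHT:
        def __init__(self, node_ids, replicas=3):
            self.stores = {nid: {} for nid in node_ids}  # per-node local store
            self.replicas = replicas

        def _responsible(self, guid):
            # the nodes whose ids are numerically closest to the GUID
            return sorted(self.stores, key=lambda nid: abs(nid - guid))[:self.replicas]

        def put(self, guid, data):
            # store replicas at all nodes responsible for the object
            for nid in self._responsible(guid):
                self.stores[nid][guid] = data

        def get(self, guid):
            # retrieve the data from one of the responsible nodes
            for nid in self._responsible(guid):
                if guid in self.stores[nid]:
                    return self.stores[nid][guid]
            return None

        def remove(self, guid):
            # delete all references to the GUID and the associated data
            for nid in self._responsible(guid):
                self.stores[nid].pop(guid, None)

    dht = ToyDHT(node_ids=[5, 28, 60, 91])
    dht.put(27, "Die Hard.mpg")
    assert dht.get(27) == "Die Hard.mpg"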

Example DHT
- Nodes are given a GUID (Globally Unique ID)
- Data values are identified by a key GUID; the DHT stores (key, value) pairs
- The key is computed as a hash of the value, e.g. Hash-key("Die Hard.mpg") = 28
- Data is stored at the node whose id is numerically closest to the key
- Each node receives at most ~K/N keys (K keys over N nodes)
- Keys are >= 128 bits, e.g. Hash-key("http://www.research.microsoft.com/~antr") = 4ff367a14b374e3dd99f (hex)
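
A sketch of key computation and placement, assuming SHA-1 as the hash (the 128-bit keys are illustrated by truncating the 160-bit digest):

    import hashlib

    def hash_key(value: bytes, bits: int = 128) -> int:
        # hash the value into the id space
        return int.from_bytes(hashlib.sha1(value).digest(), "big") >> (160 - bits)

    def closest_node(node_ids, key):
        # store at the node whose id is numerically closest to the key
        return min(node_ids, key=lambda nid: abs(nid - key))

    print(hex(hash_key(b"Die Hard.mpg")))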

Secure hash function (aka secure digest): given data item M, h = H(M).
Properties:
1. Given M, h is easy to compute.
2. Given h, M is hard to compute.
3. Given M, it is hard to find M' != M such that H(M') = H(M).
- Uniqueness: for two items M and M', it is unlikely that H(M) = H(M').
- Tamper-proof: the contents of M cannot be modified while producing the same hash key.
Examples: MD5, SHA-1.
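
A quick illustration with Python's hashlib (note that MD5 and SHA-1 are no longer considered collision-resistant; SHA-1 appears here only to mirror the slide):

    import hashlib

    m = b"Die Hard.mpg"
    m_tampered = b"Die Hard!mpg"

    h = hashlib.sha1(m).hexdigest()            # property 1: easy to compute from M
    h_tampered = hashlib.sha1(m_tampered).hexdigest()

    # tamper-evidence: modifying M yields a different hash key
    # (with overwhelming probability)
    assert h != h_tampered
    print(h)   # 160 bits rendered as 40 hex digits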

Exhaustive routing: a routing table at node 65a1fc with one entry per node:
Node ID    IP
65a1fc     self (127.0.0.1)
65a1ff     123.4.4.9
65a20f     47.122.99.7
d13da3     123.4.4.8
d4213f     10.10.34.56
...
2 million nodes = 2 million entries = 2M * (128 + 32) / 8 bytes = 40 MB + data! Impossible to collect and maintain.
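
A back-of-envelope check of that figure (one entry per node, each holding a 128-bit node id plus a 32-bit IPv4 address):

    entries = 2_000_000
    bytes_per_entry = (128 + 32) // 8             # 20 bytes
    print(entries * bytes_per_entry / 1e6, "MB")  # -> 40.0 MB, plus the data itself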

Circular routing: each node knows its l left and l right neighbours on the ring (a leaf set of size 2l) and routes to the leaf-set member closest to the target.
Circular routing table for node 65A1FC (id space 0 .. 2^128 - 1):
Position   IP
Left l     47.122.99.7
Left l-1   123.4.4.9
...
Left 1     132.32.32.40
self       127.0.0.1
Right 1    123.4.4.8
...
Right l    10.10.34.56
2 million nodes, l = 4: each hop advances at most l positions, and the shorter way round the ring to a random target averages N/4 positions, so about N/(4l) = 125,000 hops on average!
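
The same arithmetic as a one-liner (assumptions as stated above):

    N, l = 2_000_000, 4
    print((N / 4) / l, "hops on average")   # -> 125000.0: correct but hopelessly slow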

Pastry
- Pastry (developed at Microsoft Research Cambridge/Rice) is an example of a DHT
- Generic p2p location and routing substrate (DHT)
- Self-organizing overlay network (joins, departures, locality, repair)
- Consistent hashing
- Lookup/insert of an object in < log_{2^b} N routing steps (expected)
- O(log N) per-node state
- Network locality heuristics
- Scalable, fault resilient, self-organizing, locality aware, secure (according to the authors)

Pastry: object distribution. Consistent hashing over a 128-bit circular id space (0 .. 2^128 - 1): node ids are uniform random, and object ids/keys are uniform random. Invariant: the node with the numerically closest node id maintains the object.

Pastry: object insertion/lookup. A message with key X is routed, via Route(X), to the live node whose node id is closest to X. Problem: a complete routing table is not feasible.

Longest common prefix
Node ids that share a long prefix are numerically close, and hence close on the logical ring: compare D471F1 with D471F3 (five shared digits) versus D47889, D99888, and 999999. The longer the common prefix, the closer together on the logical ring. View the address as a hierarchy: cluster nodes with numerically close ids; the closer a cluster of ids, the more routing info is kept about it, giving a denser routing table.
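
A small helper for this prefix metric (shared_prefix_len is a hypothetical name; ids are digit strings of equal length):

    def shared_prefix_len(a: str, b: str) -> int:
        # number of leading digits the two ids have in common
        n = 0
        while n < len(a) and n < len(b) and a[n] == b[n]:
            n += 1
        return n

    assert shared_prefix_len("D471F1", "D471F3") == 5
    assert shared_prefix_len("D471F1", "D99888") == 1
    assert shared_prefix_len("D471F1", "999999") == 0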

Prefix routing. Example of a simple scheme: ID = a 4-digit string over the digit range 0-3. Routing table for node A (NB: associated IPs not shown). '*' = don't care: select any (preferably nearby) node with a matching prefix.
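
A sketch of how such a table could be filled in, reusing shared_prefix_len from the previous sketch (build_table is a hypothetical name; row p, column i holds some node matching "first p digits of self, then digit i, then anything"):

    def build_table(self_id: str, all_ids, base: int = 4):
        table = [[None] * base for _ in range(len(self_id))]
        for nid in all_ids:
            if nid == self_id:
                continue
            p = shared_prefix_len(self_id, nid)   # row: shared prefix length
            i = int(nid[p], base)                 # column: first differing digit
            if table[p][i] is None:               # any matching node will do;
                table[p][i] = nid                 # prefer a nearby one in practice
        return table

    table = build_table("1230", ["1230", "3012", "1302", "1233", "2001"])
    assert table[0][3] == "3012" and table[1][3] == "1302" and table[3][3] == "1233"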

Pastry: routing. (Diagram: Route(d46a1c) from node 65a1fc; the message hops through nodes whose ids share progressively longer prefixes with the key, e.g. d13da3, d4213f, d462ba, d467c4, until it reaches d46a1c; d471f1 also appears on the ring.) Properties: log_{2^b} N steps, O(log N) state.

Pastry routing table for node 65A1 (NB: n = the associated IP).
Example: route from 65A1 to 6544. Common prefix length p = 2; distinguishing digit i = 4 (digit indexing starts at 0). Next hop = R[p, i] = the IP of some 654* node.

Pastry: leaf sets. Each node maintains the IP addresses of the nodes with the L numerically closest larger and smaller node ids, respectively. Uses:
- routing efficiency/robustness
- fault detection (keep-alive)
- application-specific local coordination

Pastry: routing procedure (for destination D)
if D is within range of our leaf set:
    forward to the numerically closest leaf-set member
else:
    let p = length of the prefix shared between D and this node's id
    let i = value of the p-th digit of D's address (indexing from 0)
    if R[p, i] exists:
        forward to R[p, i]
    else:
        forward to a known node that
        (a) shares at least as long a prefix with D, and
        (b) is numerically closer to D than this node
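
A sketch of this decision in code, assuming hex-string ids (b = 4, one hex digit per column), shared_prefix_len from the earlier sketch, and a table laid out as above; wrap-around at 2^128 is ignored for brevity:

    def num(nid: str) -> int:
        return int(nid, 16)

    def next_hop(self_id, dest, leaf_set, table):
        nodes = set(leaf_set) | {self_id}
        # case 1: destination within the leaf set's range ->
        # forward to the numerically closest member
        if min(map(num, nodes)) <= num(dest) <= max(map(num, nodes)):
            return min(nodes, key=lambda n: abs(num(n) - num(dest)))
        # case 2: use routing table entry R[p, i]
        p = shared_prefix_len(self_id, dest)
        i = int(dest[p], 16)
        if table[p][i] is not None:
            return table[p][i]
        # case 3 (rare): any known node sharing at least as long a prefix
        # with dest that is numerically closer to it than this node
        for n in nodes:
            if (shared_prefix_len(n, dest) >= p and
                    abs(num(n) - num(dest)) < abs(num(self_id) - num(dest))):
                return n
        return self_id   # this node is numerically closest: deliver here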

Pastry: routing tradeoff
- O(log N) routing table size: about 2^b * log_{2^b} N + 2l entries
- O(log N) message forwarding steps: about log_{2^b} N
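
Plugging in example parameters (b = 4 and l = 8 are assumptions, not from the slide):

    import math

    def pastry_costs(N, b, l):
        rows = math.ceil(math.log(N, 2 ** b))   # ~log_{2^b} N populated rows
        return rows, (2 ** b) * rows + 2 * l    # hops, table + leaf-set entries

    hops, state = pastry_costs(2_000_000, b=4, l=8)
    print(hops, state)   # -> about 6 hops and ~112 entries of per-node state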

Pastry: locality properties
The overlay network is not otherwise related to geography or network distance (IP hops, RTT, ...), so there is a risk of very long/slow transmissions within the overlay. Prefer to route via nodes that are nearby in network distance.
Proximity invariant: each routing table entry refers to a node that is nearby to the local node, among all nodes with the appropriate node id prefix.
Assumptions: a scalar proximity metric, e.g. ping/RTT delay, # IP hops (traceroute), subnet masks, or an (incomplete) database of the registration country of IP addresses; a node can probe its distance to any other node.
Each node also maintains a neighbour set of nodes nearby in network distance.

Pastry: node addition. (Diagram: new node X = d46a1c joins via nearby node A = 65a1fc; Route(d46a1c) passes d13da3, d4213f, d462ba and reaches Z = d467c4; d471f1 also appears on the ring.)
Important observation: the common prefix of X and the i-th intermediate node increases by one at each hop, so X can use node i's routing table row i as the initial choice for its own row i.

Pastry: node addition
- New node X contacts a nearby node A
- A routes a join message keyed by X's id; it arrives at Z, the node numerically closest to X
- X obtains its leaf set from Z, and the i-th row of its routing table from the i-th node on the path from A to Z
- X informs any nodes that need to be aware of its arrival
- X also improves its table's locality by requesting neighbourhood sets from all nodes it knows
- In practice: an optimistic approach

Node departure (failure)
- Leaf set repair (eager, runs all the time): send heart-beat messages to leaf-set members; on a failure, request the leaf set from the furthest live node on the failed side
- Routing table repair (lazy, upon failure): get a replacement entry from peers in the same row; if none is found, from higher rows
- Neighbourhood set repair (eager)

Pastry: average number of hops. (Plot: average hop count, 0 to 4.5, versus number of nodes, 1,000 to 100,000, with L = 16 and 100k random queries; the Pastry curve closely tracks the log(N) curve.)

END