Unit 8 Peer-to-Peer Networking

P2P systems use the vast resources of machines at the edge of the Internet to build a network that allows resource sharing without any central authority.

Client/Server System
- Servers: Web server, FTP server, media server, database server, application server
- Every entity has its own dedicated role (client or server)

Pure P2P Architecture
- No always-on server
- End systems communicate directly, peer to peer

P2P Applications

- P2P search, file sharing, and content spreading: Napster, Gnutella, Kazaa, eDonkey, BitTorrent, Chord, CAN, Pastry/Tapestry, Kademlia, Bullet, SplitStream, CREW, FareCAST
- P2P communications: MSN, Skype, social networking apps
- P2P storage: OceanStore/POND, CFS (Cooperative File System), TotalRecall, Freenet, Wuala

Peer-to-Peer File Sharing

P2P Communication: Instant Messaging
- Skype is a VoIP P2P system
- Alice runs an IM client application on her notebook computer
- She connects to the Internet intermittently and gets a new IP address for each connection
- She registers herself with the system
- She learns from the system that Bob, who is in her buddy list, is active
- Alice initiates a direct TCP connection with Bob, then chats
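The register-then-connect pattern in this slide can be sketched in a few lines. This is a minimal illustration, not the protocol of any real IM system: the `PresenceDirectory` class, its method names, and the IP addresses (taken from documentation ranges) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PresenceDirectory:
    """Toy stand-in for the IM system's registration service (hypothetical)."""
    online: dict = field(default_factory=dict)  # username -> current IP address

    def register(self, user: str, ip: str) -> None:
        # A peer registers again on every reconnect, since its IP may have changed.
        self.online[user] = ip

    def unregister(self, user: str) -> None:
        self.online.pop(user, None)

    def active_buddies(self, buddy_list: list) -> dict:
        # Report which buddies are currently online and where to reach them.
        return {b: self.online[b] for b in buddy_list if b in self.online}


directory = PresenceDirectory()
directory.register("bob", "203.0.113.7")
directory.register("alice", "198.51.100.4")

# Alice learns that Bob is active. The chat itself would then run over a
# direct TCP connection to Bob's address, without involving the directory.
alice_view = directory.active_buddies(["bob", "carol"])
print(alice_view)  # {'bob': '203.0.113.7'}

# Alice reconnects later with a new address and registers again.
directory.register("alice", "198.51.100.99")
```

Only the lookup goes through the system; the directory never carries the conversation, which is what makes the communication itself peer-to-peer.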

Characteristics of P2P Systems
- Exploit edge resources: storage, content, CPU, human presence
- Significant autonomy from any centralized authority
- Each node can act as a client as well as a server
- Resources at the edge have intermittent connectivity and are constantly being added and removed

Promising Properties of P2P
- Self-organizing
- Massive scalability
- Autonomy: no single point of failure
- Resilience to denial of service
- Load distribution

Overlay Network
- A P2P network is an overlay network
- Each link between peers consists of one or more IP links

Overlays: All in the Application Layer
- Tremendous design flexibility: topology, maintenance, message types, protocol, messaging over TCP or UDP
- The underlying physical network is transparent to the developer
- But some overlays exploit proximity

Overlay Graph
- Virtual edge: a TCP connection, or simply a pointer to an IP address
- Overlay maintenance:
  - Periodically ping to make sure a neighbor is still alive
  - Or verify liveness while messaging
  - If a neighbor goes down, may want to establish a new edge
  - A new incoming node needs to bootstrap
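The maintenance steps above can be sketched as a small simulation. This is an illustrative sketch only: the `OverlayNode` class, the 30-second timeout, the minimum degree of 2, and the addresses are assumptions, and time is passed in explicitly instead of sending real pings over a socket.

```python
class OverlayNode:
    """One peer's view of its overlay neighbors (illustrative only)."""

    def __init__(self, timeout: float = 30.0):
        self.timeout = timeout   # seconds of silence before a neighbor is presumed dead
        self.last_seen = {}      # neighbor address -> time of last message or ping reply

    def heard_from(self, neighbor: str, now: float) -> None:
        # Any received message counts as proof of liveness, not only ping replies.
        self.last_seen[neighbor] = now

    def prune_dead(self, now: float) -> list:
        # Drop neighbors that have been silent for longer than the timeout.
        dead = [n for n, t in self.last_seen.items() if now - t > self.timeout]
        for n in dead:
            del self.last_seen[n]
        return dead

    def repair(self, candidates: list, now: float, min_degree: int = 2) -> None:
        # Establish new virtual edges until the node has enough neighbors again.
        for c in candidates:
            if len(self.last_seen) >= min_degree:
                break
            if c not in self.last_seen:
                self.last_seen[c] = now


node = OverlayNode(timeout=30.0)
node.heard_from("10.0.0.1:6346", now=0.0)
node.heard_from("10.0.0.2:6346", now=0.0)
node.heard_from("10.0.0.1:6346", now=25.0)  # 10.0.0.1 answers a ping; 10.0.0.2 stays silent

dead = node.prune_dead(now=40.0)
node.repair(["10.0.0.3:6346", "10.0.0.4:6346"], now=40.0)
neighbors = sorted(node.last_seen)
print(dead, neighbors)
```

The candidate list passed to `repair` stands in for whatever bootstrap mechanism the overlay uses to learn about other peers; the slide does not specify one.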

Distributed Hash Table (DHT)

A distributed hash table (DHT) is a class of decentralized distributed systems that provides a lookup service similar to a hash table: (key, value) pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. Responsibility for maintaining the mapping from keys to values is distributed among the nodes in such a way that a change in the set of participants causes a minimal amount of disruption. This allows a DHT to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.

DHTs form an infrastructure that can be used to build more complex services, such as anycast, cooperative Web caching, distributed file systems, domain name services, instant messaging, multicast, and peer-to-peer file sharing and content distribution systems.
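One common way to get the "minimal disruption" property is consistent hashing, where nodes and keys share one identifier ring and each key belongs to the first node clockwise from it. The sketch below illustrates that idea under simplifying assumptions: `ToyDHT`, `ring_position`, and the node and key names are hypothetical, and the lookup uses a global view of the ring, whereas a real DHT routes through partial routing tables.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    # Map node names and keys into the same 160-bit identifier space.
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class ToyDHT:
    def __init__(self, nodes):
        self.ring = sorted((ring_position(n), n) for n in nodes)

    def owner(self, key: str) -> str:
        # The first node clockwise from the key's position is responsible for it;
        # the modulo wraps around to the start of the ring.
        positions = [p for p, _ in self.ring]
        i = bisect.bisect_left(positions, ring_position(key)) % len(self.ring)
        return self.ring[i][1]


keys = [f"file-{i}" for i in range(1000)]
before = ToyDHT(["node-a", "node-b", "node-c", "node-d"])
after = ToyDHT(["node-a", "node-b", "node-c", "node-d", "node-e"])

moved = [k for k in keys if before.owner(k) != after.owner(k)]
# Every key that moved now belongs to the newcomer; no other keys were reshuffled.
assert all(after.owner(k) == "node-e" for k in moved)
print(f"{len(moved)} of {len(keys)} keys moved")
```

By contrast, assigning keys with `hash(key) % number_of_nodes` would remap most keys whenever a node joins or leaves.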

DHTs characteristically emphasize the following properties:
- Autonomy and decentralization: the nodes collectively form the system without any central coordination.
- Fault tolerance: the system should be reliable (in some sense) even with nodes continuously joining, leaving, and failing.
- Scalability: the system should function efficiently even with thousands or millions of nodes.

P2P Case Study: Skype
- Inherently P2P: pairs of users communicate
- Proprietary application-layer protocol (inferred via reverse engineering)
- Hierarchical overlay with supernodes (SNs)
- Index maps usernames to IP addresses; distributed over SNs
- Components shown in the figure: Skype clients (SC), supernodes (SN), Skype login server
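As a rough picture of what "an index distributed over supernodes" could look like, the toy below splits a username-to-IP index across several supernodes. It is not Skype's actual protocol, which is proprietary: the class names, the hash-based partitioning rule, and the addresses are all assumptions made for illustration.

```python
import hashlib


class Supernode:
    def __init__(self, name: str):
        self.name = name
        self.index = {}  # this supernode's slice of the index: username -> IP address


class ToySupernodeOverlay:
    def __init__(self, supernodes):
        self.supernodes = supernodes

    def _responsible(self, username: str) -> Supernode:
        # Assumed partitioning rule: hash the username onto one supernode.
        digest = hashlib.sha256(username.encode()).digest()
        return self.supernodes[int.from_bytes(digest, "big") % len(self.supernodes)]

    def publish(self, username: str, ip: str) -> None:
        self._responsible(username).index[username] = ip

    def lookup(self, username: str):
        # Returns None when the user is not in the index.
        return self._responsible(username).index.get(username)


overlay = ToySupernodeOverlay([Supernode("sn-1"), Supernode("sn-2"), Supernode("sn-3")])
overlay.publish("alice", "198.51.100.4")
overlay.publish("bob", "203.0.113.7")

bob_ip = overlay.lookup("bob")
total_entries = sum(len(sn.index) for sn in overlay.supernodes)
print(bob_ip, total_entries)
```

An ordinary client would send `publish` and `lookup` requests to the supernode it is attached to, which is what makes the overlay hierarchical.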