Department of Computer Science, Institute for System Architecture, Chair for Computer Networks
File Sharing

What is file sharing?
- File sharing is the practice of making files available for other users to download over the Internet and smaller networks
- Content: typically music, videos or software
- File-sharing utilities:
  - Conventional networks: FTP, HTTP
  - P2P networks: Napster, Gnutella, eMule/Kademlia, BitTorrent, MUTE, Freenet, GNUnet
- File sharing has made P2P popular
- Due to their scalability requirements, file sharing systems are highly interesting for educational purposes

Why does file sharing make P2P networking so popular?
- Conventional networks offer rather cost-intensive central mass storage solutions
- The P2P model enables decentralized data management: peers can act as clients and as servers
- Users can share their files with hundreds of thousands of users and access hundreds of millions of files
- Files can be accessed directly from the local hard disk of other peers
[Figure: overlay structure - end hosts/peers connected by overlay links running on top of IP routers and IP links]

Outline: Generations of P2P file sharing networks
1. First generation: Centralized P2P, Decentralized P2P
2. Second generation: Hybrid P2P
3. Third generation: Distributed Hash Tables based on Kademlia, BitTorrent

First generation - Centralized P2P
- The central directory server is the most important communication entity: it makes lists of files with their associated peers available
- The file transfer itself is decentralized
1. Peers register at the central server and publish their IP address and a list of files to be shared
2. Peers send queries to the central server; the server returns a list of peers offering the requested objects
3. After pinging the selected peer, the file transfer happens directly between the peers
- Example: Napster
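The directory server's role can be pictured as a file-to-peers index. A minimal sketch in Python, assuming an in-memory map and illustrative names (this is not Napster's actual wire protocol):

```python
# Sketch of a Napster-style central index (illustrative names only).
class CentralIndex:
    def __init__(self):
        self.files = {}  # filename -> set of (peer_ip, port)

    def register(self, peer, filenames):
        """Step 1: a peer publishes its address and shared file list."""
        for name in filenames:
            self.files.setdefault(name, set()).add(peer)

    def query(self, filename):
        """Step 2: return the peers offering the requested object."""
        return self.files.get(filename, set())

index = CentralIndex()
index.register(("84.79.87.23", 6699), ["song.mp3"])
print(index.query("song.mp3"))  # step 3: download directly from a listed peer
```

The sketch also shows why the model does not scale: every register and every query passes through this single data structure.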

Strengths and Weaknesses - Centralized P2P
Strengths:
- Consistent view of the network: the central server always knows who is available and who is not
- Fast and efficient searching in small networks: the central server makes all files offered by peers available to other peers, and answers are guaranteed to be correct
Weaknesses:
- Single point of failure: if the central server crashes, the whole P2P network crashes
- Performance bottleneck: in a P2P network with hundreds of thousands of connected users, the central server needs enough computing power to handle thousands of queries per second -> not scalable

First generation - Decentralized P2P (1/2)
- The decentralized network consists only of peers, called servents (server + client)
- Joining the network via bootstrapping: contact an out-of-band channel, e.g. get a member's IP address from a website; all subsequent access operations can use the host cache
- The connection is established via a CONNECT message; if the contacted peer accepts the connection request, it answers with an OK message
- Example: Gnutella 0.4
[Figure: four peers (e.g. 84.79.87.23 and 84.79.82.6) exchanging CONNECT/OK messages]

First generation - Decentralized P2P (2/2)
- Discovering new peers: send a broadcast Ping; connected peers answer with Pong
- Locating specific content:
  - Query messages are sent to all neighbours
  - Query messages contain a TTL (time to live); the TTL counter is decreased by 1 per hop
  - IF a neighbour owns the file THEN it answers with a QueryHit message sent back along the query path to the querying peer, ELSE it forwards the Query message to its neighbours (see the flooding sketch below)
- Download the file directly from the source node, e.g. via HTTP
- Example: Gnutella 0.4
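The flooding rule above can be sketched as a graph traversal with a per-message hop budget. A minimal sketch, assuming an in-memory adjacency list instead of real TCP connections:

```python
# Sketch of Gnutella-style query flooding with a TTL.
def flood_query(graph, start, filename, ttl, shared):
    """Return peers that would answer with a QueryHit."""
    hits, seen = [], {start}        # 'seen' suppresses request loops
    frontier = [(start, ttl)]
    while frontier:
        node, t = frontier.pop()
        if filename in shared.get(node, set()):
            hits.append(node)       # QueryHit travels back along the query path
        if t == 0:
            continue                # TTL expired: stop forwarding here
        for neigh in graph.get(node, []):
            if neigh not in seen:
                seen.add(neigh)
                frontier.append((neigh, t - 1))
    return hits

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(flood_query(graph, "A", "file.mp3", ttl=2, shared={"D": {"file.mp3"}}))
```

With a limited TTL, a file stored further than ttl hops away is simply never found, which is the "no guarantee for a result" weakness discussed later.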

Strengths and Weaknesses - Decentralized P2P
Strengths:
- Fully decentralized network: no single point of failure, no performance bottleneck
- The main part of the communication is anonymous
Weaknesses:
- High network traffic due to Ping and Pong packets
- Not protected against faked queries (an attacker can broadcast artificial queries)
- Request loops arise (overhead caused by message cycles)
- The QueryHit rate decreases in large networks

Second generation - Hybrid P2P (1/2)
- Goal: combine the efficiency of the centralized model with the robustness of the decentralized model
- Servents are divided into SuperNodes and LeafNodes
  - The client software promotes servents to SuperNodes based on specific criteria (e.g. connection speed)
  - SuperNodes act as temporary servers; every SuperNode can manage approx. 20-50 leaves
- Message transfer is reduced by flow control and Pong caching
- Bootstrapping via WebCaches (directory servers providing a list of SuperNodes) and the HostCache
[Figure: WebCaches point new servents to SuperNodes; SuperNodes interconnect and each serves several leaves]

Second generation - Hybrid P2P (2/2)
Locating specific content:
- The Query message is sent to the SuperNode
- The SuperNode decides about forwarding the query to its leaves based on a RouteTable it manages
- A RouteTableUpdate message is sent to the SuperNode by each LeafNode; it contains all keywords describing the content shared by that LeafNode
- Depending on the TTL, the SuperNode can forward the query to other connected SuperNodes
- Eventually the file transfer takes place directly between the LeafNodes, after the QueryHit has been routed back (see the sketch below)
[Figure: a leaf's Query is forwarded between SuperNodes; the QueryHit returns along the query path and the file transfer runs leaf to leaf]
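The RouteTable effectively makes a SuperNode a small, temporary directory server for its leaves. A minimal sketch under that interpretation (names are illustrative, not the actual Gnutella 0.6 message format):

```python
# Sketch of SuperNode keyword routing built from RouteTableUpdate messages.
class SuperNode:
    def __init__(self):
        self.route_table = {}  # keyword -> set of leaf ids

    def route_table_update(self, leaf, keywords):
        """A LeafNode announces the keywords describing its shared content."""
        for kw in keywords:
            self.route_table.setdefault(kw, set()).add(leaf)

    def handle_query(self, keyword):
        """Forward only to leaves that announced the keyword,
        instead of flooding every leaf as in pure decentralized P2P."""
        return self.route_table.get(keyword, set())

sn = SuperNode()
sn.route_table_update("leaf1", ["beethoven", "symphony"])
print(sn.handle_query("symphony"))  # {'leaf1'}
```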

Third generation - Structured P2P
- Lookup problem in unstructured P2P networks: where to place and how to find data items in a distributed system with regard to scalability and efficiency?
- Centralized P2P: fast and efficient searching, but not scalable; retrieving a data item is O(1)
- Decentralized P2P: the broadcast mechanism is not scalable (about 40% of total traffic is caused by file queries); linearly increasing communication overhead, O(n); no guarantee of a result when searching with a limited TTL
- Structured P2P networks: based on a Distributed Hash Table (DHT); guaranteed correct results; quick search in O(log N)

Distributed Hash Table (DHT)
- A DHT is a data structure in which (Key, Value) pairs are distributed over the set of nodes as evenly as possible
  - Key = hashed object identifier (OID)
  - Value = IP address, NodeID, port
- NodeIDs and keys are hashed into a common address space (example: address space = 2^4)
- Every node is responsible for a part of the address space; each node is thus analogous to a bucket of a hash table
- The address space is viewed as a circle (Chord), a binary tree (Kademlia) or a quadratic area (CAN)
- If a node searches for the Value of a Key, it has to locate the NodeID in whose address space the Key lies
- The logical view of the DHT is then mapped onto the real topology
[Figure: logical view of the DHT - a ring of NodeIDs over the 2^4 address space, each key assigned to a node]

Distributed Hash Table - a simple algorithm (1/2)
A naive algorithm for a circular address space:
- Each node n knows its next neighbour: successor(n)
- Each node n manages the address space s = (predecessor(n)+1) to n, consisting of (Key, Value) pairs
- Storing: Key = H("my data"); the pair (Key, (IP, NodeID, Port)) is kept by the node managing that Key, e.g. H("my data") = 4 is managed by Node 4
Locating a Value (the initial node searches for the Value of a Key H("my data")):
(1) The initial node checks itself; IF the Key is not found THEN node n sends a FindValue msg to its next node
(2) IF this next node manages the Key THEN the Value is handed from node to node until the initial node is reached, ELSE the FindValue msg is forwarded to the next node

Distributed Hash Table - a simple algorithm (2/2)
- After a new node has joined the network, the node responsible for its address space has to partition that address space: the new node gets its successor(n), and its predecessor updates its successor(n)
- If a node leaves the network, its next neighbour (successor(n)) takes over its whole address space; the predecessor updates its successor(n)
A minimal sketch of this ring follows below.
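To make the ring concrete, here is a compact sketch of the naive algorithm under the slide's assumptions (2^4 address space, each node manages the keys up to its own ID); the hash function and node set are illustrative:

```python
# Sketch of the naive circular DHT: keys live on the first node clockwise.
import hashlib

SPACE = 16  # 2^4 address space as in the example

def h(data):
    """Hash an object identifier into the common address space."""
    return int(hashlib.sha1(data.encode()).hexdigest(), 16) % SPACE

def responsible(nodes, key):
    """The first NodeID at or after the key manages it (wrap on the circle)."""
    for n in sorted(nodes):
        if key <= n:
            return n
    return min(nodes)

def find_value(nodes, store, key):
    """Walk the ring successor by successor: O(n) hops in the worst case."""
    ring = sorted(nodes)
    node, hops = ring[0], 0
    while node != responsible(nodes, key):
        node = ring[(ring.index(node) + 1) % len(ring)]
        hops += 1
    return store.get((node, key)), hops

nodes = {2, 4, 8, 12}
key = h("my data")
store = {(responsible(nodes, key), key): ("10.0.0.5", 6881)}
print(find_value(nodes, store, key))
```

Joining and leaving only require handing the affected (Key, Value) pairs to the new manager and updating one successor pointer, which is what makes the scheme attractive despite the linear lookup cost.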

Kademlia - a specific DHT approach
- Widely deployed (eMule, BitTorrent, ...)
- Basic idea: NodeIDs or their prefixes are mapped onto the leaves of a binary tree; a prefix represents a single node managing all identifiers in its subtree
- During a search, every hop leads to a smaller subtree, of which any node can be contacted
- The distance between two binary NodeIDs is calculated with the XOR metric
  - Example: Node 2 (0010) XOR Node 4 (0100) = 0110, i.e. distance 6
- With the 8 + 4 = 12 nodes of the example, the max. number of lookup steps between Node 2 and Node 4 is ld(12) = 3.58, i.e. at most 4
- Hint: in the following examples a simplified tree structure is used
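The XOR metric and its link to the routing table can be stated in two lines; this sketch assumes plain integer NodeIDs:

```python
# XOR metric: distance(a, b) = a XOR b; the k-bucket index is the position
# of the highest differing bit, i.e. floor(log2(distance)).
def xor_distance(a, b):
    return a ^ b

def bucket_index(a, b):
    return xor_distance(a, b).bit_length() - 1

print(xor_distance(2, 4))  # 0010 XOR 0100 = 0110 -> 6
print(bucket_index(2, 4))  # distance 6 lies in [2^2, 2^3) -> k-bucket 2
```

Because XOR is symmetric and satisfies the triangle inequality, two nodes always agree on their mutual distance, which keeps the routing tables consistent.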

Kademlia routing information - k-buckets
- The internal routing table is structured as a binary tree
- Every node keeps lists (k-buckets) of (IP address, UDP port, NodeID) triples for nodes at a distance between 2^i and 2^(i+1) from itself, for 0 <= i < 160
- Every node knows at least one node in each of its (non-empty) k-buckets
- Every k-bucket keeps a maximum of k contact triples; default k = 20 (in our example below, kmax = 2 only)
- Contacts are kept sorted by the time last seen: least-recently seen node at the head, most-recently seen at the tail

Update of the k-bucket
Each received message updates the corresponding k-bucket. Case differentiation:
- Sender node is known: move this node to the tail of the k-bucket
- Sender node is unknown and the k-bucket is not full: insert the new node at the tail of the k-bucket
- Sender node is unknown and the k-bucket is full: ping the node at the head of the k-bucket; IF this node does not respond THEN remove it AND insert the new node, ELSE drop the new node
A sketch of this rule follows below.
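The rule maps directly onto a double-ended queue; a minimal sketch, with ping() as an assumed stand-in for the PING RPC:

```python
# Sketch of the k-bucket update rule (least-recently seen at the head).
from collections import deque

def update_bucket(bucket, node, k, ping):
    if node in bucket:            # sender known: move to the tail
        bucket.remove(node)
        bucket.append(node)
    elif len(bucket) < k:         # unknown, bucket not full: append at tail
        bucket.append(node)
    else:                         # unknown, bucket full: probe the head
        head = bucket[0]
        if ping(head):            # head alive: keep it (move to tail), drop newcomer
            bucket.remove(head)
            bucket.append(head)
        else:                     # head dead: evict it, insert the newcomer
            bucket.popleft()
            bucket.append(node)

bucket = deque([1, 5, 9])
update_bucket(bucket, 13, k=3, ping=lambda n: False)
print(bucket)  # deque([5, 9, 13]) - the unresponsive head 1 was evicted
```

Preferring long-lived head contacts over newcomers is deliberate: nodes that have been online for a long time are statistically likely to stay online, which makes the routing table resistant to churn.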

Kademlia k-bucket structure example
View from Node 0: possible routing table of NodeID 0 (contacts per k-bucket)
- A node at distance between 2^3 and 2^4 -> inserted in k-bucket 3
- Node 7: distance 7 -> between 2^2 and 2^3 -> inserted in k-bucket 2
- Node 2: distance 2 -> between 2^1 and 2^2 -> inserted in k-bucket 1
- ...
[Figure: binary tree of NodeIDs with Node 0's contacts grouped into k-buckets 0-3]

Kademlia Operation (1/7) - FIND_NODE
Locating the k closest NodeIDs to a target NodeID:
1. The initiator picks the n closest NodeIDs from the matching k-bucket, THEN sends the FIND_NODE msg to these n NodeIDs in parallel; default: n = 3 (n = 2 in the example)
Max. number of hops: ld(12) = 3.58, i.e. at most 4
[Figure, step 1: the initiator's k-buckets (distances between 2^i and 2^(i+1)) hold the contacts 8 and 15, 4 and 7, and 2 and 3]

Kademlia Operation (2/7) - FIND_NODE
2. Those n NodeIDs answer with their k closest nodes
[Figure, step 2: the two contacted nodes answer with their own contact lists]

Kademlia Operation (3/7) - FIND_NODE
3. In recursive steps, the initiator selects the n closest NodeIDs from the response set and resends the FIND_NODE msg to them; IF one node does not answer, THEN another one from the set is selected
[Figure, step 3: the lookup converges towards the target NodeID]

Kademlia Operation (4/7) - FIND_NODE
Each recipient returns (IP address, UDP port, NodeID) triples
[Figure: the initiator pings newly learned contacts and receives further answers]

Kademlia Operation (5/7) - FIND_NODE
IF a contacted node shows no reaction, the initiator picks another node from the response set
[Figure: one contacted node does not react to the FIND_NODE msg]

Kademlia Operation (6/7) - FIND_NODE
The lookup continues until no NodeIDs closer to the target are returned
[Figure: final round of answers and pings; the unresponsive node is bypassed]
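Putting steps 1-3 together, the lookup can be sketched as an iterative loop; network[] below stands in for the FIND_NODE RPC, and unresponsive nodes would simply contribute no answers:

```python
# Sketch of the iterative FIND_NODE lookup (n parallel queries per round).
def find_node(target, my_contacts, network, n=3, k=20):
    dist = lambda c: c ^ target
    shortlist = sorted(my_contacts, key=dist)[:k]
    queried = set()
    while True:
        # steps 1 and 3: pick the n closest not-yet-queried contacts
        wave = [c for c in shortlist if c not in queried][:n]
        if not wave:
            return shortlist[:k]  # nobody left to ask: the k closest are known
        for node in wave:
            queried.add(node)
            # step 2: each queried node answers with its k closest contacts
            shortlist.extend(network.get(node, [])[:k])
        shortlist = sorted(set(shortlist), key=dist)[:k]

contacts = {5: [6, 7], 6: [7], 7: [7]}
print(find_node(7, [5], contacts, n=2, k=3))  # [7, 6, 5]
```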

Kademlia Operation (7/7)
FIND_VALUE - locating the Value of a Key:
- The process is similar to FIND_NODE: IF a queried node manages the Key, THEN it answers with the Key's Value, ELSE FIND_VALUE obtains the k closest NodeIDs to the Key and asks them for the Value
STORE - saving a (Key, Value) pair on a node:
- Locate the k closest NodeIDs for the Key via FIND_NODE
- Send the STORE msg to these k closest NodeIDs
- These k nodes re-publish the (Key, Value) pair to closer newly joined nodes at hourly intervals
- After 24 hours the entry expires
PING - checking whether a node is online
A sketch of STORE and FIND_VALUE on top of the lookup follows below.
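STORE and FIND_VALUE reduce almost entirely to the find_node sketch above; here stores is an assumed per-node local dictionary standing in for each node's storage:

```python
# Sketch of STORE / FIND_VALUE built on find_node from the previous sketch.
import time

def store(key, value, my_contacts, network, stores, k=20):
    """Place the (Key, Value) pair on the k closest nodes to the key."""
    for node in find_node(key, my_contacts, network, k=k):
        # the timestamp lets nodes expire the entry after 24 hours
        stores.setdefault(node, {})[key] = (value, time.time())

def find_value(key, my_contacts, network, stores, k=20):
    """Ask the k closest nodes to the key; any one of them can answer."""
    for node in find_node(key, my_contacts, network, k=k):
        if key in stores.get(node, {}):
            return stores[node][key][0]
    return None
```

Storing on k nodes instead of one is what keeps values reachable while individual peers churn in and out of the network.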

Third generation - BitTorrent
- Third generation file sharing protocol and system architecture for the distribution of large amounts of data
- A high share of worldwide P2P traffic is caused by BitTorrent
- Based on 3 important components:
  - A web server offering a .torrent file
  - A tracker keeping a list of clients downloading a specific file
  - A client program acting as a peer
- Two classes of peers exist:
  1. Leechers: users who are downloading files; they offer their already downloaded chunks for upload
  2. Seeders: users who have downloaded the complete file and only upload
- Based on swarming
- Strengths: very efficient file distribution system; highly scalable due to swarming
- Weaknesses: the tracker is a single point of failure (BitTorrent extensions without a central tracker, based on Kademlia, avoid this problem); no explicit file search functionality

Swarming
- To increase the efficiency of downloads, clients implement specific strategies
- Used in the BitTorrent system in combination with the Kademlia algorithm
- A file is split into many chunks and can be retrieved by downloading the necessary chunks from different peers
- Swarming: a file sharing client downloads a file from many sources (peers) at the same time
- A swarm is the set of clients downloading the same file
- While a peer is downloading a file, it already offers its downloaded chunks to other peers
- Goal: quickly replicate chunks to a large number of clients
[Figure: file XYZ split into chunks 1-3; a peer still missing chunk 2 downloads it from peers that already hold it]

.torrent file
For files shared via BitTorrent, a .torrent file is created and usually published on some website. It contains:
- the URL of the tracker (server)
- name and description of the file
- the length of the chunks (e.g. 512 kB)
- the number of chunks
- SHA-1 hashes of each piece of the file, for reliability
A user has to search (manually) for this file. After the .torrent file has been retrieved (usually via its URI), it is imported into a BitTorrent client to obtain the information necessary for downloading the associated shared file. A sketch of this metainfo follows below.
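In reality the .torrent file is a bencoded dictionary; the sketch below shows the equivalent structure as a Python dict (field names follow the BitTorrent metainfo convention, the values are made up):

```python
# Sketch of the metainfo carried by a .torrent file.
metainfo = {
    "announce": "http://tracker.example.org:6969/announce",  # tracker URL
    "info": {
        "name": "chapter.pdf",      # suggested file name
        "length": 4_194_304,        # total file size in bytes (4 MB)
        "piece length": 524_288,    # chunk size in bytes (512 kB)
        # "pieces": concatenation of one 20-byte SHA-1 hash per chunk,
        # so the client can verify every downloaded chunk independently
    },
}

# number of chunks = ceil(length / piece length)
info = metainfo["info"]
print(-(-info["length"] // info["piece length"]))  # 8
```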

BitTorrent download procedure
(1, 2) The .torrent file is fetched from the web server and interpreted by the BitTorrent client, which identifies the tracker
(3) The tracker specified for the desired file is contacted, and it returns a list of peers (leechers and seeders) offering parts of this file
(4) The peers are contacted, and the list of chunks offered by each peer is requested
- After the client has retrieved the chunk lists from all peers, it has to decide which chunk to request from which peer

Chunk selection policy
- Rarest first: download the rarest chunk first
  - Each peer keeps statistics on chunk rarity; the indicator is how often a chunk is announced by the other peers
  - Increases the likelihood that all pieces are still available even if the original seed leaves before any one node has downloaded the entire file
- Random first: the first chunk is downloaded randomly
  - Risk otherwise: a new node downloads the rarest chunk first and logs off without sharing it -> the download can be blocked
- Endgame mode: to avoid a slow download at the end, all peers are asked for the still missing chunks
[Figure: a client that has almost downloaded all chunks requests the missing chunk from every peer]
A sketch of rarest-first selection follows below.
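Rarest-first only needs a replication count per chunk; a minimal sketch, assuming each peer's announced chunk set is known to the client (as it is from the have/bitfield exchange):

```python
# Sketch of rarest-first chunk selection.
from collections import Counter

def rarest_first(missing, peer_havelists):
    """missing: chunk indices still needed; peer_havelists: one set per peer."""
    counts = Counter(c for have in peer_havelists
                       for c in have if c in missing)
    # among the chunks some peer can provide, pick the least replicated one
    return min(counts, key=counts.get) if counts else None

peers = [{0, 1, 2}, {0, 1}, {0}]
print(rarest_first(missing={0, 1, 2}, peer_havelists=peers))  # 2 (one source)
```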

Third generation - Freenet
- Decentralized, censorship-resistant distributed data store
- Each participant provides a part of his own hard disk to store files from other participants
- A user has no control over, and no knowledge of, what kind of files his node stores
- All files on each node are encrypted
- A node knows only its neighbour nodes, so requests can only be sent to neighbours
- No global semantic search functionality
- Uses Globally Unique Identifier (GUID) keys for identifying shared files
- Several types of key generation mechanisms may be applied; the most important type of key is the Signed-Subspace Key (SSK)
- Every node has an ID and stores files whose key hash is similar to this ID

Key Management - Signed-Subspace Keys (SSK)
1. A public/private key pair and a symmetric key are generated
2. The file is encrypted using the symmetric key and signed using the private key. Nodes do not store the symmetric key, only the public-key part of the SSK as an index to the data; thus a node hosting the file can plausibly deny knowledge of the stored data
3. The SSK is built from the hash of the public key and the symmetric key. The hash of the public key acts as the index to the data for searching purposes. Furthermore, the public key is stored with the data, so nodes can verify the signature when an SSK file enters a node, and clients can verify the signature when retrieving a file. The symmetric key is used by clients for file decryption
- Structure: SSK@<hash of the public key>,<symmetric key>,<information about crypto mechanism>/<description>-<file version>
- Published via e.g. a website or mail
- Example: SSK@GB3wuHmtxN2wLc7g4yZVydkK6sOT-DuOsUoeHK35w,c63EzO7uBENpiUbHPkMcJYW7i7cOvG42CM3YDduXDs,AQABAAE/rn/lectures/mc/scripts/chapter.pdf-4
A sketch of assembling such a key string follows below.
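The assembly of the key string can be sketched in a few lines. The hash algorithm, the encodings and the AQABAAE crypto-mechanism field (copied from the slide's example) are placeholder assumptions, not Freenet's exact format:

```python
# Sketch of assembling an SSK-style key string.
import base64
import hashlib
import os

def make_ssk(public_key: bytes, description: str, version: int) -> str:
    b64 = lambda b: base64.urlsafe_b64encode(b).rstrip(b"=").decode()
    # hash of the public key: the routing/search index stored on nodes
    pubkey_hash = hashlib.sha256(public_key).digest()
    # symmetric key: travels only inside the SSK string, never to storage nodes
    symmetric_key = os.urandom(32)
    return (f"SSK@{b64(pubkey_hash)},{b64(symmetric_key)},"
            f"AQABAAE/{description}-{version}")

print(make_ssk(b"...public key bytes...",
               "rn/lectures/mc/scripts/chapter.pdf", 4))
```

The split is the point: storage nodes can verify signatures via the stored public key, yet cannot decrypt the content, because only clients holding the full SSK string possess the symmetric key.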

Requesting files and data transfer
- A user obtains an SSK (e.g. SSK@GB...AE/rn/lectures/mc/scripts/chapter.pdf-4)
- The client software extracts from this SSK the hash of the public key and the symmetric decryption key
(1) It sends a request message containing the hash of the public key, a TTL value and a unique request ID
(2) The initiator node checks its own file store
(3) IF there is no match THEN the request message is forwarded to the neighbour node with the closest ID, until the TTL expires
(4) IF the file is found THEN a reply message is sent back hop by hop to the requesting node (each hop only knows its neighbour!)
- The initiator decrypts the encrypted file: Decrypt(encrypted file, decryption key), e.g. Decrypt(bla.bla, 3op7) = chapter.pdf
[Figure: a DataRequest (hash = ah66, TTL counting down per hop) forwarded towards nodes with the closest IDs; a DataFail on a dead end, then a DataReply routed back along the request path]
A sketch of this routing follows below.
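The closest-ID forwarding with backtracking on dead ends can be sketched as a recursive search; the node IDs, the distance measure and the storage layout are illustrative assumptions:

```python
# Sketch of Freenet-style request routing with TTL and backtracking.
def request(node, key_hash, ttl, neighbours, storage, visited=None):
    visited = visited if visited is not None else set()
    visited.add(node)
    if key_hash in storage.get(node, {}):
        return storage[node][key_hash]  # found: DataReply travels back hop by hop
    if ttl == 0:
        return None                     # DataFail: TTL expired
    # try unvisited neighbours in order of ID closeness; back off on dead ends
    for nxt in sorted((n for n in neighbours.get(node, []) if n not in visited),
                      key=lambda n: abs(n - key_hash)):
        found = request(nxt, key_hash, ttl - 1, neighbours, storage, visited)
        if found is not None:
            return found
    return None

neighbours = {1: [4, 9], 4: [1, 7], 9: [1], 7: [4]}
storage = {7: {8: "chapter.pdf (encrypted)"}}
print(request(1, key_hash=8, ttl=3, neighbours=neighbours, storage=storage))
```

Because replies retrace the request path, each node only ever learns about its direct neighbours, which is the source of Freenet's anonymity properties.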

Conclusion
As decentralized as possible (anonymity/security), as centralized as necessary (efficient searching), and as fast as BitTorrent!?

References
- Ralf Steinmetz, Klaus Wehrle (eds.): Peer-to-Peer Systems and Applications, Springer, 2005
- Petar Maymounkov and David Mazieres: Kademlia: A Peer-to-peer Information System Based on the XOR Metric, http://www.scs.stanford.edu/~dm/home/papers/kpos.pdf
- Ian Clarke and Oskar Sandberg: Freenet: A Distributed Anonymous Information Storage and Retrieval System, http://www.cs.uiowa.edu/~ghosh/freenet.pdf
- Freenet project: http://www.freenetproject.org/