Public Review for Design Choices for Content Distribution in P2P Networks. Anwar Al Hamra and Pascal Felber


Peer-to-peer file distribution accounts for a tremendous amount of bandwidth consumption on today's Internet. One of the most popular protocols, BitTorrent, organizes participating peers into a mesh. Many academically proposed protocols, such as SplitStream, instead organize peers into trees or forests of trees. All of these protocols make specific policy choices about which peers a given node should transmit to next, which file block should be sent next, and how many peers a given node should communicate with at a given time. In spite of the richness of the design space for distribution protocols, little is known about the impact of these design choices. This paper uses a combination of modeling and simulation to explore the impact of different design choices, and it specifically attempts to tackle the question of whether there is any fundamental advantage to tree-based or mesh-based protocols when comparing them to each other.

The reviewers liked two aspects of this paper. First, the results suggest that mesh- and tree-based approaches can achieve competitive results and, moreover, that under certain policy choices they become quite similar to each other in practice. Second, the results show that the impact of different policies can be large and counter-intuitive. While it is hard to predict whether peer-to-peer distribution will have a lasting presence on the Internet, if it does, this paper confirms that there are still interesting unanswered questions surrounding it.

One criticism stood out from the reviewers' comments about this paper: any simulation-driven analysis is suspect, and like many similar papers, this one had to make some questionable simplifying assumptions. Even though our community has been developing large scale emulation platforms and experimental testbeds, we still have not found a way to evaluate a system intended to run on millions of nodes around the world, save for actually deploying it at scale. Until we crack this problem, we will always have some reservations about our results.

Public review written by Steve Gribble, University of Washington, Seattle, WA, USA

ACM SIGCOMM Computer Communication Review, Volume 35, Number 5, October 2005


Design Choices for Content Distribution in P2P Networks

Anwar Al Hamra, IUT of Nice, GTR Departement, Sophia Antipolis, FRANCE
Pascal A. Felber, Université de Neuchâtel, Institut d'informatique, Switzerland, pascal.felber@unine.ch

ABSTRACT

Content distribution using the P2P paradigm has become one of the most dominant services in the Internet today. Most of the research effort in this area focuses on developing new distribution architectures. However, little work has gone into identifying the principal design choices that shape the behavior of the system. In this paper, we identify these design choices and show how they influence the performance of different P2P architectures. For example, we discuss how clients should organize and cooperate in the network. We believe that our findings can serve as guidelines in the design of efficient future architectures.

Categories and Subject Descriptors: C.2.0 [General]: Data communications; C.2.1 [Network Architecture and Design]: Network communications; C.2.4 [Distributed Systems]: Distributed applications

General Terms: Design, Performance

Keywords: peer to peer networks, content distribution, performance evaluation

1. INTRODUCTION

Peer-to-peer systems are distributed Internet applications in which the resources of a large number of autonomous participants are harnessed in order to carry out the system's function. In such systems, peers connect to each other and form an application overlay network on top of conventional Internet protocols. P2P systems are experiencing great success mainly because they are self-organized, very reliable, and fault tolerant. The P2P paradigm was initially discussed for file sharing services such as Napster and Gnutella. However, P2P networks also represent an efficient infrastructure for content distribution. The most promising property of P2P distribution networks is that the responsibility of distributing the content is spread amongst downloaders, which greatly reduces the load on the server.
In these networks, clients contribute to the resources of the system as a function of their upload capacities. Clients download the content and upload it to other clients and so on. As a consequence, P2P distribution networks are seen as an ultimate solution to scalability. The larger the number of clients in the system, the larger its service capacity and the better it scales and serves the content.

* This work has been entirely done at Institut Eurecom.

There are two popular solutions to distribute the content in P2P networks. The first one organizes clients in a mesh. In mesh-based approaches, each node knows a subset of clients, called neighbors. Neighbors cooperate and exchange the content they hold according to a predefined cooperation strategy. The second solution is to construct a distribution tree on top of clients. Tree-based approaches can be used under a particular scenario where clients arrive in the system very close in time. Biersack et al. [1] recently showed that tree-based approaches can be very efficient for distributing a file to a large number of clients. The authors proved that the number of served clients in tree-based approaches scales exponentially in time.

Given these two solutions, we still do not know which one performs the best. Thus, a first design choice is to select the way clients must organize themselves in the network. In the first part of the paper, we compare the number of served clients over time in both tree-based and mesh-based approaches. Our results demonstrate that mesh-based approaches take less time than tree-based approaches to serve the same number of clients. To the best of our knowledge, this comparison has never been done previously.

Once we know how clients must be organized, i.e., in a mesh, the second design issue is related to the way peers¹ should cooperate to retrieve the content.
We need clever cooperation strategies to efficiently leverage the available bandwidth resources at the different peers and to rapidly distribute the content. A cooperation strategy is the result of many factors coupled together. These factors include mainly (i) the peer selection strategy, (ii) the chunk selection strategy, and (iii) the network degree. Given a set of neighbors, the peer selection strategy indicates which neighbor must be served first. Once the neighbor to be served is chosen, the chunk selection strategy tells the peer which chunk of the content must be transmitted. We assume that the content is divided into pieces called chunks. Finally, the network degree represents the maximum number of active download and upload connections a peer can simultaneously maintain. In the second part of the paper, we investigate the impact of the above factors on the cooperation between peers. We show through examples and simulations how these factors play a key role in the behavior of mesh-based approaches.

¹ We use the term peer only when we want to refer to both clients and server at the same time.

Our results and discussions provide interesting hints and observations that can be very helpful in designing efficient distribution architectures.

The rest of the paper is structured as follows. In section 2 we introduce the parameters that we use in the sequel. Section 3 briefly overviews the basic design of tree-based approaches while section 4 discusses mesh-based ones. In section 5 we compare the time needed to serve the same number of clients in both tree-based and mesh-based approaches. We then focus on mesh-based approaches and extensively investigate their performance in section 6. We conclude the paper in section 7 with a discussion of the results.

2. NOTATION

We denote by $C$ and $|C|$ respectively the set and number of all chunks in the file being distributed. $D_i$ corresponds to the set of chunks that client $i$ has already downloaded. In contrast, $M_i$ represents the set of chunks that client $i$ is still missing. Under these assumptions, we obtain $M_i \cup D_i = C$ and $M_i \cap D_i = \emptyset$. Similarly, $d_i = |D_i| / |C|$ denotes the proportion of chunks that client $i$ has already downloaded, and $m_i = |M_i| / |C|$ corresponds to the proportion of chunks that client $i$ is still missing.

The parameter $S_{up}$ stands for the upload capacity of the server while $C_{up}$ and $C_{down}$ represent respectively the upload and download capacity of clients. The indegree of peers is represented by $P_{in}$ and the outdegree by $P_{out}$.

One unit of time is the time needed to download the file at rate $r$, where $r$ is a constant expressed in Kbps. It follows that, for a file of $|C|$ chunks of equal size, $1/|C|$ unit of time is needed to download one single chunk at rate $r$. Finally, $Life$ denotes the time the client stays online after it has completely downloaded the file. For example, $Life = 0$ means that the client is selfish and disconnects immediately after it finishes the download.

3. TREE-BASED APPROACHES

As their name indicates, tree-based approaches construct a tree to distribute the content to clients.
These approaches differ in the strategy employed to construct the tree. They also differ in the algorithms used to handle bandwidth degradation on a link or client failures. A recent paper by Biersack et al. [1] analyzes the performance of three tree-based architectures, namely $Linear$, $Tree_k$, and $PTree_k$. These architectures cover almost all existing tree-based ones. In their analysis, the authors assume that all $N$ clients arrive in the system at time $t = 0$ and that all clients have homogeneous and symmetric bandwidth capacities. They also assume that the server stays online indefinitely. In this section we briefly overview the basic idea of each of these architectures. This overview will facilitate the comparison that we perform next between tree-based and mesh-based approaches.

Linear Architecture. $Linear$ organizes clients in a chain. The server uploads the file to client 1, which in turn uploads the file to client 2, and so on. When the server has uploaded the entire file to client 1, it becomes free and starts serving a new client. As a consequence, a new chain is initiated and the same process is repeated. Under these assumptions, the number of served clients with $Linear$ is shown to increase quadratically in time.

Tree Distribution with Outdegree k ($Tree_k$). With $Tree_k$, clients are organized in a regular tree with an outdegree $k$. In $Tree_k$, an interior node in the tree serves the file to $k$ clients simultaneously. Compared to $Linear$, Biersack et al. demonstrate that this architecture performs much better because clients are served in parallel rather than sequentially. The number of served clients in $Tree_k$ increases exponentially in time. However, the main drawback of $Tree_k$ is that clients located at the leaves of the tree do not help in distributing the file.

Forest of Parallel Trees ($PTree_k$). This architecture is inspired by SplitStream [2], proposed initially for live streaming applications.
With $PTree_k$, the file is split into $k$ parts and each part is distributed over an independent tree rooted at the server. To construct these independent trees, $PTree_k$ requires each client to be an interior node in one tree and a leaf node in the remaining ones. This architecture performs the best compared to $Linear$ and $Tree_k$. In $PTree_k$, all clients finish downloading the file rapidly and almost at the same time. There are two interesting properties that make this architecture highly efficient. The first one is that clients are served in parallel as in $Tree_k$. The second property is that, by constructing parallel trees, this architecture ensures that all clients help in distributing the file.

4. MESH-BASED APPROACHES

Usually tree-based approaches require the server to run complex algorithms to construct and maintain the distribution trees. In contrast, in mesh-based approaches clients self-organize in a mesh. The way clients organize and cooperate is responsible for the performance of mesh-based approaches. A cooperation strategy between peers consists of a peer selection strategy coupled with a chunk selection strategy. We also add the network degree as a critical factor that influences the behavior of a cooperation strategy. Recall that the network degree is represented by the maximum number of active download and upload connections a peer can simultaneously maintain, i.e., $P_{in}$ and $P_{out}$.

4.1 Peer Selection Strategy

The peer selection strategy defines trading relationships between peers and affects the way the network self-organizes. In our simplified model we assume that (i) a client can communicate with any other client in the network and (ii) each client knows which chunks the other clients in the system hold. Our results should remain valid in practice given that each client knows a large enough subset of other clients. For instance, a client in BitTorrent [3] has between 40 and 120 neighbors, as we observed in [6].
When a client has some chunks available and some free upload capacity, it uses a peer selection strategy to locally determine which client it will serve next. In this paper, we consider two strategies.

Least missing. Preference is given to the clients that have many chunks, i.e., we serve in priority client $j$ with $d_j \geq d_i, \forall i$. This strategy is inspired by the SRPT (shortest remaining processing time) scheduling policy that is known to minimize the service time of jobs [7].

Most missing. Preference is given to the clients that have few chunks (newcomers), i.e., we serve in priority client $j$ with $d_j \leq d_i, \forall i$. We expect this strategy to engage clients into the file delivery very fast and keep them busy for most of the time.

We note that Least Missing and Most Missing have already been introduced by Felber et al. in [4]. In this paper we use these two strategies to achieve two goals. The first one is to show that mesh-based approaches are more efficient than tree-based ones. The second goal is to explain the importance of the peer selection strategy as a key design factor for mesh-based approaches.

4.2 Chunk Selection Strategy

Once the receiving client $j$ is selected, the sending client $i$ runs an algorithm to figure out which chunk to send to client $j$. The chunk selection strategy determines the way the chunks get duplicated in the network. A good strategy is key to achieving good performance. Bad strategies may result in many clients holding chunks that their neighbors do not need. To avoid such a scenario, we require the sending client $i$ to schedule the rarest chunk $C_r \in (D_i \cap M_j)$ among those that it holds and the receiving client $j$ needs. Rarity is computed from the number of instances of each chunk held by the clients known to the sender. This strategy is inspired by BitTorrent and is expected to maximize the number of copies of the rarest chunks in the system.

4.3 Network Degree

Given the peer and chunk selection strategies, one interesting question is how to choose the indegree/outdegree of a client subject to its upload/download capacity. By maintaining $k$ concurrent upload connections, a client $i$ could serve $k$ clients at once and intuitively quickly upload the chunks that it holds. In addition, by serving $k$ different clients simultaneously, the client can fully use its upload capacity and thus maximize its contribution to the system throughput.
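The peer and chunk selection strategies of sections 4.1 and 4.2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary that maps each client id to its set $D_i$, and the tie-breaking by lowest id are all assumptions made for the example. Rarity is counted over every client in the dictionary, which matches the simplified model in which each client knows what all others hold.

```python
from collections import Counter


def select_peer(candidates, downloaded, strategy):
    """Pick the client to serve next among `candidates`.

    downloaded maps a client id to the set of chunk ids it holds (D_i).
    Ties are broken by lowest client id (an assumption of this sketch).
    """
    if strategy == "least_missing":
        # Serve the client holding the most chunks (largest d_j).
        return max(candidates, key=lambda c: (len(downloaded[c]), -c))
    if strategy == "most_missing":
        # Serve the client holding the fewest chunks (smallest d_j).
        return min(candidates, key=lambda c: (len(downloaded[c]), c))
    raise ValueError(f"unknown strategy: {strategy}")


def select_rarest_chunk(sender, receiver, downloaded, all_chunks):
    """Return the rarest chunk in D_sender ∩ M_receiver, or None if empty."""
    missing = all_chunks - downloaded[receiver]   # M_j
    useful = downloaded[sender] & missing         # D_i ∩ M_j
    if not useful:
        return None
    # Count how many known clients hold each chunk.
    copies = Counter(ch for held in downloaded.values() for ch in held)
    # Fewest copies first; lowest chunk id breaks ties (assumption).
    return min(useful, key=lambda ch: (copies[ch], ch))
```

For instance, with a peer 0 holding four chunks and clients 1, 2 and 3 holding two, one and zero chunks respectively, Least Missing would pick client 1 and Most Missing would pick client 3.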
However, this intuition is not always correct. The upload capacity of client $i$ would be divided amongst the $k$ different connections. The larger the value of $k$, the lower the bandwidth dedicated to each connection. As a result, a large value of $k$ might slow down the rate at which chunks get distributed in the network. For instance, for a tree distribution, a small outdegree of $k = 2$ is optimal [1]. In our analysis, we will consider different values for the indegree $P_{in}$ and outdegree $P_{out}$ of peers.

Note that there are also technical parameters that might have a significant impact on the system behavior. One parameter is the available bandwidth in the network. In this paper we assume that peers have limited upload/download capacities but that the network has infinite bandwidth. The reason is that we want to focus on the advantages and the shortcomings related to our architectures and not on external factors. Another important parameter is data management. Previous work [2, 1] has advised splitting the file into chunks, which permits concurrent downloads from multiple peers. In addition, with this technique, instead of waiting to download the whole file, a client can start serving a chunk as soon as it finishes downloading it. Still, the choice of the number and the size of the chunks is critical. The larger the number and the smaller the size of the chunks, the sooner the clients participate in the file delivery, which in turn improves the system performance. However, this improvement comes at the cost of a higher overhead induced by more messages exchanged between clients. So, the service provider can choose the number and size of chunks based upon the required goal. Throughout this paper, we consider a common chunk size of 256 KB and a number of chunks $|C| = 200$, which makes a file of 51.2 MB. We also consider the case of one single file.

5. MESH-BASED VS. TREE-BASED APPROACHES

In this section, we compare the number of served clients in mesh-based and tree-based approaches. First of all, we compare Least Missing to $Tree_k$. We demonstrate that Least Missing can achieve a similar performance to a tree distribution in a simpler way. Then we compare Most Missing to $PTree_k$. We show that Most Missing performs better than $PTree_k$ while avoiding the penalty of constructing parallel trees.

5.1 Least Missing vs. $Tree_k$

In Least Missing, each peer tries to serve first the neighbor that has the largest number of chunks amongst all other neighbors. Recall that peers serve the rarest chunks in priority. Despite its simplicity, the number of served clients with Least Missing can scale exponentially in time as in $Tree_k$. The key idea is to set the indegree of peers to $P_{in} = 1$ and the outdegree to $P_{out} = k$. For ease of explanation, we assume $k = 2$. We analytically prove that Least Missing with $P_{in} = 1$ and $P_{out} = 2$ is equivalent to a tree distribution with an outdegree $k = 2$.

In this analysis, we make the following assumptions: (i) all peers have equal upload and download capacity $S_{up} = C_{up} = C_{down} = r$ and maintain an outdegree $P_{out} = 2$ and an indegree $P_{in} = 1$; (ii) a client can start serving the file once it receives a first chunk; (iii) each client remains online until it delivers twice the number of chunks it receives from the system, while the server stays online indefinitely; (iv) all $N$ clients arrive in the system at time $t = 0$.

At time $t = 0$, the server starts serving two disjoint chunks, $C_1$ and $C_2$, to two clients 1 and 2, each at rate $r/2$. The server finishes uploading chunks $C_1$ and $C_2$ to these two clients by time $t = 2/|C|$. Recall that one unit of time is the time needed to download the whole file at rate $r$ and that the file comprises $|C|$ chunks of equal size.
Given that this policy favors clients that have the largest number of chunks, the server will continue uploading to clients 1 and 2 until they completely download the whole file. At time $t = 2/|C|$, client 1 holds chunk $C_1$ and client 2 holds chunk $C_2$. So each of them starts serving two new clients, each at rate $r/2$, and consequently 4 new clients are engaged into the file delivery. This same process repeats and one new level is added each $2/|C|$ unit of time (see figure 1(a)). Level $i$ includes $2^{i+1}$ clients, which are served by the $2^i$ clients located at level $(i-1)$. As a result, the number of clients in the system evolves as in a tree with an outdegree $k = 2$.

[Figure 1: Scaling behavior of Least Missing vs. time for different values of $P_{in}$ and $P_{out}$. Panel (a): Least Missing with $P_{in} = 1$ and $P_{out} = 2$. Panel (b): Least Missing with $P_{in} = P_{out} = 1$. The black circle represents the server while the black squares represent the clients.]

We can easily verify that the scenario $P_{in} = P_{out} = 1$ gives a linear chain as described in figure 1(b). In this figure, we assume that the number of chunks is $|C| = 3$. After one unit of time, the server finishes uploading the file to the head of the first chain. It then starts serving a new client and consequently, a new chain is initiated. More formally, a new chain starts each one unit of time until all clients are served.

5.2 Most Missing vs. $PTree_k$

Having seen how Least Missing can scale exponentially in time, we now investigate the scaling behavior of Most Missing. In this policy, clients that have the lowest number of chunks are served in priority. Thus, we expect Most Missing to engage clients into the file delivery as fast as possible and keep them busy for most of the time. In other words, we believe that Most Missing has the same properties that make $PTree_k$ very efficient. To validate our intuition, we compare the time needed to serve $N = 10^4$ clients with Most Missing and $PTree_k$ under the scenario given in table 1. Note that all $N = 10^4$ clients are assumed to arrive simultaneously at time $t = 0$ and that the server stays online indefinitely.

Table 1: Parameter values
C_up      C_down    S_up      P_in  P_out  Life
128 Kbps  128 Kbps  128 Kbps

In figure 2 we plot the number of served clients against time for Most Missing as obtained from simulations². This figure shows that Most Missing behaves similarly to $PTree_k$: all clients complete very fast and almost at the same time.

[Figure 2: Number of served clients vs. time (in hours) for Most Missing, with $N = 10^4$, $P_{in} = 1$, $C_{up} = C_{down} = S_{up} = 128$ Kbps.]

² Details about our simulator are given in section 6.

If we compare the two architectures in absolute terms and under the parameter values given in table 1, we find that Most Missing needs 3472 seconds to serve all the $10^4$ clients while $PTree_k$ lasts a bit longer, 3584 seconds³. This result shows that we can achieve a high efficiency while avoiding the overhead of constructing parallel trees. Note that the service time achieved by Most Missing, i.e., 3472 seconds, is very close to the optimal one. For $S_{up} = C_{up} = C_{down} = 128$ Kbps and for a file of 51.2 MB, the optimal transmission time of the file is 3200 seconds.

³ This value is computed using eq. (6), page 7 in [1].

This high performance of Most Missing is actually due to the fast distribution of chunks. In figure 3, we draw the number of copies for each chunk against time. This figure shows that, when a chunk is injected in the network, it gets distributed at an exponential rate.

[Figure 3: Chunks distribution over time for Most Missing (number of copies vs. index of the chunk and time in hours), with $N = 10^4$, $P_{in} = 1$, $S_{up} = C_{up} = C_{down} = 128$ Kbps.]

While this behavior of Most Missing is not intuitive, it can be easily understood through the following example. Consider the case of $N = 15$ clients and a file of $|C| = 4$ chunks. At time $t = 0$, the server sends the first chunk $C_1$ to client 1 at rate $r$. At time $t = 1/|C|$, client 1 has completely received chunk $C_1$. Given that the policy tends to always serve the rarest chunks to the clients that have the fewest chunks, at $t = 1/|C|$ the server schedules a new chunk $C_2$ to a new client. Similarly, client 1 starts delivering its chunk $C_1$ to another new client, say client 2.

Let us focus for the moment on chunk $C_1$. At time $t = 2/|C|$, client 1 has completely uploaded chunk $C_1$ to client 2. As a result, at time $t = 2/|C|$, there are two clients, 1 and 2, that maintain a copy of the first chunk $C_1$. From this time on, these two clients deliver that chunk to two new clients and so forth. Hence, the number of copies of chunk $C_1$ in the system doubles each $1/|C|$ unit of time. After $i/|C|$ unit(s) of time, there are $2^{i-1}$ clients that have the first chunk and only the first chunk. This same analysis can be applied to chunk $C_2$. Chunk $C_2$ was injected in the network by the server at time $t = 1/|C|$, i.e., $1/|C|$ unit of time after the first chunk $C_1$. By time $t = i/|C|$, there are $2^{i-2}$ clients that have chunk $C_2$ and only chunk $C_2$. More formally, at time $t = i/|C|$, chunk $j$, with $j \leq i$, has $2^{i-j}$ copies. Figure 4 summarizes this growth.

[Figure 4: Growth of number of chunks for Most Missing, showing chunks $C_1$ to $C_4$ spreading from the server to the clients at times $t = 1/|C|$, $2/|C|$, $3/|C|$ and $4/|C|$.]

From the above analysis, we can easily verify that, at time $t = 4/|C|$, each client holds only one chunk. Clients 1,..., 8 hold chunk $C_1$, clients 9,..., 12 hold chunk $C_2$, clients 13 and 14 hold chunk $C_3$, and client 15 holds chunk $C_4$. After time $4/|C|$, two extreme scenarios can happen. The first one is that the server chooses to serve client 1. At the same time, clients 2,..., 8 exchange their chunks with clients 9,..., 15. In this case, no clients are idle and consequently, the number of copies for each chunk keeps on doubling each $1/|C|$ unit of time. The second scenario is that the server chooses to serve client 9 and, at the same time, clients 10, 11, and 12 exchange their chunks with clients 13, 14, and 15. Under this scenario, half of the clients, i.e., clients 1,..., 8, remain idle.
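The doubling argument of the Most Missing example can be checked numerically. The sketch below is only an illustration of the idealized count stated above ($2^{i-j}$ copies of chunk $j$ at time $t = i/|C|$); the function names are hypothetical, and the formula is assumed to hold only while enough clients without any chunk remain and while $i$ does not exceed the number of chunks.

```python
def copies_of_chunk(i, j):
    """Copies of chunk j at time t = i/|C| under idealized doubling.

    Chunk j is first fully received at step j, then doubles every step,
    so it has 2**(i - j) copies at step i (and none before step j).
    """
    if j > i:
        return 0
    return 2 ** (i - j)


def snapshot(i):
    """Map each chunk index injected so far to its number of copies at step i."""
    return {j: copies_of_chunk(i, j) for j in range(1, i + 1)}


# The example of the text: |C| = 4 chunks, observed at t = 4/|C|.
state = snapshot(4)
clients_holding_a_chunk = sum(state.values())
```

With four chunks, the snapshot at step 4 gives 8, 4, 2 and 1 copies for chunks $C_1$ to $C_4$, which accounts for exactly the 15 clients of the example, each holding a single chunk.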
In this second scenario, the number of chunks in the system keeps increasing exponentially in time but not as fast as in the first scenario. It is hard, if not impossible, to predict which scenario takes place each $1/|C|$ unit of time. However, we believe the behavior of Most Missing to be somewhere between these two extreme scenarios: a few clients remain idle while the majority exchange their chunks.

5.3 Preliminary Conclusion

In this section we studied the evolution of the number of served clients in both mesh-based and tree-based approaches. We first compared Least Missing and $Tree_k$. We analytically demonstrated that, by setting the indegree of peers to $P_{in} = 1$ and their outdegree to $P_{out} = k$, Least Missing behaves like $Tree_k$. We then showed via simulations that Most Missing takes less time than $PTree_k$ to serve the same number of clients. In addition, in our comparison, we assumed peers with homogeneous and symmetric bandwidth capacities and no early departures of clients. Under this scenario, tree-based approaches exhibit their best performance. Thus, our results show that, even under such optimal scenarios, mesh-based approaches are more efficient than tree-based ones.

6. DESIGN CHOICES FOR MESH-BASED ARCHITECTURES

In the following, we focus on mesh-based approaches and discuss the main factors that influence their performance. These factors include the peer selection strategy, the chunk selection strategy, and the network degree. For lack of space, we give results for the scenario where all clients arrive in the system at the same time. This could happen when critical data, e.g., an anti-virus update, must be pushed to a set of machines as fast as possible. It can also be the case of a flash crowd where a large number of clients arrive in the system very close in time.
The simulation results that we present in this paper have first been validated under various system assumptions such as a Poisson arrival of clients and heterogeneous/asymmetric bandwidth capacity of peers. For more details, see our technical report [5].

We digress briefly to explain the simulation methodology that we use. Our simulator is essentially event-driven, with events being scheduled and mapped to real time with millisecond precision. The transmission delay of each chunk is computed dynamically according to the link capacities (minimum of the sender upload and receiver download capacities) and the number of simultaneous transfers on the links (bandwidth is equally split between concurrent connections). Once a peer $i$ holds at least one chunk, it becomes a potential server. It first sorts its neighboring peers according to the specified peer selection strategy. It then iterates through the sorted list until it finds a peer $j$ that (i) needs some chunks from $D_i$ ($D_i \cap M_j \neq \emptyset$), (ii) is not already being served by peer $i$, and (iii) is not overloaded. We say that a peer is overloaded if it has reached its maximum number of connections and has less than 128 kbps bandwidth capacity left. Peer $i$ then applies the specified chunk selection strategy to choose the best chunk to send to peer $j$. Peer $i$ repeats this whole process until it becomes overloaded or finds no other peer to serve.

6.1 Influence of Peer Selection Strategy

To reveal the importance of the peer selection strategy as a key design factor, we compare in figure 5 the scaling behavior of the two opposite strategies, Least Missing and Most Missing. The results that we plot in this figure are for the scenario depicted in table 1. This is the same scenario that we considered in section 5.2 when comparing Most Missing to $PTree_k$. Despite its simplicity, this scenario provides new insights while keeping the analysis extremely simple.
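The serving loop and the transmission delay described in the simulation methodology above can be sketched in a few lines of Python. This is a hedged illustration, not the authors' simulator: the function names, the `serving` set of (sender, receiver) pairs and the `overloaded` callback are hypothetical, and the way the two equal-split shares are combined (taking the minimum of the sender's and the receiver's per-connection share) is one possible reading of the description. Units are assumed decimal (1 KB = 1000 bytes, 1 Kbps = 1000 bit/s), which is consistent with the 51.2 MB file and the 3200 seconds quoted in section 5.2.

```python
def transfer_seconds(chunk_bytes, up_kbps, down_kbps,
                     sender_transfers=1, receiver_transfers=1):
    """Time to send one chunk when bandwidth is split equally per connection."""
    # Per-connection share on each side; the slower side limits the transfer.
    rate_kbps = min(up_kbps / sender_transfers, down_kbps / receiver_transfers)
    return chunk_bytes * 8 / (rate_kbps * 1000)


def next_target(sender, ranked_neighbors, downloaded, serving, overloaded):
    """Return the first neighbor, in peer-selection order, that can be served.

    ranked_neighbors: neighbor ids already sorted by the peer selection strategy.
    downloaded: dict mapping a peer id to the set of chunks it holds (D_i).
    serving: set of (sender, receiver) pairs with a transfer in progress.
    overloaded: callable telling whether a peer cannot accept more traffic.
    """
    for j in ranked_neighbors:
        if j == sender:
            continue
        useful = downloaded[sender] - downloaded[j]   # D_i ∩ M_j
        if useful and (sender, j) not in serving and not overloaded(j):
            return j
    return None
```

Under these assumptions, a 256 KB chunk sent over a single 128 Kbps connection takes 16 seconds, and twice as long when the sender splits its upload capacity between two concurrent transfers.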
As we can observe from figure 5, Most Missing performs much better than Least Missing. In absolute terms, Most Missing takes 3472 seconds, about 1 hour, to serve $10^4$ clients. In contrast, to serve the same number of clients, Least Missing needs much longer, up to about 7 hours. However, from figure 5 we can also notice that Least Missing optimizes the service time of the first few clients. In contrast to Most Missing, Least Missing quickly pushes a few clients to completion and keeps the majority of clients early in their download.

[Figure 5: The number of served clients against time (in hours) for Least Missing and Most Missing, with $N = 10^4$, $P_{in} = 1$, $S_{up} = C_{up} = C_{down} = 128$ Kbps, $Life = 0$.]

We explain the behavior of Least Missing as follows. Under the assumptions of $P_{in} = P_{out} = 1$ and $C_{up} = C_{down} = S_{up} = r$, the Least Missing policy would result in one single linear chain. Such a chain grows by one client each $1/|C|$ unit of time (see figure 1(b)). In that chain, the download time of each client is one unit of time. This works as follows. Assume the server starts delivering the file at time $t = 0$ to a first client 1. By time $t = 1/|C|$, client 1 has received a first chunk and starts serving a second client 2, and so forth. More formally, client $i$ receives a first byte of the file by time $t = (i-1)/|C|$. In addition, this client $i$ will always have one and only one chunk more than client $i+1$. This means that the root of the chain, in our scenario client 1, has the largest number of chunks, then client 2, etc.

At time $t = 1$, the server finishes uploading the file to client 1. Even though the server becomes free, it does not initiate a new chain. Indeed, we assume here that the client disconnects once it receives the whole file, i.e., $Life = 0$. So, at time $t = 1$, client 1 leaves the network and client 2 is left stranded. In addition, client 2 has the largest number of chunks amongst all other clients in the system. Therefore, at time $t = 1$, the server delivers to client 2 the last chunk this client misses. At time $t = 1 + 1/|C|$, the same process repeats. This means that client 2 disconnects and the server uploads to client 3 the last chunk this client misses. As a consequence, the server will always be delivering a last chunk to the client located at the root of the chain.
Thus, Least Missing serves clients sequentially in a chain, and the overall distribution time of the file is much longer than with Most Missing. Note that our goal in this comparison is not to prove that Most Missing outperforms Least Missing. Instead, we aim to show that the peer selection strategy is a key design factor that must never be ignored.

Conclusions

In this section we compared the two strategies, Least Missing and Most Missing, under a basic and homogeneous scenario. This scenario clearly shows how the peer selection strategy guides the behavior of the system. Most Missing optimizes the overall service time because it rapidly engages clients in the file delivery and keeps them busy most of the time. In contrast, Least Missing minimizes the delay experienced by the first few clients, while the last client to complete experiences a large delay.

6.2 Influence of Chunk Selection Strategy

In their basic version, Most Missing and Least Missing require peers (server and clients) to serve the rarest chunks first, i.e., the least duplicated chunks in the system. In this section we investigate a possible simplification of the system: we allow peers to schedule chunks at random. The sending peer $i$ selects a chunk $C \in (D_i \cap M_j)$ at random, i.e., among the chunks that it holds and that the receiving client $j$ needs. Under this assumption, we refer to Most Missing as Most Missing Random and to Least Missing as Least Missing Random. Our goal with Most Missing Random and Least Missing Random is to see whether random chunk selection can be integrated into the system without sacrificing much performance. The simulation results that we provide in this section are for the same scenario that we considered in section 5.2 (see table 1). For space reasons, we give results only for Most Missing Random.
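The two chunk selection rules compared in this section can be sketched as follows. This is an illustrative sketch, not the authors' simulator code: the function names, the use of Python sets for the chunks held by the sender and missing at the receiver, and the `copies` dictionary giving the number of copies of each chunk in the system are all assumptions made for the example. The tie-breaking rule in the rarest-first variant is likewise an arbitrary choice.

```python
import random


def pick_random_chunk(held, missing, rng=random):
    """Pick uniformly among chunks the sender holds and the receiver misses."""
    candidates = sorted(set(held) & set(missing))
    if not candidates:
        return None  # the sender has nothing useful for this receiver
    return rng.choice(candidates)


def pick_rarest_chunk(held, missing, copies):
    """Pick the least duplicated chunk among the useful ones.

    `copies` maps a chunk index to its number of copies in the system.
    Ties are broken by the lowest chunk index (an arbitrary choice).
    """
    candidates = set(held) & set(missing)
    if not candidates:
        return None
    return min(candidates, key=lambda c: (copies.get(c, 0), c))


# Example: the sender holds chunks 0-2, the receiver misses chunks 1 and 2.
print(pick_rarest_chunk({0, 1, 2}, {1, 2}, {0: 5, 1: 4, 2: 1}))  # → 2
```

Note that the rarest-first rule needs the copy counts as input, whereas the random rule only needs the two chunk sets, which is what makes it simpler to deploy.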
The broad conclusions, however, also apply to Least Missing Random.

Most Missing Random

The first impact of the chunk selection strategy is visible in figure 6, where the number of served clients with Most Missing Random scales differently from Most Missing. This figure shows that, with Most Missing Random, around 2000 clients finish at almost the same time, within 4160 seconds of simulation (about 1.15 hours). Then, the number of completed clients increases slowly: it grows by one client every chunk-transfer time (the time needed to send one chunk at full rate), which is equivalent to 16 seconds under the parameter values of table 1.

[Figure 6: Impact of chunk selection strategy on Most Missing (rarest versus random chunk selection). Parameters: $N = 10^4$, $P_c = P_s = 1$, $S_{up} = C_{up} = C_{down} = 128$. Axes: time (in hours) versus number of served clients.]

To explain this behavior of Most Missing Random, we plot the distribution of chunks over time in figure 7. A closer look at this figure shows that, after 1.15 hours of simulation, all chunks are widely distributed in the system except for one single chunk. In other words, every client in the system holds 199 chunks, and they are all waiting for one rarest chunk to be scheduled from the server.

[Figure 7: Chunks distribution over time for Most Missing Random. Parameters: $N = 10^4$, $P_{in} = 1$, $S_{up} = C_{up} = C_{down} = 128$. Axes: time (in hours), index of the chunk, number of copies.]

Let us denote this rarest chunk by $C_r$. At time $t = 4160$ seconds, the server schedules chunk $C_r$ to client 1. After 16 seconds, client 1 has completely received chunk $C_r$. Given that $Life = 0$, client 1 thereby completes its set of chunks and disconnects immediately. The server then delivers chunk $C_r$ to a second client, client 2, and so on. As a result, one client completes every 16 seconds. Whereas rarest-first selection serves $10^4$ clients in 3472 seconds, Most Missing Random serves no more than 3733 clients during 8 hours.

The following simplified scenario (figure 8) helps to understand why random selection of chunks can block the clients in the system. Consider $N = 7$ clients that want to download a file comprising only two chunks, $C_1$ and $C_2$. Time is counted below in chunk-transfer times. At time $t = 0$, the server starts serving chunk $C_1$ to client 1. At $t = 1$, the chunk is completely delivered and the server starts serving a new client, client 2. Given that chunk selection is done at random, it is possible that the server again schedules chunk $C_1$ to client 2 and not a new chunk. Meanwhile, client 1 uploads its chunk $C_1$ to a new client, client 3. At $t = 2$, the system includes three clients that hold chunk $C_1$ and four clients with no chunks at all. The server then starts serving a new chunk, $C_2$, to a new client, client 4. Similarly, clients 1, 2, and 3 upload their chunks to three new clients, 5, 6, and 7. As a result, we end up with six clients that hold chunk $C_1$ and one single client that holds chunk $C_2$. At $t = 3$, clients continue to exchange their chunks. For the sake of simplicity, assume that clients 1 and 4 exchange their chunks, $C_1$ and $C_2$ respectively.
At the same time, the server serves chunk $C_2$ to a client, say client 2. The remaining clients, 3, 5, 6, and 7, cannot cooperate because they all hold the same chunk $C_1$, and thus remain idle. By time $t = 4$, clients 1, 2, and 4 complete their set of chunks and disconnect, since $Life = 0$. As a result, at time $t = 4$ the system comprises four clients (3, 5, 6, and 7) that are all waiting for the same chunk $C_2$, and the server delivers that chunk to one client per chunk-transfer time (the time needed to send one chunk at full rate).

[Figure 8: The distribution of the chunks over the different clients during the first two chunk-transfer times in Most Missing Random. We assume that the number of clients is $N = 7$ and that the file comprises two chunks, $C_1$ and $C_2$.]

Conclusions

The chunk selection strategy is a main factor that shapes the performance of mesh-based approaches. In this section, to save space, we evaluated only the Most Missing policy under the assumption that peers schedule chunks at random rather than rarest first. Under particular scenarios, the degradation in system performance can be dramatic. Fortunately, the poor performance that we saw for Most Missing Random is not only due to the chunk selection strategy. It also comes from the fact that we set $P_{in} = P_{out} = 1$ and $Life = 0$. When we increase the network degree, we lessen the likelihood that one single chunk becomes a bottleneck. Also, in practice, not all clients would be selfish, and thus the performance of the system would not degrade that much.

6.3 Influence of Network Degree

Choosing the indegree and outdegree of peers is not an easy task. A small network degree can suit some architectures while being bad for others. In this section we show through simulations how differently mesh-based approaches can behave as the network degree changes. We increase $P_{in}$ and $P_{out}$ from 1 to 5 while the other parameter values remain the same (see table 2).
Table 2: Parameter values

$C_{up}$    $C_{down}$    $S_{up}$    $P_{in}$    $P_{out}$    $Life$
128 Kbps    128 Kbps      128 Kbps

Most Missing

We first analyze the Most Missing policy, with the number of completed clients graphed in figure 9. The results show again that clients download the file quickly and finish at almost the same time. What is interesting here is that having multiple download/upload connections does not benefit the Most Missing policy. To better understand this result, we consider in figure 10 the evolution of one single chunk in Most Missing. We compare the time needed to distribute this chunk to 15 clients for two values of the network degree, $P_{in} = P_{out} = 1$ and $P_{in} = P_{out} = 3$. When $P_{in} = P_{out} = 1$, the chunk is delivered at the full rate $r$; when $P_{in} = P_{out} = 3$, it is delivered at rate $r/3$. As figure 10 shows, when $P_{in} = P_{out} = 1$, the chunk is distributed to 15 clients within 4 chunk-transfer times, a chunk-transfer time being the time needed to send one chunk at rate $r$. In contrast, when $P_{in} = P_{out} = 3$, we need 6 chunk-transfer times.
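The comparison of figure 10 can be reproduced with a small calculation. The sketch below is an illustration under simplifying assumptions, not the simulator used in the paper: every holder of the chunk (server included) uploads it to `degree` new clients in parallel at rate $r$ divided by `degree`, enough clients are always waiting, and time is counted in chunk-transfer times (the time needed to send one chunk at rate $r$). The function name `time_to_spread` is hypothetical.

```python
def time_to_spread(target_clients, degree):
    """Chunk-transfer times needed until `target_clients` clients hold the chunk.

    Every holder (server included) uploads the chunk to `degree` peers in
    parallel at rate r/degree, so one round lasts `degree` chunk times.
    """
    holders = 1   # initially only the server holds the chunk
    elapsed = 0
    while holders - 1 < target_clients:
        holders += holders * degree  # each holder feeds `degree` new clients
        elapsed += degree            # rate r/degree stretches the transfer
    return elapsed


print(time_to_spread(15, 1))  # → 4
print(time_to_spread(15, 3))  # → 6
```

With degree 1, the number of holders doubles every chunk-transfer time, whereas with degree 3 it quadruples only every three chunk-transfer times, which is why the larger degree is slower in this model.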

[Figure 9: Impact of network degree on the system performance. Curves: Least Missing and Most Missing at each network degree. Parameters: $N = 10^4$, $S_{up} = C_{up} = C_{down} = 128$, $Life = 0$. Axes: time (in hours) versus number of served clients.]

[Figure 10: The evolution of the number of copies of one single chunk in Most Missing when $S_{up} = C_{up} = C_{down} = r$. Panel (a): Most Missing, $P_{in} = P_{out} = 1$, server and clients upload at rate $r$. Panel (b): Most Missing, $P_{in} = P_{out} = 3$, server and clients upload at rate $r/3$.]

Least Missing

In contrast to what we saw for Most Missing, increasing the network degree significantly improves the performance of Least Missing. As figure 9 shows, for $P_{in} = P_{out} = 5$, the time needed to serve $10^4$ clients is halved compared to the scenario where $P_{in} = P_{out} = 1$. This improvement in the performance of Least Missing is expected and in accordance with the analysis that we presented in section 5.1. There we showed that when $P_{in} = P_{out} = 1$, Least Missing performs poorly because it serves clients sequentially, i.e., similarly to the Linear architecture. In contrast, when the outdegree of peers is larger than 1, clients are served in parallel, i.e., as in $Tree^k$, and Least Missing becomes more efficient.

Conclusions

We investigated the impact of the network degree on our two policies. As $P_{in}$ and $P_{out}$ go from 1 to 5, the performance of Most Missing slightly degrades while the performance of Least Missing doubles. This result is very interesting, as it shows how the different factors interact with each other and sometimes lead to counter-intuitive behaviors. Regardless of what we have seen in this section, we argue that parallel download and upload of chunks can offer many advantages in real environments. Mainly, it allows clients to fully use their upload and download capacities, which makes the system more robust against bandwidth fluctuations in the network and client departures.
In addition, parallel connections ensure good connectivity between peers in the system. Still, an interesting open question is how to derive the optimal network degree as a function of the cooperation strategy that is employed.

7. DISCUSSION

In this paper, we addressed the content distribution service in P2P networks. The performance of a P2P distribution architecture is shaped by two design choices. The first one deals with the way clients organize in the network, i.e., in a tree or in a mesh. In our study, we showed that mesh-based approaches are not only simpler to construct and maintain than tree-based approaches, but also take less time to serve the same number of clients. The second design choice is related to the cooperation strategy between peers. A cooperation strategy is the result of three factors coupled together: the peer selection strategy, the chunk selection strategy, and the network degree. We investigated through simulations the influence of these factors on the performance of the system. Even though our analysis is basic, it allowed us to draw many interesting conclusions. We believe that our results and discussions provide helpful guidelines for the design of future distribution architectures.

Our work can be seen as a first step towards a more complete study of P2P distribution architectures. Future work can proceed along many avenues. For instance, we need to evaluate a larger set of peer selection strategies. One could think of selecting the most cooperative peer or the peer that has the largest upload capacity. Also, the choice of the network degree is still difficult. One interesting contribution would be to derive a relation between the network degree and the peer selection strategy.

8. ACKNOWLEDGMENTS

The authors would like to thank Dr. Arnauld Legout and Dr. Chadi Barakat for their helpful comments and discussions.

ACM SIGCOMM Computer Communication Review, Volume 35, Number 5, October 2005


Peer-to-Peer Systems. Network Science: Introduction. P2P History: P2P History: 1999 today Network Science: Peer-to-Peer Systems Ozalp Babaoglu Dipartimento di Informatica Scienza e Ingegneria Università di Bologna www.cs.unibo.it/babaoglu/ Introduction Peer-to-peer (PP) systems have become

More information

Network-Adaptive Video Coding and Transmission

Network-Adaptive Video Coding and Transmission Header for SPIE use Network-Adaptive Video Coding and Transmission Kay Sripanidkulchai and Tsuhan Chen Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213

More information

Incompatibility Dimensions and Integration of Atomic Commit Protocols

Incompatibility Dimensions and Integration of Atomic Commit Protocols The International Arab Journal of Information Technology, Vol. 5, No. 4, October 2008 381 Incompatibility Dimensions and Integration of Atomic Commit Protocols Yousef Al-Houmaily Department of Computer

More information

Delay Bounds of Peer-to-Peer Video Streaming

Delay Bounds of Peer-to-Peer Video Streaming Delay Bounds of Peer-to-Peer Video Streaming Yong Liu Electrical & Computer Engineering Department Polytechnic Institute of NYU Brooklyn, NY, email: yongliu@poly.edu June, 9 Abstract Peer-to-Peer (PP)

More information

Introduction to Peer-to-Peer Systems

Introduction to Peer-to-Peer Systems Introduction Introduction to Peer-to-Peer Systems Peer-to-peer (PP) systems have become extremely popular and contribute to vast amounts of Internet traffic PP basic definition: A PP system is a distributed

More information

MVAPICH2 vs. OpenMPI for a Clustering Algorithm

MVAPICH2 vs. OpenMPI for a Clustering Algorithm MVAPICH2 vs. OpenMPI for a Clustering Algorithm Robin V. Blasberg and Matthias K. Gobbert Naval Research Laboratory, Washington, D.C. Department of Mathematics and Statistics, University of Maryland, Baltimore

More information

System Models. 2.1 Introduction 2.2 Architectural Models 2.3 Fundamental Models. Nicola Dragoni Embedded Systems Engineering DTU Informatics

System Models. 2.1 Introduction 2.2 Architectural Models 2.3 Fundamental Models. Nicola Dragoni Embedded Systems Engineering DTU Informatics System Models Nicola Dragoni Embedded Systems Engineering DTU Informatics 2.1 Introduction 2.2 Architectural Models 2.3 Fundamental Models Architectural vs Fundamental Models Systems that are intended

More information

arxiv:cs.ni/ v1 21 Nov 2006

arxiv:cs.ni/ v1 21 Nov 2006 Clustering and Sharing Incentives in BitTorrent Systems Arnaud Legout Nikitas Liogkas Eddie Kohler Lixia Zhang I.N.R.I.A. University of California, Los Angeles Sophia Antipolis, France Los Angeles, CA,

More information

Application Layer Multicast For Efficient Peer-to-Peer Applications

Application Layer Multicast For Efficient Peer-to-Peer Applications Application Layer Multicast For Efficient Peer-to-Peer Applications Adam Wierzbicki 1 e-mail: adamw@icm.edu.pl Robert Szczepaniak 1 Marcin Buszka 1 1 Polish-Japanese Institute of Information Technology

More information

Whitepaper Italy SEO Ranking Factors 2012

Whitepaper Italy SEO Ranking Factors 2012 Whitepaper Italy SEO Ranking Factors 2012 Authors: Marcus Tober, Sebastian Weber Searchmetrics GmbH Greifswalder Straße 212 10405 Berlin Phone: +49-30-3229535-0 Fax: +49-30-3229535-99 E-Mail: info@searchmetrics.com

More information

Video Streaming Over the Internet

Video Streaming Over the Internet Video Streaming Over the Internet 1. Research Team Project Leader: Graduate Students: Prof. Leana Golubchik, Computer Science Department Bassem Abdouni, Adam W.-J. Lee 2. Statement of Project Goals Quality

More information

Dynamic Search Algorithm in P2P Networks

Dynamic Search Algorithm in P2P Networks Dynamic Search Algorithm in P2P Networks Prabhudev.S.Irabashetti M.tech Student,UBDTCE, Davangere, Abstract- Designing efficient search algorithms is a key challenge in unstructured peer-to-peer networks.

More information

Architecting the High Performance Storage Network

Architecting the High Performance Storage Network Architecting the High Performance Storage Network Jim Metzler Ashton, Metzler & Associates Table of Contents 1.0 Executive Summary...3 3.0 SAN Architectural Principals...5 4.0 The Current Best Practices

More information

Understanding BitTorrent: An Experimental Perspective

Understanding BitTorrent: An Experimental Perspective Understanding BitTorrent: An Experimental Perspective Arnaud Legout, Guillaume Urvoy-Keller, Pietro Michiardi To cite this version: Arnaud Legout, Guillaume Urvoy-Keller, Pietro Michiardi. Understanding

More information

A Bandwidth-Aware Scheduling Strategy for P2P-TV Systems

A Bandwidth-Aware Scheduling Strategy for P2P-TV Systems A Bandwidth-Aware Scheduling Strategy for P2P-TV Systems Abstract P2P-TV systems distribute live streaming contents by organizing the information flow in small chunks that are exchanged among peers. Different

More information

Introduction to Big-Data

Introduction to Big-Data Introduction to Big-Data Ms.N.D.Sonwane 1, Mr.S.P.Taley 2 1 Assistant Professor, Computer Science & Engineering, DBACER, Maharashtra, India 2 Assistant Professor, Information Technology, DBACER, Maharashtra,

More information

Multiprocessing and Scalability. A.R. Hurson Computer Science and Engineering The Pennsylvania State University

Multiprocessing and Scalability. A.R. Hurson Computer Science and Engineering The Pennsylvania State University A.R. Hurson Computer Science and Engineering The Pennsylvania State University 1 Large-scale multiprocessor systems have long held the promise of substantially higher performance than traditional uniprocessor

More information

6.001 Notes: Section 4.1

6.001 Notes: Section 4.1 6.001 Notes: Section 4.1 Slide 4.1.1 In this lecture, we are going to take a careful look at the kinds of procedures we can build. We will first go back to look very carefully at the substitution model,

More information

Computer Experiments. Designs

Computer Experiments. Designs Computer Experiments Designs Differences between physical and computer Recall experiments 1. The code is deterministic. There is no random error (measurement error). As a result, no replication is needed.

More information

Performing MapReduce on Data Centers with Hierarchical Structures

Performing MapReduce on Data Centers with Hierarchical Structures INT J COMPUT COMMUN, ISSN 1841-9836 Vol.7 (212), No. 3 (September), pp. 432-449 Performing MapReduce on Data Centers with Hierarchical Structures Z. Ding, D. Guo, X. Chen, X. Luo Zeliu Ding, Deke Guo,

More information

Parallel Performance Studies for a Clustering Algorithm

Parallel Performance Studies for a Clustering Algorithm Parallel Performance Studies for a Clustering Algorithm Robin V. Blasberg and Matthias K. Gobbert Naval Research Laboratory, Washington, D.C. Department of Mathematics and Statistics, University of Maryland,

More information

Exploring the Optimal Replication Strategy in P2P-VoD Systems: Characterization and Evaluation

Exploring the Optimal Replication Strategy in P2P-VoD Systems: Characterization and Evaluation 1 Exploring the Optimal Replication Strategy in P2P-VoD Systems: Characterization and Evaluation Weijie Wu, Student Member, IEEE, and John C.S. Lui, Fellow, IEEE Abstract P2P-Video-on-Demand (P2P-VoD)

More information

Understanding BitTorrent: An Experimental Perspective

Understanding BitTorrent: An Experimental Perspective Understanding BitTorrent: An Experimental Perspective Arnaud Legout, Guillaume Urvoy-Keller, Pietro Michiardi To cite this version: Arnaud Legout, Guillaume Urvoy-Keller, Pietro Michiardi. Understanding

More information

Multi-path based Algorithms for Data Transfer in the Grid Environment

Multi-path based Algorithms for Data Transfer in the Grid Environment New Generation Computing, 28(2010)129-136 Ohmsha, Ltd. and Springer Multi-path based Algorithms for Data Transfer in the Grid Environment Muzhou XIONG 1,2, Dan CHEN 2,3, Hai JIN 1 and Song WU 1 1 School

More information

Computer Networks. Pushing BitTorrent locality to the limit. Stevens Le Blond, Arnaud Legout, Walid Dabbous. abstract

Computer Networks. Pushing BitTorrent locality to the limit. Stevens Le Blond, Arnaud Legout, Walid Dabbous. abstract Computer Networks 55 (20) 54 557 Contents lists available at ScienceDirect Computer Networks journal homepage: www.elsevier.com/locate/comnet Pushing locality to the limit Stevens Le Blond, Arnaud Legout,

More information

ARELAY network consists of a pair of source and destination

ARELAY network consists of a pair of source and destination 158 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 55, NO 1, JANUARY 2009 Parity Forwarding for Multiple-Relay Networks Peyman Razaghi, Student Member, IEEE, Wei Yu, Senior Member, IEEE Abstract This paper

More information

improving the performance and robustness of P2P live streaming with Contracts

improving the performance and robustness of P2P live streaming with Contracts MICHAEL PIATEK AND ARVIND KRISHNAMURTHY improving the performance and robustness of P2P live streaming with Contracts Michael Piatek is a graduate student at the University of Washington. After spending

More information

6.033 Spring 2015 Lecture #11: Transport Layer Congestion Control Hari Balakrishnan Scribed by Qian Long

6.033 Spring 2015 Lecture #11: Transport Layer Congestion Control Hari Balakrishnan Scribed by Qian Long 6.033 Spring 2015 Lecture #11: Transport Layer Congestion Control Hari Balakrishnan Scribed by Qian Long Please read Chapter 19 of the 6.02 book for background, especially on acknowledgments (ACKs), timers,

More information

McGill University - Faculty of Engineering Department of Electrical and Computer Engineering

McGill University - Faculty of Engineering Department of Electrical and Computer Engineering McGill University - Faculty of Engineering Department of Electrical and Computer Engineering ECSE 494 Telecommunication Networks Lab Prof. M. Coates Winter 2003 Experiment 5: LAN Operation, Multiple Access

More information

Efficient Resource Management for the P2P Web Caching

Efficient Resource Management for the P2P Web Caching Efficient Resource Management for the P2P Web Caching Kyungbaek Kim and Daeyeon Park Department of Electrical Engineering & Computer Science, Division of Electrical Engineering, Korea Advanced Institute

More information

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE RAID SEMINAR REPORT 2004 Submitted on: Submitted by: 24/09/2004 Asha.P.M NO: 612 S7 ECE CONTENTS 1. Introduction 1 2. The array and RAID controller concept 2 2.1. Mirroring 3 2.2. Parity 5 2.3. Error correcting

More information

On Feasibility of P2P Traffic Control through Network Performance Manipulation

On Feasibility of P2P Traffic Control through Network Performance Manipulation THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS TECHNICAL REPORT OF IEICE On Feasibility of P2P Traffic Control through Network Performance Manipulation HyunYong Lee Masahiro Yoshida

More information

DISTRIBUTED COMPUTER SYSTEMS ARCHITECTURES

DISTRIBUTED COMPUTER SYSTEMS ARCHITECTURES DISTRIBUTED COMPUTER SYSTEMS ARCHITECTURES Dr. Jack Lange Computer Science Department University of Pittsburgh Fall 2015 Outline System Architectural Design Issues Centralized Architectures Application

More information

High Performance Computing Prof. Matthew Jacob Department of Computer Science and Automation Indian Institute of Science, Bangalore

High Performance Computing Prof. Matthew Jacob Department of Computer Science and Automation Indian Institute of Science, Bangalore High Performance Computing Prof. Matthew Jacob Department of Computer Science and Automation Indian Institute of Science, Bangalore Module No # 09 Lecture No # 40 This is lecture forty of the course on

More information

FUTURE communication networks are expected to support

FUTURE communication networks are expected to support 1146 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL 13, NO 5, OCTOBER 2005 A Scalable Approach to the Partition of QoS Requirements in Unicast and Multicast Ariel Orda, Senior Member, IEEE, and Alexander Sprintson,

More information

A Survey on the Performance of Parallel Downloading. J. L. Chiang April 7, 2005

A Survey on the Performance of Parallel Downloading. J. L. Chiang April 7, 2005 A Survey on the Performance of Parallel Downloading J. L. Chiang April 7, 2005 Outline Parallel download schemes Static equal Static unequal Dynamic Performance comparison and issues adpd scheme Large-scale

More information

Peer to Peer Systems and Probabilistic Protocols

Peer to Peer Systems and Probabilistic Protocols Distributed Systems 600.437 Peer to Peer Systems & Probabilistic Protocols Department of Computer Science The Johns Hopkins University 1 Peer to Peer Systems and Probabilistic Protocols Lecture 11 Good

More information

Optimizing the use of the Hard Disk in MapReduce Frameworks for Multi-core Architectures*

Optimizing the use of the Hard Disk in MapReduce Frameworks for Multi-core Architectures* Optimizing the use of the Hard Disk in MapReduce Frameworks for Multi-core Architectures* Tharso Ferreira 1, Antonio Espinosa 1, Juan Carlos Moure 2 and Porfidio Hernández 2 Computer Architecture and Operating

More information

Scalable overlay Networks

Scalable overlay Networks overlay Networks Dr. Samu Varjonen 1 Lectures MO 15.01. C122 Introduction. Exercises. Motivation. TH 18.01. DK117 Unstructured networks I MO 22.01. C122 Unstructured networks II TH 25.01. DK117 Bittorrent

More information

Combining In-Transit Buffers with Optimized Routing Schemes to Boost the Performance of Networks with Source Routing?

Combining In-Transit Buffers with Optimized Routing Schemes to Boost the Performance of Networks with Source Routing? Combining In-Transit Buffers with Optimized Routing Schemes to Boost the Performance of Networks with Source Routing? J. Flich 1,P.López 1, M. P. Malumbres 1, J. Duato 1, and T. Rokicki 2 1 Dpto. Informática

More information

Building a low-latency, proximity-aware DHT-based P2P network

Building a low-latency, proximity-aware DHT-based P2P network Building a low-latency, proximity-aware DHT-based P2P network Ngoc Ben DANG, Son Tung VU, Hoai Son NGUYEN Department of Computer network College of Technology, Vietnam National University, Hanoi 144 Xuan

More information