Distributed Two-way Trees for File Replication on Demand

Ramprasad Tamilselvan
Department of Computer Science
Golisano College of Computing and Information Sciences
Rochester, NY 14586
rt7516@rit.edu

Abstract

Edge data centers significantly reduce the access time of content. They pull content from the origin servers, store it locally, and serve it to clients. The storage system in an edge data center should be flexible and dynamic enough to handle popular files and sudden peaks in traffic for particular files. In this paper, we propose an algorithm called the two-way tree that replicates files efficiently based on demand. The experimental results show that the two-way tree algorithm relieves hot spots in the storage cluster and performs significantly better than the existing peer to peer storage system during peaks in traffic.

I. INTRODUCTION

Content caching reduces the access time of web content. Global edge networks store content close to the users, which shortens access time and reduces the burden on the origin servers. Because edge networks store heterogeneous content, the popularity of files changes dynamically, and these changes create hot spots in the storage cluster. Replicating files dynamically in the storage cluster relieves hot spots and reduces the access time of popular files.

Data traffic in edge networks keeps increasing, so edge networks use mini data centers to store the cache. The architecture of such a data center comprises proxy servers and a storage cluster. The storage cluster uses a peer to peer distributed system to reduce the access time of content. In the existing system, the number of replicas remains constant: popular and non-popular files have the same number of copies in the storage cluster.
When the storage cluster receives more requests for popular files, all the systems in the cluster try to access the servers that store those files. This overloads those servers and increases the access time of files.

The random tree algorithm [1] uses consistent hashing to relieve hot spots in web servers. It creates a virtual tree for each file and stores the file in the root node of the tree. The leaf nodes of the tree receive the initial requests. A node that has the file serves the client; a node that doesn't have the file forwards the request to its parent, and this repeats until the request reaches a node with the file. The main problem with the random tree is that requests for both popular and non-popular files follow the same path and use the same tree. Though it reduces the access time of popular files, it increases the access time of non-popular files.

The two-way tree algorithm also uses consistent hashing and random trees, but it decouples the tree into a fat tree and a slim tree. The initial requests use the fat tree to reach the content quickly. Once the file becomes popular, requests for the file use the slim tree, and the file is replicated to fewer nodes based on demand. This makes the storage cluster more flexible: it relieves hot spots in the servers and reduces the access time of both popular and non-popular files.

We developed a simulation that generates requests based on the web traces available from UC Berkeley. The simulation requests content from a system that uses peer to peer distributed storage and from a system that uses the two-way tree algorithm. The results are used to evaluate the performance of both systems.

II. RELATED WORK

In the early stages, most systems used file replication to back up data. As Internet traffic grew, replication came to be used in peer to peer distributed systems to prevent overloading the servers.
In existing peer to peer distributed systems such as Cassandra, the replication factor can be configured to be greater than 1, so that the system backs up data and at the same time distributes load among the servers. In these systems, however, the replication factor is configured statically; it does not change dynamically based on demand.

Li et al. [2] proposed Tachyon to replicate data based on demand. Tachyon was later renamed Alluxio and developed as a standalone system. Alluxio acts as an intermediate cache layer between traditional storage and the computation framework to provide faster access to data. It has three components: a master, workers, and clients. Computation frameworks use Alluxio clients to access the server. Workers store, retrieve, and cache data when required. Data sharing among workers reduces the workload on the storage servers. In this way, popular data are cached in Alluxio workers and served at memory speed.

Blowfish [3] is a distributed data store that achieves a dynamic storage-performance tradeoff. Scarlett [4] replicates popular content efficiently in MapReduce clusters. Blowfish and Scarlett replicate files in the storage layer. So these two
systems are closely coupled with the storage layer. They are not suitable for heterogeneous environments where different software is used in the storage layer.

Karger et al. [1] proposed distributed caching protocols for relieving hot spots in servers. The protocol uses consistent hashing and random trees. A virtual tree is built for each file, and the file is placed in the root node of the tree. The initial requests arrive at the leaf nodes, and each node passes the request on to its parent until it reaches a node that has the file. Each node keeps track of the popularity of files. When a file becomes popular, the node that has the file replicates it to its child nodes.

We propose an algorithm called the two-way tree, which replicates files efficiently based on demand. The algorithm is based on consistent hashing and random trees, but it decouples the lookup tree from the replication tree to improve the time complexity of lookups. It is used in a system that acts as an intermediate layer between the storage and application layers.

III. RANDOM TREES

A. Tree Construction

The random tree algorithm constructs a virtual tree for each file in the storage, as shown in Fig. 1, and places the file in the root node of the tree. A server in the random tree is located using hash_function(filename, level, position). The hash function takes the file name, the level of the node in the tree, and the position of the node in the tree as parameters and returns the server id. A node can compute its parent node for a file with the same hash function: it passes the name of the file, the parent position, and the parent level to the hash function and gets back the parent's server id. The parent position and the parent level can be computed using the degree of the tree. The initial requests are received by the leaf nodes of the random tree. The server ids of the leaf nodes of the tree for the file can be computed using the hash function.
The level and the position of the leaf nodes of the tree can be computed using the degree of the tree.

Fig. 1. Random tree (n = 7 and d = 2). Labels: lookup path, replication path.

B. Limitations

The random tree algorithm relieves hot spots in the storage cluster by replicating files based on demand. When a node in the random tree receives a request for a file, it forwards the request to its parent node, and so on until the request reaches a node that has the file, which serves the request. The overhead of forwarding the request affects the overall performance of the system. To overcome this issue, we can increase the degree of the random tree. However, in a fat tree each node has many children to replicate to, so replication in the storage cluster is no longer under control.

IV. TWO-WAY TREES

A. Tree Construction

In the two-way tree algorithm, two virtual trees are constructed for each file: one for the lookup path and one for the replication path, as shown in Fig. 2 and Fig. 3 respectively. The degree of the lookup tree is larger than the degree of the replication tree. The fat lookup tree shortens the lookup path significantly. The slim replication tree controls the number of replicas in the system. The server ids of the nodes are retrieved using hash_func(file_name, level, position, degree). The hash function takes the file name, the level of the node in the tree, and the position of the node in the tree as parameters. The degree parameter denotes the degree of the lookup tree or of the replication tree, depending on the type of request.

Fig. 2. Two-way tree (lookup) (n = 7 and D = 6)

Fig. 3. Two-way tree (replication) (n = 7 and d = 2)

B. Handling Read Requests

In the two-way tree algorithm, a server forwards requests to its parent node in both the lookup tree and the replication tree. The mode of the request is UP in the lookup tree and LEFT in the replication tree.
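The parent computation that a server performs when forwarding a request in either tree can be sketched in Python. This is a minimal illustration under assumptions, not the paper's implementation: the SHA-256-based body of hash_func, its extra n_servers argument, the helper names parent and path_to_root, and the 0-based numbering of levels and positions are hypothetical choices, since the paper does not specify them.

```python
import hashlib


def hash_func(file_name, level, position, degree, n_servers):
    """Map a tree node to a server id in [0, n_servers).

    Assumption: SHA-256 over the joined parameters; the paper does not
    say which hash is used. Distinct nodes may collide on one server.
    """
    key = f"{file_name}|{level}|{position}|{degree}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


def parent(level, position, degree):
    """Return (level, position) of the parent node, or None at the root.

    Assumes a complete tree with 0-based levels (root = level 0) and
    0-based positions within each level.
    """
    if level == 0:
        return None
    return (level - 1, position // degree)


def path_to_root(level, position, degree):
    """List the (level, position) pairs visited from a node up to the root."""
    path = [(level, position)]
    node = parent(level, position, degree)
    while node is not None:
        path.append(node)
        node = parent(node[0], node[1], degree)
    return path


if __name__ == "__main__":
    n = 7
    # Replication tree, d = 2: the leaves sit at level 2.
    slim = path_to_root(2, 3, 2)
    # Lookup tree, D = 6: the leaves sit at level 1.
    fat = path_to_root(1, 5, 6)
    # Number of forwarding hops from a leaf to the root in each tree.
    print(len(slim) - 1, len(fat) - 1)
    # Server ids along the slim path for a hypothetical file name.
    print([hash_func("index.html", lvl, pos, 2, n) for lvl, pos in slim])
```

With n = 7, a leaf of the degree-2 tree is two hops from the root, while a leaf of the degree-6 lookup tree is one hop away, which matches the shapes in Fig. 1 to Fig. 3.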
Algorithm 1 describes a method for handling read requests in the lookup tree. The server keeps track of the number of requests it has forwarded to its parent node in the lookup tree and in the replication tree. If the file is not available on the server, the server forwards the request to its parent node in the lookup tree. Once the request count for the file reaches threshold1, the server forwards requests for the file to its parent node in the replication tree instead. Once the count reaches threshold2, the server sends a write request to its parent node in the replication tree. Once the file is available on the server, it directly serves
the client. If the server receives a large number of requests for the file (threshold3 in Algorithm 1), it shares the load by forwarding some requests to its parent node in the replication tree.

In the pseudocode, file.color records the state of the file on the server: white while the file is absent, gray once a copy has been requested, and black when the file is stored.

Algorithm 1: Handling read requests in the server in the lookup tree

procedure READ(UP, file, sender id)
    countup ← countup + 1
    if file.color = white then
        if countup < threshold1 then
            forward read(up, file, this.id) to up parent
        else if countup < threshold2 then
            forward read(left, file, this.id) to left parent
        else
            file.color ← gray
            forward write(left, file, this.id) to left parent
            enqueue(file)
    else
        if countup < threshold3 OR leaf OR left parent(sender id) then
            if file.color = black then
                serve client
            else if file.color = gray then
                enqueue(file)
        else
            forward read(left, file, sender id) to left parent(sender id)

Algorithm 2 describes a method for handling read requests in the replication tree (mode LEFT). The server keeps track of the number of requests it has forwarded to its parent node in the replication tree. While the popularity of the file on the server is low, the server forwards read requests to its parent node in the replication tree. When the file becomes more popular on the server, the server forwards a write request to its parent node in the replication tree. When the file is available on the server, it serves the client.

Algorithm 2: Handling read requests in the server in the replication tree

procedure READ(LEFT, file, sender id)
    countleft ← countleft + 1
    if file.color = white then
        if countleft < threshold1 then
            forward read(left, file, this.id) to left parent
        else
            file.color ← gray
            forward write(left, file, this.id) to left parent
    else if file.color = gray then
        enqueue(file)
    else
        serve client

C. Handling Write Requests

A write request for a file indicates that the file is popular in the storage cluster.
When a server receives a write request, it forwards the write request to its parent node in the replication tree if the file is not available. Otherwise, it replicates the file to the child node that sent the request. The write request keeps track of all the servers on its path so that all of them receive a copy of the file. Algorithm 3 describes a method for handling write requests in the server. When a server receives a write request with mode RIGHT, it stores the file, replicates it to the next server id in the write request, and forwards the write request to that server with mode RIGHT. Algorithm 4 describes a method for handling write requests with mode RIGHT in the server.

Algorithm 3: Handling write requests in the server in the replication tree

procedure WRITE(LEFT, file, sender id)
    if file.color = black then
        replicate file to sender id
        forward write(right, file, sender id)
    else if file.color = white then
        file.color ← gray
        forward write(left, file, this.id) to left parent

Algorithm 4: Handling replication requests in the server in the replication tree

procedure WRITE(RIGHT, file, sender id)
    store file
    forward write(right, file, this.id) to right child

D. Time Complexity Analysis

In this section, the time complexity of the lookup path length is discussed. The lookup path length in the random tree is O(log_d n), where n is the number of servers in the storage cluster and d is the degree of the random tree. The lookup path length in the two-way tree is O(log_D n), where D is the degree of the lookup tree. Because D is larger than d, the lookup path in the two-way tree is shorter than in the random tree. For example, with n = 7 servers, a random tree with d = 2 has leaves two hops from the root, whereas a lookup tree with D = 6 has leaves one hop from the root. This reduces the overhead of forwarding requests to the server that has the file and improves the performance of the storage system.

V.
SIMULATIONS

We developed simulations of the two-way tree system, the random tree storage system, and the peer to peer distributed storage system to compare the performance of the storage cluster. Each simulation is described in this section.

A. Peer to Peer Distributed Storage System

The storage cluster uses a peer to peer distributed storage system to improve its performance. Cassandra is an example of a peer to peer key-value store. The peer to peer system uses consistent hashing, which assigns a range of hash keys to each server in the storage cluster. It locates a file by computing a hash key from the name of the file, and it stores the file on the server whose range contains that key. Any server in the storage cluster can receive a read request for a file. If the file is available on the server, the server serves the client directly. Otherwise, the server requests the file from the peer server that has the file and then serves the client; it does not store the content obtained from the peer. The replication factor in the peer to peer system is configurable but static. For a replication factor of 2, the system stores two copies of each file in the cluster. We developed a system that simulates a peer to peer system with the functionality described above.
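The peer to peer lookup described in this subsection can be sketched as a toy consistent-hashing ring. This is an illustrative sketch only, not the simulator used in the experiments and not Cassandra's actual partitioner or replication strategy: the class name HashRing, the helper key_hash, the use of SHA-256, and the rule of placing replicas on the next servers along the ring are all assumptions made for the example.

```python
import bisect
import hashlib


def key_hash(text):
    """Hash a string to a 64-bit integer (SHA-256 is an arbitrary choice)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy consistent-hashing ring with a static replication factor."""

    def __init__(self, servers, replication_factor=2):
        # Each server owns the key range that ends at its token on the ring.
        self.ring = sorted((key_hash(s), s) for s in servers)
        self.tokens = [token for token, _ in self.ring]
        # The replication factor is fixed at construction time.
        self.replication_factor = min(replication_factor, len(self.ring))

    def owners(self, file_name):
        """Return the servers that hold copies of file_name."""
        # First token >= the file's hash key, wrapping around the ring.
        index = bisect.bisect_left(self.tokens, key_hash(file_name))
        start = index % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(self.replication_factor)]

    def read(self, receiving_server, file_name):
        """Return the server that supplies the file for a read request."""
        owners = self.owners(file_name)
        # Serve locally when possible; otherwise fetch from a peer owner
        # without keeping a local copy.
        if receiving_server in owners:
            return receiving_server
        return owners[0]


if __name__ == "__main__":
    servers = [f"server-{i}" for i in range(7)]
    ring = HashRing(servers, replication_factor=2)
    print(ring.owners("index.html"))
    print(ring.read("server-0", "index.html"))
```

Because the set of owners never grows with demand, every read for a popular file ends up at the same two servers, which is the hot spot behavior the experiments examine.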
B. Random Tree Storage Simulation

The random tree system constructs a virtual tree for each file in the storage cluster. The leaf nodes of the tree receive the initial request for the file. If the file is available on the server, it serves the request. Otherwise, it forwards the request to its parent node, and so on until the request reaches a node that has the file. If the popularity of the file reaches the configured threshold value on a server, the server replicates the file to its child nodes. We developed a system that simulates the random tree algorithm as described above.

C. Two-way Tree Storage Simulation

The two-way tree system constructs a virtual lookup tree and a virtual replication tree for each file in the storage cluster. The degree of the lookup tree is larger than the degree of the replication tree. The two-way tree algorithm forwards a request to the parent node in the lookup tree and uses the replication tree for file replication. We developed a simulation of the two-way tree algorithm.

VI. SIMULATION EXPERIMENTS

In this section, we discuss the trace-driven approach, the performance metrics used for the evaluation, and the system configurations.

A. Trace-Driven Approach

We used a trace-driven approach to test our simulations, with web traces from real proxy servers as input. For our experiments, we used the web trace data available at http://www.web-caching.com/traceslogs.html and preprocessed it to suit our simulation. This approach tests our system against real request patterns.

B. System Configurations

The experiments were conducted on a Linux machine with 8 GB of memory and a 2 GHz Intel Core i5 processor. The simulations were developed in Java 8.

C. Simulation Configurations

In our experiments, some properties are common to all three simulations. This allows the performance of all simulations to be evaluated in the same environment.
Each server in the storage cluster processes one client request and one peer request per second. Each server completes processing a request in exactly one second. A server forwards a request to its parent node in one second.

VII. RESULTS

In this section, we present the performance results of the random tree algorithm, the two-way tree algorithm, and the peer to peer system. In the base experiment, the number of servers in the storage cluster is 7. The degree of the random tree is 2. In the two-way tree, the degree of the lookup tree is 6 and the degree of the replication tree is 2. In the scaled configuration experiment, the number of servers in the storage cluster is 15. The degree of the random tree is 2. In the two-way tree, the degree of the lookup tree is 14 and the degree of the replication tree is 2.

A. Random Tree Vs Two-way Tree

Fig. 4 shows the maximum queue length of the random tree and the two-way tree measured during the simulation. The maximum queue length of the servers in the storage cluster is measured at intervals of 5000 seconds. The graph shows that the maximum queue length of both the random tree and the two-way tree remains the same and low, which indicates that no hot spots occur in the storage cluster. Fig. 5 shows the number of files served by the random tree and the two-way tree in the simulation. The graph shows that the two-way tree system serves more files than the random tree. The fat lookup tree helps the two-way tree serve more files than the random tree.

Fig. 4. Maximum queue length in the random tree and the two-way tree system.

B. Two-way Tree Vs Peer to Peer System

Fig. 6 shows the maximum queue length in the two-way tree and the peer to peer system. The graph shows that the maximum queue length of the two-way tree remains low throughout the simulation. The maximum queue length of the peer to peer system has a spike at the 60000th second.
This indicates that hot spots occur in the peer to peer system. Fig. 7 shows the number of files served by the two-way tree system and the peer to peer system. The plot shows that the two-way tree system performs better than the peer to peer system around the 60000th second. Together, the two plots show that hot spots occur in the storage cluster of the peer to peer system around the 60000th second. When hot spots occur in the peer to peer system, it serves fewer files than the two-way tree system.
Fig. 5. Number of files served by the random tree and the two-way tree system.

Fig. 6. Maximum queue length in the peer to peer system and the two-way tree system.

Fig. 7. Number of files served by the peer to peer system and the two-way tree system.

Fig. 8. Maximum queue length in the peer to peer system and the two-way tree system.

C. Scaled Configuration: Two-way Tree Vs Peer to Peer System

This experiment evaluates the performance of the two-way tree and the peer to peer system with a larger number of servers in the storage cluster. The maximum queue lengths of the peer to peer system and the two-way tree system are shown in Fig. 8. The plot shows that no hot spots occur in the peer to peer system when the number of servers in the storage cluster is increased. The maximum queue length of the two-way tree remains the same as that of the peer to peer system. The number of files served by the peer to peer system and the two-way tree system is plotted in Fig. 9. The plot shows that the peer to peer system and the two-way tree system perform the same when there are more servers in the cluster.

VIII. CONCLUSION

In this paper, we designed an algorithm called the two-way tree to replicate files dynamically based on demand in the storage cluster. Like the random tree algorithm, the two-way tree algorithm relieves hot spots in the storage cluster, but it performs better than the random tree. Decoupling the lookup path from the replication path significantly reduces the time complexity of the lookup path in the two-way tree compared to the random tree. The experiments use data collected from proxy servers in a real network. The results show that the two-way tree relieves hot spots in the storage cluster and performs better than the peer to peer system during peaks in traffic. The storage cluster in edge data centers can use the two-way tree algorithm to replicate files dynamically based on demand.
Fig. 9. Number of files served by the peer to peer system and the two-way tree system.

REFERENCES

[1] D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levine, and D. Lewin, "Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the world wide web," in Proceedings of the twenty-ninth annual ACM symposium on Theory of computing. ACM, 1997, pp. 654-663.
[2] H. Li, A. Ghodsi, M. Zaharia, E. Baldeschwieler, S. Shenker, and I. Stoica, "Tachyon: Memory throughput i/o for cluster computing frameworks," memory, vol. 18, p. 1, 2013.
[3] A. Khandelwal, R. Agarwal, and I. Stoica, "Blowfish: Dynamic storage-performance tradeoff in data stores," in NSDI, 2016, pp. 485-500.
[4] G. Ananthanarayanan, S. Agarwal, S. Kandula, A. Greenberg, I. Stoica, D. Harlan, and E. Harris, "Scarlett: coping with skewed content popularity in mapreduce clusters," in Proceedings of the sixth conference on Computer systems. ACM, 2011, pp. 287-300.