Storage vs Repair Bandwidth for Network Erasure Coding in Distributed Storage Systems


Swati Mittal Singal (1); Nitin Rakesh (2), MIEEE, MACM, MSIAM, LMCSI, MIAENG; Rakesh Matam (3), MIEEE, MACM
(1, 2) Department of Computer Science and Engineering, ASET, Amity University Uttar Pradesh, Noida, India
(3) Department of Computer Science and Engineering, Indian Institute of Information Technology (IIIT), Guwahati, India
(1) ssingal@amity.edu, (2) {nrakesh@amity.edu, nitin.rakesh@gmail.com, nitin.rakesh@ieee.org}, (3) rakesh@iiitg.ac.in

Abstract
Network coding is used in peer-to-peer storage systems, archival storage, wireless networks, satellite communication, video conferencing and similar applications. A storage system stores data at different locations. For the data to be available, durable and reliable, the system must be able to recover from failures efficiently. This paper examines and evaluates different approaches applied to storage systems. Keeping replicas of the data at multiple places is the traditional technique used by major storage systems. To reduce the amount of storage required by replication, distributed systems are now transitioning towards erasure codes. Several approaches, such as hybrid and regenerating codes, address both storage and repair bandwidth, but the communication cost in the face of failures still needs improvement. These approaches and their main application areas are examined and analyzed in this paper. A comparative analysis based on storage requirement, disk access, repair bandwidth and unavailability probability is also presented.

Keywords: Distributed Storage System, Erasure Codes, Regenerating Codes

I. INTRODUCTION
Network error correction, or recovering from failures, has been of interest for decades. The traditional approach of error-correction codes is a link-by-link error correction technique [1]. It controls errors by adding redundancy in the time domain: when a failure occurs during transmission, the sender retransmits the packets. The process is thus time consuming.

Hence, the concept of network error correction codes was introduced as a generalization of the classical error-correction codes. Network error correction (NEC) codes control errors by introducing redundancy in the space domain [2, 3]. A data object of size $M$ is broken into $k$ fragments and $r$ redundant units are added in such a manner that any $k$ of the $k + r$ units are enough to recover from failure. This makes the system resilient to $r$ unit failures. Network error correction codes are applicable in a variety of fields, including network communication, satellite communication, video conferencing, peer-to-peer file sharing systems, distributed storage, wireless sensor networks, data grids, archival storage [5-7] and many more. With the emerging trends of cloud and big data, digital data is growing very fast and is expected to grow tenfold in seven years. This demands efficient storage and recovery in data centers. Hence large-scale Distributed Storage Systems (DSS) are currently transitioning to erasure codes.

This paper describes various approaches that are used in distributed storage systems. It compares these approaches based on their application areas and on parameters such as storage required, disk access, repair bandwidth and unavailability probability. It also analyzes these approaches and the problems with distributed storage systems. Distributed storage systems have traditionally used replication to make data more available, reliable and durable; in this paper we examine various replication methods and compare them based on their usability in storage applications.

The paper is organized as follows. Section I introduces various distributed storage approaches and their complications in terms of performance.
Section II describes the different models used for distributed storage systems. Section III compares these approaches based on storage overhead, disk access for repair/regeneration and reconstruction, cost of communication and unavailability probability. Section IV presents a quantitative analysis of these approaches. Section V concludes the findings of the comparison and analysis done in the paper.

II. MODELS
In a distributed storage system, the data is divided into fragments and distributed over several nodes. This data must be communicated to another node in the network in two situations: first, when a user requests particular data, the nodes that hold it participate in the communication; second, when a node fails, a new node takes its place and recovers the data by communicating with the surviving nodes. Depending on the application environment, different approaches are used to provide efficient storage, low repair bandwidth and high reliability. The approaches used in distributed storage systems are described below.

A. Replication
One method to recover from failure is to maintain a full replica of the data object that can be used at the time of recovery [8]. Replication is used by Amazon Dynamo, Google File System, Facebook's Cassandra and other file storage systems to provide better data availability and durability. Several copies of the same data are kept at different nodes. These copies can serve multiple clients concurrently. If any node fails, its data can be recovered from a replica maintained at another node. This redundant data may lead to inconsistencies and also involves a large storage overhead. The main question is how many replicas are enough so that no data is lost. It has been shown experimentally that, on the PlanetLab trace, data is lost only when there are fewer than three replicas. This is because, if the rate of creating a new replica is only slightly greater than the average failure rate, the data may be lost before recovery completes [9]. Thus most applications maintain three replicas. The main advantage of replication is that no encoding/decoding is involved and the system design is simple. If a file of size $M$ bytes maintains $R$ replicas, it stores a total of $M \cdot R$ bytes, with $M$ bytes of storage per node [6, 11]. Only one of the $R$ replicas is needed to recover the data in the face of failure. The file is unavailable if no replica is available; the unavailability probability is $(1 - a)^R$, where $a$ is the mean node availability [6, 10]. Two types of failure may occur in the system: permanent failure (data is lost due to disk failure or permanent departure of the node) and transient failure (for example, a temporary network problem). The DHash algorithm responds to both permanent and transient failures to provide 100% availability, and thus wastes bandwidth when the failure is temporary. Carbonite is an efficient replication algorithm that ignores transient failures [9].

B. Erasure Codes
The major drawback of replication is its large storage overhead, which is three times the original data. Erasure codes reduce this storage requirement to about 1.4 times the original data. A data object of size $M$ bytes is divided into $k$ fragments of equal size and encoded into $n$ fragments, which are stored separately, where $n > k$. The code can correct erasures (errors whose location is known). Erasure coding adds encoding/decoding and update complexity to the system. OceanStore, Cleversafe, Facebook's HDFS-RAID, HP and IBM use erasure codes such as X-codes, STAR codes, EVENODD, RDP and Reed-Solomon codes [12-16], to mention a few. EVENODD, RDP and X-codes are all RAID-6 codes. The encoding/decoding performance of RDP is better than that of the other codes. Network erasure codes are applied in areas such as digital file distribution and peer-to-peer file sharing (Avalanche from Microsoft), distributed storage, wireless mesh networks, ad hoc sensor networks, satellite communication, video conferencing, disk array systems and archival storage [5]. An erasure code stores $M/k$ bytes of data in each node, so the total storage becomes $nM/k$ bytes. The rate of the code is $k/n$. The unavailability probability of an erasure code (the probability that fewer than $k$ nodes are available) is given by $\sum_{i=0}^{k-1} \binom{n}{i} a^i (1-a)^{n-i}$, where $a$ is the mean node availability [6, 11]. The repair bandwidth of erasure codes is $M$, which is not optimal: the newcomer node that replaces the failed node has to connect to any $k$ surviving nodes and download $M/k$ bytes from each, $M$ bytes in total, just to recover its own $M/k$ bytes [10].

C. Hybrid Strategy
In addition to the $(n, k)$ erasure-coded fragments, the hybrid strategy keeps one full replica, which adds redundancy to the original data [6, 11]. Using two types of redundancy adds complexity to the system design. The storage required is $M + nM/k$ bytes (one full replica plus $n$ coded fragments). At the time of failure, a single node generates a new fragment of size $M/k$ bytes and sends it to the newcomer, so only $M/k$ bytes are transferred [10].
Compared with replication and erasure codes, discussed above, the repair bandwidth of the hybrid approach is $k$ times smaller. The data is unavailable only if the replica is unavailable and fewer than $k$ erasure-coded nodes are available. The unavailability probability of the hybrid approach is therefore $(1 - a) \sum_{i=0}^{k-1} \binom{n}{i} a^i (1-a)^{n-i}$, where $a$ is the mean node availability. In some texts, a hybrid approach may mean a combination of two erasure codes, such as RDP and EVENODD. This paper considers hybrid to be a combination of replication and erasure codes. In a high-churn environment, in which nodes join and leave the system at a high rate, the bandwidth cost is too high. In a low-churn environment, lack of bandwidth is unimportant, but under moderate churn the hybrid approach is beneficial [6, 11].

D. Regenerating Codes
Erasure codes and the hybrid approach provide a better solution for storage and repair bandwidth, but they are not optimal. In a distributed storage environment failed nodes are replaced regularly, so there is a need for codes that can regenerate lost fragments by communicating as little data as possible across the network. In regenerating codes (RC), the data is regenerated by communicating data from the surviving disks with minimum communication cost. Regenerating codes satisfy the MDS property, so the minimum number of disks required for repair is $k$, and the maximum is $n - 1$. If fewer than $k$ nodes are available, repair is not possible. The unavailability probability of RC codes is thus the same as that of erasure codes, $\sum_{i=0}^{k-1} \binom{n}{i} a^i (1-a)^{n-i}$ [6]. RC codes use slightly larger fragments than MDS codes but can reduce overall bandwidth by 25% compared to the hybrid approach, and they also simplify the system design [10]. Two important operations of RC codes are reconstruction (connecting to $k$ nodes to obtain the data of size $M$ bytes) and regeneration (connecting to $d$ surviving nodes to recover the data of the failed node). In [6], the feasible storage-repair bandwidth trade-off curve was plotted for RC codes with $(k, n, d)$ values (5, 10, 9) and (10, 15, 14).
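The unavailability expressions for replication, erasure codes and the hybrid scheme can be compared numerically. The following is a minimal Python sketch, assuming that nodes fail independently with a common availability `a`, and assuming the usual forms: $(1-a)^R$ for replication, the binomial tail below $k$ for an $(n, k)$ code, and the product of the two for a hybrid scheme with one replica. The function names and the sample values are illustrative and are not taken from the paper.

```python
from math import comb


def unavail_replication(a: float, R: int) -> float:
    """Probability that none of the R replicas is reachable."""
    return (1.0 - a) ** R


def unavail_erasure(a: float, n: int, k: int) -> float:
    """Probability that fewer than k of the n coded nodes are reachable."""
    return sum(comb(n, i) * a**i * (1.0 - a) ** (n - i) for i in range(k))


def unavail_hybrid(a: float, n: int, k: int) -> float:
    """Single replica down AND fewer than k coded nodes reachable.

    Assumes the replica node and the coded nodes fail independently.
    """
    return (1.0 - a) * unavail_erasure(a, n, k)


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    a, R, n, k = 0.9, 3, 14, 10
    print("replication:", unavail_replication(a, R))
    print("erasure    :", unavail_erasure(a, n, k))
    print("hybrid     :", unavail_hybrid(a, n, k))
```

Because regenerating codes keep the MDS property, the same `unavail_erasure` expression would apply to them under these assumptions.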
The storage-repair bandwidth trade-off curve has two special points: minimum storage and minimum bandwidth. Codes that attain the minimum storage are known as Minimum Storage Regenerating (MSR) codes, and codes that attain the minimum repair bandwidth are called Minimum Bandwidth Regenerating (MBR) codes. For MSR, the storage per node is $\alpha = M/k$. At the time of failure the newcomer, unlike with erasure codes, connects to $d$ nodes, where $d$ is at least $k$ and less than $n$. The repair bandwidth of MSR is given by $\gamma = \frac{Md}{k(d - k + 1)}$. If the newcomer connects to $d = k$ surviving nodes, the repair bandwidth is $\gamma = M$, and if it connects to $d = n - 1$ surviving nodes, the repair bandwidth becomes $\gamma = \frac{M(n-1)}{k(n-k)}$. Thus the cost of repair communication is minimized for $d = n - 1$. The MBR point, at which the minimum bandwidth is obtained, has storage $\alpha = \frac{2Md}{2kd - k^2 + k}$ and repair bandwidth $\gamma = \frac{2Md}{2kd - k^2 + k}$. Note that $\alpha = \gamma$ at the minimum bandwidth point, which therefore behaves like a replication system: the newcomer downloads exactly the amount of data that the node stores. If $d = k$, the storage and bandwidth values become $\alpha = \gamma = \frac{2M}{k+1}$, and if $d = n - 1$, they are $\alpha = \gamma = \frac{2M(n-1)}{k(2n - k - 1)}$ [6]. Table I gives the applications where each of the above approaches is used.

TABLE I. APPLICABILITY
Approach: Application [17-30]
Replication: Google File System, Cassandra at Facebook, Amazon Dynamo, Sprite File System, Farsite file system, Coda, Bayou database system, Myriad, Locus, TotalRecall, Harp file system.
Erasure codes: Glacier, CleverSafe, HP, IBM, HDFS-RAID, wireless mesh networks, Windows Azure System, digital file distribution and peer-to-peer file sharing (Avalanche from Microsoft), ad hoc sensor networks, satellite communication, video conferencing, disk array systems and archival storage.
Hybrid: DHT, OceanStore, P2P systems like PAST, Farsite.
RC codes: NCCloud, P2P backup systems, archival storage.

III. COMPARISON OF DIFFERENT NETWORK CODING TECHNIQUES USED IN DSS
This paper presents a comparison of the four approaches of network coding used in DSS that were discussed in the previous section. The comparison of replication, erasure codes, hybrid, MSR, and MBR for both $d = k$ and $d = n - 1$ is given in Table II. Based on the comparison table, graphs have been plotted for all the approaches. The x-axis shows increasing $(n, k)$ configurations, for which the value of $n - k$ is fixed to 4, and the y-axis shows the parameter on which the approaches are compared.
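The two extreme points of the regenerating-code trade-off from Section II-D can be evaluated directly. The sketch below assumes the standard closed forms for the MSR and MBR points from the regenerating-codes literature [6], namely $\alpha = M/k$, $\gamma = Md/(k(d-k+1))$ for MSR and $\alpha = \gamma = 2Md/(2kd - k^2 + k)$ for MBR. The function names and the sample configuration ($M = 2$, $n = 14$, $k = 10$) are hypothetical and serve only as an illustration; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def msr_point(M: int, k: int, d: int) -> tuple[Fraction, Fraction]:
    """Return (alpha, gamma) at the minimum-storage point for d >= k helpers."""
    if d < k:
        raise ValueError("repair needs at least k helper nodes")
    alpha = Fraction(M, k)                     # storage per node
    gamma = Fraction(M * d, k * (d - k + 1))   # total repair download
    return alpha, gamma


def mbr_point(M: int, k: int, d: int) -> tuple[Fraction, Fraction]:
    """Return (alpha, gamma) at the minimum-bandwidth point; alpha == gamma."""
    if d < k:
        raise ValueError("repair needs at least k helper nodes")
    value = Fraction(2 * M * d, 2 * k * d - k * k + k)
    return value, value


if __name__ == "__main__":
    M, n, k = 2, 14, 10   # hypothetical configuration with n - k = 4
    for d in range(k, n):  # k <= d <= n - 1
        _, g_msr = msr_point(M, k, d)
        _, g_mbr = mbr_point(M, k, d)
        print(f"d={d}: MSR gamma={float(g_msr):.4f}, MBR gamma={float(g_mbr):.4f}")
```

Running the loop shows both repair bandwidths shrinking as $d$ grows from $k$ to $n - 1$, which is the behaviour described in the text for connecting to more surviving nodes.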
Figure 1 depicts these parameters, considering a data object of size $M$ bytes divided into $k$ fragments and distributed over $n$ nodes ($n > k$). To reconstruct the data, the Data Collector (DC) connects to any $k$ disks; to repair a failed node, the newcomer that takes its place connects to $d$ surviving nodes ($k \le d < n$) to recover the data. The parameters of comparison are described below in detail.

A. Storage Requirements
Storage requirement means the total space (in bytes) required to store the data object so that it is available at all times. In a distributed storage system, if a data object of size $M$, divided into $k$ fragments, is distributed over $n$ nodes, the total storage required depends on the approach used. The graph in Figure 2 shows the storage required to store a data object of size $M$ divided into $k$ fragments and distributed over $n$ disks. In the case of replication (3 replicas), the storage required to store 2 MB does not depend on $n$ and remains 6 MB for all configurations. For all other approaches, the storage requirement decreases as the $(n, k)$ configuration grows. Erasure codes and MSR codes require the same storage, and their values are optimal compared with all the other approaches.

B. Disk access for reconstruction
When the user requests the data, the Data Collector connects to other nodes to recreate the data that is distributed over several nodes. The number of nodes required to recreate the data is called the disk access for reconstruction. For an MDS code, the Data Collector connects to any $k$ nodes to reconstruct the data distributed over $n$ nodes. In the case of replication and hybrid codes, however, the Data Collector connects to a single node where the data or its replica is placed. The graph in Figure 3 shows that the number of disk accesses grows with increasing $(n, k)$ configuration. The value is the same for erasure codes, MSR and MBR: $k$ disks.

C. Disk access for repair or regeneration
At the time of failure, a newcomer node takes the place of the failed node.
By connecting to the surviving nodes (the $d$ working nodes), the newcomer repairs the lost data of the failed node. The number of disks that must be accessed to regenerate the new node varies between approaches; the graph for these approaches is in Figure 4. For the replication and hybrid approaches, the number of disk accesses is 1 for all configurations. Erasure codes and MBR codes with $d = k$ follow the MDS property, and therefore the value is $k$ disks. For MSR and for MBR with $d = n - 1$, the number of disk accesses depends on $n$.

D. Repair Bandwidth
Repair bandwidth is the cost of communication at the time of failure. It depends on the number of nodes the newcomer connects to and on the number of bytes transferred by each of these nodes. Figure 5 shows the repair bandwidth curve for all approaches with increasing $(n, k)$ configuration, when the size of the data object is $M$ = 2 MB. In the case of replication and erasure codes, the repair bandwidth equals the size of the data object, i.e. 2 MB. This value is not optimal, and the other approaches reduce the repair bandwidth. MBR has minimum repair bandwidth when it connects to $d = n - 1$ surviving nodes: in this technique the repair bandwidth is a decreasing function of $d$, so the amount of data communicated, and hence the cost of communication for repair, falls as $d$ increases. It is clear from the figure that for hybrid and regenerating codes (MSR, MBR) the repair bandwidth also decreases as the $(n, k)$ configuration increases. The reason is that when a data object of size $M$ is stored over a large number of disks, the amount of storage per disk reduces, thus decreasing the bytes transferred at the time of failure. The repair bandwidth is lowest for the hybrid approach, at $M/k$. If the replica of the data is available, the loss is recovered by transferring only the part of the data that was stored on the failed node.

E. Unavailability Probability
In a distributed system, the data is unavailable if fewer than the required number of nodes are available. In the case of replication, data is unavailable if no replica is available. If $a$ is the probability of a node being available, then $(1 - a)^R$ is the probability that none of the $R$ replicas is available. For all MDS codes, the minimum number of required nodes is $k$. Hybrid combines the results of replication and MDS codes (i.e. the data is unavailable if fewer than $k$ nodes are available and no replica is available).

Fig. 1. Coding Structure: The data object of size $M$ bytes is divided into $k$ fragments and encoded over $n$ nodes. (I) Storage required: if each node stores $M/k$ bytes of data, then the total storage in $n$ nodes is $nM/k$. (II) Disk access for reconstruction: in an MDS code the minimum number of nodes required to reconstruct the data is $k$. (III) Disk access for repair or regeneration: during failure of any node, the newcomer node that takes its place connects to $d$ surviving nodes, where $k \le d < n$. (IV) Repair bandwidth: the cost of communicating data from the $d$ surviving nodes to the newcomer node in order to recover the lost data; it depends on the disk access and on the bytes transferred by each node. (V) Unavailability probability: the probability that fewer than the required number of nodes are available.

TABLE II. COMPARISON TABLE
Approaches (columns): Replication; $(n, k)$ Erasure codes; Hybrid; MSR; MBR codes for $d = k$; MBR codes for $d = n - 1$.
Parameters (rows): Storage required; Disk access for reconstruction; Disk access for repair or regeneration; Repair bandwidth; Unavailability probability.

IV. QUANTITATIVE ANALYSIS
Consider particular values of $n$ and $k$ in the $(n, k)$ configuration.
Table III gives the values of storage required, disk access for reconstruction, disk access for regeneration and repair bandwidth for the replication, erasure codes, hybrid, MSR, MBR ($d = k$) and MBR ($d = n - 1$) approaches discussed in this paper. The values are calculated using the formulas in Table II. MSR and erasure codes have the same storage requirement, but the repair bandwidth of MSR is 73% less than that of erasure codes, at the cost of only 3 more disk accesses. The storage requirement for hybrid, MBR ($d = k$) and MBR ($d = n - 1$) is almost the same, with the hybrid approach having the lowest value of 4.173 MB to store 2 MB of data. The repair bandwidth of the hybrid and MBR approaches is also very low compared to the other approaches. The hybrid approach has about 46% less repair bandwidth than both MBR variants. In spite of its low storage and repair bandwidth, the hybrid approach is not widely used because of its complex system design and the need to maintain two different kinds of error correction techniques.

TABLE III. QUANTITATIVE COMPARISON FOR $n$ AND $k$ IN $(n, k)$ CONFIGURATION
Columns: Storage requirement (in MB); Disk access for reconstruction; Disk access for repair or regeneration; Repair bandwidth.
Rows: Replication; Erasure codes; Hybrid; MSR; MBR ($d = k$); MBR ($d = n - 1$).
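A comparison of this kind can be recomputed for any configuration. The following Python sketch builds one row per approach under stated assumptions: 3 replicas for replication, total hybrid storage of one full replica plus $n$ coded fragments ($M + nM/k$), MSR evaluated at $d = n - 1$, and the standard MSR/MBR expressions from [6]. The `Row` type, the `comparison` function and the sample configuration are hypothetical, so the printed numbers are not the paper's Table III values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    approach: str
    storage: float          # total storage, same unit as M
    reconstruct_nodes: int  # disks contacted to read the object
    repair_nodes: int       # disks contacted to repair one node
    repair_bw: float        # data moved to repair one node


def comparison(M: float, n: int, k: int, R: int = 3) -> list[Row]:
    """Build the comparison rows for an (n, k) configuration."""
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    d_max = n - 1

    def mbr(d: int) -> float:
        # At the MBR point, storage per node equals repair bandwidth.
        return 2 * M * d / (2 * k * d - k * k + k)

    msr_bw = M * d_max / (k * (d_max - k + 1))
    return [
        Row("Replication", M * R, 1, 1, M),
        Row("Erasure codes", n * M / k, k, k, M),
        Row("Hybrid", M + n * M / k, 1, 1, M / k),   # assumed storage model
        Row("MSR (d=n-1)", n * M / k, k, d_max, msr_bw),
        Row("MBR (d=k)", n * mbr(k), k, k, mbr(k)),
        Row("MBR (d=n-1)", n * mbr(d_max), k, d_max, mbr(d_max)),
    ]


if __name__ == "__main__":
    # Hypothetical example: a 2 MB object in a (14, 10) configuration.
    for row in comparison(2.0, 14, 10):
        print(f"{row.approach:14s} storage={row.storage:.3f} "
              f"read={row.reconstruct_nodes} repair={row.repair_nodes} "
              f"bw={row.repair_bw:.3f}")
```

Keeping every formula in one function makes it easy to swap in a different storage model for the hybrid scheme or a different choice of $d$ for MSR if the assumptions above do not match the system being studied.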

V. CONCLUSION AND FUTURE WORK
With the increase in the amount of data over the years, there is a need to reduce the storage requirement without affecting the availability and reliability of data. Erasure codes provide a simple solution to reduce storage. Different erasure codes have been designed to reduce encoding/decoding/update complexity and to tolerate burst errors of up to three or four failures. However, there has been no significant improvement in terms of repair bandwidth, as also shown in this paper. We have shown that the hybrid approach, which uses replication along with erasure codes, performs better in terms of repair bandwidth. Its main drawback is the complex system architecture that results from maintaining two different techniques. As future work, combining the optimal erasure technique with replication may provide better results than the other approaches.

Fig. 2. Storage Requirement
Fig. 3. Disk access for Reconstruction
Fig. 4. Disk access for Repair/Regeneration
Fig. 5. Repair Bandwidth

REFERENCES
[1] N. Cai and R. W. Yeung, "Network coding and error correction," in Proc. IEEE Inf. Theory Workshop, Bangalore, India, Oct. 2002.
[2] Z. Zhang, "Theory and applications of network error correction coding," Proc. IEEE, vol. 99, no. 3, March.
[3] Z. Zhang, "Linear Network Error Correction codes in Packet Networks," IEEE Transactions on Information Theory, vol. 54, no. 1, Jan.
[4] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, "Network information flow," IEEE Transactions on Information Theory, vol. 46, no. 4, July 2000.
[5] J. S. Plank, "Erasure codes for storage applications," Tutorial, FAST-2005: 4th USENIX Conference on File and Storage Technologies, San Francisco, CA, December. [Online] plank/plank/papers/fast-2005.html.
[6] A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran, "Network coding for distributed storage systems," IEEE Transactions on Information Theory, vol. 56, no. 9, Sept. 2010.
[7] J. Araujo, F. Giroire, and J. Monteiro, "Hybrid Approaches for Distributed Storage Systems," in Proc. Fourth International Conference on Data Management in Grid and P2P Systems (Globe'11), Toulouse, France, September.
[8] H. Weatherspoon and J. D. Kubiatowicz, "Erasure coding vs. replication: a quantitative comparison," in Proc. IPTPS, Mar.
[9] B.-G. Chun, F. Dabek, A. Haeberlen, E. Sit, H. Weatherspoon, M. F. Kaashoek, J. Kubiatowicz, and R. Morris, "Efficient replica maintenance for distributed storage systems," in NSDI.
[10] R. Rodrigues and B. Liskov, "High availability in DHTs: Erasure coding vs. replication," in Proc. IPTPS.
[11] A. G. Dimakis, P. B. Godfrey, M. J. Wainwright, and K. Ramchandran, "The benefits of network coding for peer-to-peer storage systems," in Third Workshop on Network Coding, Theory, and Applications.
[12] I. S. Reed and G. Solomon, "Polynomial Codes over Certain Finite Fields," J. SIAM, vol. 8, no. 10, 1960.
[13] M. Blaum, J. Brady, J. Bruck, J. Menon, and A. Vardy, "The EVENODD code and its generalization," in High Performance Mass Storage and Parallel I/O, John Wiley & Sons, Inc.
[14] C. Huang and L. Xu, "STAR: An efficient coding scheme for correcting triple storage node failures," IEEE Transactions on Computers, vol. 57, no. 7.
[15] P. Corbett, B. English, A. Goel, T. Grcanac, S. Kleiman, J. Leong, and S. Sankar, "Row-Diagonal Parity for Double Disk Failure Correction," in Proc. USENIX FAST 2004, San Francisco, CA, USA, Mar. 31 to Apr. 2.
[16] L. Xu, "X-Code: MDS Array Codes with Optimal Encoding," IEEE Transactions on Information Theory, vol. 45, no. 1, January.
[17] J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer, C. Wells, and B. Zhao, "OceanStore: An Architecture for Global-scale Persistent Storage," in ASPLOS '00: Proc. of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems, December.
[18] S. Ghemawat, H. Gobioff, and S. Leung, "The Google File System," in Proc. of the 19th ACM Symposium on Operating System Principles, Oct. 2003.
[19] A. Adya, W. J. Bolosky, M. Castro, G. Cermak, R. Chaiken, J. R. Douceur, J. Howell, J. R. Lorch, M. Theimer, and R. P. Wattenhofer, "FARSITE: Federated, available, and reliable storage for an incompletely trusted environment," in Proc. OSDI, Boston, MA, Dec.
[20] F. Chang, M. Ji, S.-T. Leung, J. MacCormick, S. Perl, and L. Zhang, "Myriad: Cost-effective disaster tolerance," in FAST '02: Proceedings of the 1st USENIX Conference on File and Storage Technologies, p. 8, Berkeley, CA, USA.
[21] B. Walker, G. Popek, R. English, C. Kline, and G. Thiel, "The LOCUS distributed operating system," in Proceedings of the Ninth Symposium on Operating Systems Principles, Bretton Woods, New Hampshire, October 1983.
[22] R. Bhagwan, K. Tati, Y.-C. Cheng, S. Savage, and G. M. Voelker, "Total Recall: System support for automated availability management," in Proc. of the 1st Symposium on Networked Systems Design and Implementation, Mar. 2004.
[23] B. Liskov, S. Ghemawat, R. Gruber, P. Johnson, L. Shrira, and M. Williams, "Replication in the Harp File System," in Proc. of the 13th ACM Symposium on Operating System Principles, Oct. 1991.
[24] A. Haeberlen, A. Mislove, and P. Druschel, "Glacier: Highly durable, decentralized storage despite massive correlated failures," in Proc. of the 2nd Symposium on Networked Systems Design and Implementation, May 2005.
[25] Y. Hu, H. C. H. Chen, P. P. C. Lee, and Y. Tang, "NCCloud: Applying network coding for the storage repair in a cloud-of-clouds," in Proc. of the 10th USENIX Conf. on File and Storage Technologies (FAST '12), San Jose, Feb.
[26] N. Rakesh and V. Tyagi, "Linear-code multicast on parallel architectures," Elsevier Advances in Engineering Software, vol. 42.
[27] N. Rakesh and V. Tyagi, "Efficient Broadcasting in Parallel Networks Using Network Coding," in Proceedings of the First International Conference on Parallel, Distributed Computing Technologies and Applications, CCIS 203, Springer-Verlag.
[28] N. Rakesh and Nitin, "Analysis of All to All Broadcast on Multi Mesh of Trees Using Genetic Algorithm," International Workshop on Advances in Computer Networks, VLSI, ANVIT, St. Petersburg, Russia.
[29] N. Rakesh and Nitin, "Analysis of Multi-Sort Algorithm on Multi-Mesh of Trees (MMT) Architecture," Springer Journal of Supercomputing, vol. 57, no. 3.
[30] N. Rakesh and V. Tyagi, "Linear Network Coding on Multi-Mesh of Trees using All to All Broadcast," International Journal of Computer Science Issues, vol. 8, no. 3, 2011.


More information

Minimization of Storage Cost in Distributed Storage Systems with Repair Consideration

Minimization of Storage Cost in Distributed Storage Systems with Repair Consideration Minimization of Storage Cost in Distributed Storage Systems with Repair Consideration Quan Yu Department of Electronic Engineering City University of Hong Kong Email: quanyu2@student.cityu.edu.hk Kenneth

More information

Resume Maintaining System for Referral Using Cloud Computing

Resume Maintaining System for Referral Using Cloud Computing Resume Maintaining System for Referral Using Cloud Computing Pranjali A. Pali 1, N.D. Kale 2 P.G. Student, Department of Computer Engineering, TSSM S PVPIT, Bavdhan, Pune, Maharashtra India 1 Associate

More information

EECS 121: Coding for Digital Communication & Beyond Fall Lecture 22 December 3. Introduction

EECS 121: Coding for Digital Communication & Beyond Fall Lecture 22 December 3. Introduction EECS 121: Coding for Digital Communication & Beyond Fall 2013 Lecture 22 December 3 Lecturer: K. V. Rashmi Scribe: Ajay Shanker Tripathi 22.1 Context Introduction Distributed storage is a deeply relevant

More information

Cooperative Pipelined Regeneration in Distributed Storage Systems

Cooperative Pipelined Regeneration in Distributed Storage Systems Cooperative ipelined Regeneration in Distributed Storage Systems Jun Li, in Wang School of Computer Science Fudan University, China Baochun Li Department of Electrical and Computer Engineering University

More information

Effec%ve Replica Maintenance for Distributed Storage Systems

Effec%ve Replica Maintenance for Distributed Storage Systems Effec%ve Replica Maintenance for Distributed Storage Systems USENIX NSDI2006 Byung Gon Chun, Frank Dabek, Andreas Haeberlen, Emil Sit, Hakim Weatherspoon, M. Frans Kaashoek, John Kubiatowicz, and Robert

More information

Combining Erasure-Code and Replication Redundancy Schemes for Increased Storage and Repair Efficiency in P2P Storage Systems

Combining Erasure-Code and Replication Redundancy Schemes for Increased Storage and Repair Efficiency in P2P Storage Systems Combining Erasure-Code and Replication Redundancy Schemes for Increased Storage and Repair Efficiency in P2P Storage Systems Roy Friedman and Yoav Kantor Computer Science Department Technion - Israel Institute

More information

An Agenda for Robust Peer-to-Peer Storage

An Agenda for Robust Peer-to-Peer Storage An Agenda for Robust Peer-to-Peer Storage Rodrigo Rodrigues Massachusetts Institute of Technology rodrigo@lcs.mit.edu Abstract Robust, large-scale storage is one of the main applications of DHTs and a

More information

Proactive replication for data durability

Proactive replication for data durability Proactive replication for data durability Emil Sit, Andreas Haeberlen, Frank Dabek, Byung-Gon Chun, Hakim Weatherspoon Robert Morris, M. Frans Kaashoek and John Kubiatowicz ABSTRACT Many wide-area storage

More information

Redundancy Management for P2P Storage

Redundancy Management for P2P Storage Redundancy Management for P2P Storage Chris Williams, Philippe Huibonhoa, JoAnne Holliday, Andy Hospodor, Thomas Schwarz Department of Computer Engineering, Santa Clara University, Santa Clara, CA 95053,

More information

LaRS: A Load-aware Recovery Scheme for Heterogeneous Erasure-Coded Storage Clusters

LaRS: A Load-aware Recovery Scheme for Heterogeneous Erasure-Coded Storage Clusters 214 9th IEEE International Conference on Networking, Architecture, and Storage LaRS: A Load-aware Recovery Scheme for Heterogeneous Erasure-Coded Storage Clusters Haibing Luo, Jianzhong Huang, Qiang Cao

More information

Cost-Bandwidth Tradeoff In Distributed Storage Systems

Cost-Bandwidth Tradeoff In Distributed Storage Systems Cost-Bandwidth Tradeoff In Distributed Storage Systems Soroush Ahlaghi, Abbas Kiani and ohammad Reza Ghanavati Department of Electrical Engineering Shahed University Tehran, Iran Email: {ahlaghi,aiani,ghanavati}@shahedacir

More information

Data Integrity Protection scheme for Minimum Storage Regenerating Codes in Multiple Cloud Environment

Data Integrity Protection scheme for Minimum Storage Regenerating Codes in Multiple Cloud Environment International Journal of Advanced Research in Computer Engineering & Technology (IJARCET) Data Integrity Protection scheme for Minimum Storage Regenerating Codes in Multiple Cloud Environment Saravana

More information

Giza: Erasure Coding Objects across Global Data Centers

Giza: Erasure Coding Objects across Global Data Centers Giza: Erasure Coding Objects across Global Data Centers Yu Lin Chen*, Shuai Mu, Jinyang Li, Cheng Huang *, Jin li *, Aaron Ogus *, and Douglas Phillips* New York University, *Microsoft Corporation USENIX

More information

Enabling High Data Availability in a DHT

Enabling High Data Availability in a DHT Enabling High Data Availability in a DHT Predrag Knežević 1, Andreas Wombacher 2, Thomas Risse 1 1 Fraunhofer IPSI 2 University of Twente Integrated Publication and Information Systems Institute Department

More information

Staggeringly Large File Systems. Presented by Haoyan Geng

Staggeringly Large File Systems. Presented by Haoyan Geng Staggeringly Large File Systems Presented by Haoyan Geng Large-scale File Systems How Large? Google s file system in 2009 (Jeff Dean, LADIS 09) - 200+ clusters - Thousands of machines per cluster - Pools

More information

An Improvement of Quasi-cyclic Minimum Storage Regenerating Codes for Distributed Storage

An Improvement of Quasi-cyclic Minimum Storage Regenerating Codes for Distributed Storage An Improvement of Quasi-cyclic Minimum Storage Regenerating Codes for Distributed Storage Chenhui LI*, Songtao LIANG* * Shanghai Key Laboratory of Intelligent Information Processing, Fudan University,

More information

Subway : Peer-To-Peer Clustering of Clients for Web Proxy

Subway : Peer-To-Peer Clustering of Clients for Web Proxy Subway : Peer-To-Peer Clustering of Clients for Web Proxy Kyungbaek Kim and Daeyeon Park Department of Electrical Engineering & Computer Science, Division of Electrical Engineering, Korea Advanced Institute

More information

Encoding-Aware Data Placement for Efficient Degraded Reads in XOR-Coded Storage Systems

Encoding-Aware Data Placement for Efficient Degraded Reads in XOR-Coded Storage Systems Encoding-Aware Data Placement for Efficient Degraded Reads in XOR-Coded Storage Systems Zhirong Shen, Patrick P. C. Lee, Jiwu Shu, Wenzhong Guo College of Mathematics and Computer Science, Fuzhou University

More information

DiskReduce: RAID for Data-Intensive Scalable Computing (CMU-PDL )

DiskReduce: RAID for Data-Intensive Scalable Computing (CMU-PDL ) Research Showcase @ CMU Parallel Data Laboratory Research Centers and Institutes 11-2009 DiskReduce: RAID for Data-Intensive Scalable Computing (CMU-PDL-09-112) Bin Fan Wittawat Tantisiriroj Lin Xiao Garth

More information

BASIC Regenerating Code: Binary Addition and Shift for Exact Repair

BASIC Regenerating Code: Binary Addition and Shift for Exact Repair BASIC Regenerating Code: Binary Addition and Shift for Exact Repair Hanxu Hou, Kenneth W. Shum, Minghua Chen and Hui Li Shenzhen Eng. Lab of Converged Networks Technology, Shenzhen Key Lab of Cloud Computing

More information

Short Code: An Efficient RAID-6 MDS Code for Optimizing Degraded Reads and Partial Stripe Writes

Short Code: An Efficient RAID-6 MDS Code for Optimizing Degraded Reads and Partial Stripe Writes : An Efficient RAID-6 MDS Code for Optimizing Degraded Reads and Partial Stripe Writes Yingxun Fu, Jiwu Shu, Xianghong Luo, Zhirong Shen, and Qingda Hu Abstract As reliability requirements are increasingly

More information

On the Speedup of Recovery in Large-Scale Erasure-Coded Storage Systems (Supplementary File)

On the Speedup of Recovery in Large-Scale Erasure-Coded Storage Systems (Supplementary File) 1 On the Speedup of Recovery in Large-Scale Erasure-Coded Storage Systems (Supplementary File) Yunfeng Zhu, Patrick P. C. Lee, Yinlong Xu, Yuchong Hu, and Liping Xiang 1 ADDITIONAL RELATED WORK Our work

More information

Cloud Storage Reliability for Big Data Applications: A State of the Art Survey

Cloud Storage Reliability for Big Data Applications: A State of the Art Survey Cloud Storage Reliability for Big Data Applications: A State of the Art Survey Rekha Nachiappan a, Bahman Javadi a, Rodrigo Calherios a, Kenan Matawie a a School of Computing, Engineering and Mathematics,

More information

QADR with Energy Consumption for DIA in Cloud

QADR with Energy Consumption for DIA in Cloud Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

Exact Optimized-cost Repair in Multi-hop Distributed Storage Networks

Exact Optimized-cost Repair in Multi-hop Distributed Storage Networks Exact Optimized-cost Repair in Multi-hop Distributed Storage Networks Majid Gerami, Ming Xiao Communication Theory Lab, Royal Institute of Technology, KTH, Sweden, E-mail: {gerami, mingx@kthse arxiv:14012774v1

More information

A survey on regenerating codes

A survey on regenerating codes International Journal of Scientific and Research Publications, Volume 4, Issue 11, November 2014 1 A survey on regenerating codes V. Anto Vins *, S.Umamageswari **, P.Saranya ** * P.G Scholar, Department

More information

Efficient Replica Maintenance for Distributed Storage Systems

Efficient Replica Maintenance for Distributed Storage Systems Efficient Replica Maintenance for Distributed Storage Systems Byung-Gon Chun, Frank Dabek, Andreas Haeberlen, Emil Sit, Hakim Weatherspoon, M. Frans Kaashoek, John Kubiatowicz, and Robert Morris MIT Computer

More information

Self-Adaptive Two-Dimensional RAID Arrays

Self-Adaptive Two-Dimensional RAID Arrays Self-Adaptive Two-Dimensional RAID Arrays Jehan-François Pâris 1 Dept. of Computer Science University of Houston Houston, T 77204-3010 paris@cs.uh.edu Thomas J. E. Schwarz Dept. of Computer Engineering

More information

Comparison of RAID-6 Erasure Codes

Comparison of RAID-6 Erasure Codes Comparison of RAID-6 Erasure Codes Dimitri Pertin, Alexandre Van Kempen, Benoît Parrein, Nicolas Normand To cite this version: Dimitri Pertin, Alexandre Van Kempen, Benoît Parrein, Nicolas Normand. Comparison

More information

ECS High Availability Design

ECS High Availability Design ECS High Availability Design March 2018 A Dell EMC white paper Revisions Date Mar 2018 Aug 2017 July 2017 Description Version 1.2 - Updated to include ECS version 3.2 content Version 1.1 - Updated to include

More information

Fast Erasure Coding for Data Storage: A Comprehensive Study of the Acceleration Techniques. Tianli Zhou & Chao Tian Texas A&M University

Fast Erasure Coding for Data Storage: A Comprehensive Study of the Acceleration Techniques. Tianli Zhou & Chao Tian Texas A&M University Fast Erasure Coding for Data Storage: A Comprehensive Study of the Acceleration Techniques Tianli Zhou & Chao Tian Texas A&M University 2 Contents Motivation Background and Review Evaluating Individual

More information

Candidate MDS Array Codes for Tolerating Three Disk Failures in RAID-7 Architectures

Candidate MDS Array Codes for Tolerating Three Disk Failures in RAID-7 Architectures Candidate MDS Array Codes for Tolerating Three Disk Failures in RAID-7 Architectures ABSTRACT Mayur Punekar Dept. of Computer Science and Eng. Qatar University Doha, Qatar mayur.punekar@ieee.org Yongge

More information

Capacity Assurance in Hostile Networks

Capacity Assurance in Hostile Networks PhD Dissertation Defense Wednesday, October 7, 2015 3:30 pm - 5:30 pm 3112 Engineering Building Capacity Assurance in Hostile Networks By: Jian Li Advisor: Jian Ren ABSTRACT Linear network coding provides

More information

Modern Erasure Codes for Distributed Storage Systems

Modern Erasure Codes for Distributed Storage Systems Modern Erasure Codes for Distributed Storage Systems Storage Developer Conference, SNIA, Bangalore Srinivasan Narayanamurthy Advanced Technology Group, NetApp May 27 th 2016 1 Everything around us is changing!

More information

Dynamically Estimating Reliability in a Volunteer-Based Compute and Data-Storage System

Dynamically Estimating Reliability in a Volunteer-Based Compute and Data-Storage System Dynamically Estimating Reliability in a Volunteer-Based Compute and Data-Storage System Muhammed Uluyol University of Minnesota Abstract Although cloud computing is a powerful tool for analyzing large

More information

PITR: An Efficient Single-failure Recovery Scheme for PIT-Coded Cloud Storage Systems

PITR: An Efficient Single-failure Recovery Scheme for PIT-Coded Cloud Storage Systems PITR: An Efficient Single-failure Recovery Scheme for PIT-Coded Cloud Storage Systems Peng Li, Jiaxiang Dong, Xueda Liu, Gang Wang, Zhongwei Li, Xiaoguang Liu Nankai-Baidu Joint Lab, College of Computer

More information

Large-Scale Byzantine Fault Tolerance: Safe but Not Always Live

Large-Scale Byzantine Fault Tolerance: Safe but Not Always Live Large-Scale Byzantine Fault Tolerance: Safe but Not Always Live Rodrigo Rodrigues INESC-ID and Technical University of Lisbon Petr Kouznetsov Max Planck Institute for Software Systems Bobby Bhattacharjee

More information

ABSTRACT I. INTRODUCTION

ABSTRACT I. INTRODUCTION International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISS: 2456-3307 Hadoop Periodic Jobs Using Data Blocks to Achieve

More information

On the Speedup of Single-Disk Failure Recovery in XOR-Coded Storage Systems: Theory and Practice

On the Speedup of Single-Disk Failure Recovery in XOR-Coded Storage Systems: Theory and Practice On the Speedup of Single-Disk Failure Recovery in XOR-Coded Storage Systems: Theory and Practice Yunfeng Zhu, Patrick P. C. Lee, Yuchong Hu, Liping Xiang, and Yinlong Xu University of Science and Technology

More information

Pastis: a Highly-Scalable Multi-User Peer-to-Peer File System

Pastis: a Highly-Scalable Multi-User Peer-to-Peer File System Pastis: a Highly-Scalable Multi-User Peer-to-Peer File System Jean-Michel Busca 1, Fabio Picconi 2, and Pierre Sens 2 1 INRIA Rocquencourt Le Chesnay, France jean-michel.busca@inria.fr 2 LIP6, Université

More information

Authenticated Agreement

Authenticated Agreement Chapter 18 Authenticated Agreement Byzantine nodes are able to lie about their inputs as well as received messages. Can we detect certain lies and limit the power of byzantine nodes? Possibly, the authenticity

More information

Fractional Repetition Codes for Repair in Distributed Storage Systems

Fractional Repetition Codes for Repair in Distributed Storage Systems Fractional Repetition Codes for Repair in Distributed Storage Systems Salim El Rouayheb and Kannan Ramchandran Department of Electrical Engineering and Computer Sciences University of California, Bereley

More information

Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction

Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 57, NO 8, AUGUST 2011 5227 Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction K V Rashmi,

More information

On Secret Sharing Schemes based on Regenerating Codes

On Secret Sharing Schemes based on Regenerating Codes On Secret Sharing Schemes base on Regenerating Coes Masazumi Kurihara (University of Electro-Communications) 0 IEICE General Conference Okayama Japan 0- March 0. (Proc. of 0 IEICE General Conference AS--

More information

Understanding Availability

Understanding Availability Understanding Availability Ranjita Bhagwan, Stefan Savage and Geoffrey M. Voelker Department of Computer Science and Engineering University of California, San Diego Abstract This paper addresses a simple,

More information

MODERN storage systems, such as GFS [1], Windows

MODERN storage systems, such as GFS [1], Windows IEEE TRANSACTIONS ON COMPUTERS, VOL. 66, NO. 1, JANUARY 2017 127 Short Code: An Efficient RAID-6 MDS Code for Optimizing Degraded Reads and Partial Stripe Writes Yingxun Fu, Jiwu Shu, Senior Member, IEEE,

More information

Broadcast Repair for Wireless Distributed Storage Systems

Broadcast Repair for Wireless Distributed Storage Systems Broadcast Repair for Wireless Distributed Storage Systems Ping Hu Department of Electronic Engineering City University of Hong Kong Email: ping.hu@my.cityu.edu.hk Chi Wan Sung Department of Electronic

More information

Efficient Reliable Internet Storage

Efficient Reliable Internet Storage Efficient Reliable Internet Storage Robbert van Renesse Dept. of Computer Science, Cornell University rvr@cs.cornell.edu Abstract This position paper presents a new design for an Internetwide peer-to-peer

More information

Adaptive Replication and Replacement in P2P Caching

Adaptive Replication and Replacement in P2P Caching Adaptive Replication and Replacement in P2P Caching Jussi Kangasharju Keith W. Ross Abstract Caching large audio and video files in a community of peers is a compelling application for P2P. Assuming an

More information

RobuSTore: Robust Performance for Distributed Storage Systems

RobuSTore: Robust Performance for Distributed Storage Systems RobuSTore: Robust Performance for Distributed Storage Systems Huaxia Xia and Andrew A. Chien University of California, San Diego {hxia, achien}@ucsd.edu Abstract *1 Emerging large-scale scientific applications

More information

APPLICATIONS OF LOCAL-RANGE NETWORK THEORY IN DESIGNED P2P ROUTING ALGORITHM

APPLICATIONS OF LOCAL-RANGE NETWORK THEORY IN DESIGNED P2P ROUTING ALGORITHM APPLICATIONS OF LOCAL-RANGE NETWORK THEORY IN DESIGNED P2P ROUTING ALGORITHM 1 YUNXIA PEI, 2 GUANGCHUN FU 1 Department of Math and Computer Science, Zhengzhou University of light Industry, Zhengzhou 450002,

More information

PCM: A Parity-check Matrix Based Approach to Improve Decoding Performance of XOR-based Erasure Codes

PCM: A Parity-check Matrix Based Approach to Improve Decoding Performance of XOR-based Erasure Codes 15 IEEE th Symposium on Reliable Distributed Systems : A Parity-check Matrix Based Approach to Improve Decoding Performance of XOR-based Erasure Codes Yongzhe Zhang, Chentao Wu, Jie Li, Minyi Guo Shanghai

More information

Evaluation of Distributed Recovery in Large-Scale Storage Systems

Evaluation of Distributed Recovery in Large-Scale Storage Systems Evaluation of Distributed Recovery in Large-Scale Storage Systems Qin Xin Ethan L. Miller Storage Systems Research Center University of California, Santa Cruz qxin, elm @cs.ucsc.edu Thomas J. E. Schwarz,

More information

Disk Infant Mortality in Large Storage Systems

Disk Infant Mortality in Large Storage Systems Disk Infant Mortality in Large Storage Systems Qin Xin 1 qxin@cs.ucsc.edu Thomas J. E. Schwarz, S. J. 1,2 tjschwarz@scu.edu Ethan L. Miller 1 elm@cs.ucsc.edu 1 Storage Systems Research Center, University

More information

A Comparison Of Replication Strategies for Reliable Decentralised Storage

A Comparison Of Replication Strategies for Reliable Decentralised Storage 36 JOURNAL OF NETWORKS, VOL. 1, NO. 6, NOVEMBER/DECEMBER 26 A Comparison Of Replication Strategies for Reliable Decentralised Storage Matthew Leslie 1,2, Jim Davies 1, and Todd Huffman 2 1 Oxford University

More information

Repair Pipelining for Erasure-Coded Storage

Repair Pipelining for Erasure-Coded Storage Repair Pipelining for Erasure-Coded Storage Runhui Li, Xiaolu Li, Patrick P. C. Lee, Qun Huang The Chinese University of Hong Kong USENIX ATC 2017 1 Introduction Fault tolerance for distributed storage

More information

Efficient Recovery in Harp

Efficient Recovery in Harp Efficient Recovery in Harp Barbara Liskov Sanjay Ghemawat Robert Gruber Paul Johnson Liuba Shrira Laboratory for Computer Science Massachusetts Institute of Technology 1. Introduction Harp is a replicated

More information

SCAR - Scattering, Concealing and Recovering data within a DHT

SCAR - Scattering, Concealing and Recovering data within a DHT 41st Annual Simulation Symposium SCAR - Scattering, Concealing and Recovering data within a DHT Bryan N. Mills and Taieb F. Znati, Department of Computer Science, Telecommunications Program University

More information

LessLog: A Logless File Replication Algorithm for Peer-to-Peer Distributed Systems

LessLog: A Logless File Replication Algorithm for Peer-to-Peer Distributed Systems LessLog: A Logless File Replication Algorithm for Peer-to-Peer Distributed Systems Kuang-Li Huang, Tai-Yi Huang and Jerry C. Y. Chou Department of Computer Science National Tsing Hua University Hsinchu,

More information

A SECURED FRAMEWORK FOR CLOUD REPOSITORY WITH CLUSTERED SERVERS

A SECURED FRAMEWORK FOR CLOUD REPOSITORY WITH CLUSTERED SERVERS INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 A SECURED FRAMEWORK FOR CLOUD REPOSITORY WITH CLUSTERED SERVERS Ashok kumar. C 1 and Sathyadevi. S 2 1 Department

More information

A Survey on Cloud Computing Storage and Security

A Survey on Cloud Computing Storage and Security Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 1, January 2015,

More information

Providing High Reliability in a Minimum Redundancy Archival Storage System

Providing High Reliability in a Minimum Redundancy Archival Storage System Providing High Reliability in a Minimum Redundancy Archival Storage System Deepavali Bhagwat 1 Kristal Pollack 1 Darrell D. E. Long 1 Thomas Schwarz, S.J. 1,2 Ethan L. Miller 1 Jehan-François Pâris 3 1

More information

Modern Erasure Codes for Distributed Storage Systems

Modern Erasure Codes for Distributed Storage Systems Modern Erasure Codes for Distributed Storage Systems Srinivasan Narayanamurthy (Srini) NetApp Everything around us is changing! r The Data Deluge r Disk capacities and densities are increasing faster than

More information

Parallel Routing Method in Churn Tolerated Resource Discovery

Parallel Routing Method in Churn Tolerated Resource Discovery in Churn Tolerated Resource Discovery E-mail: emiao_beyond@163.com Xiancai Zhang E-mail: zhangxiancai12@sina.com Peiyi Yu E-mail: ypy02784@163.com Jiabao Wang E-mail: jiabao_1108@163.com Qianqian Zhang

More information

Autonomous Replication for High Availability in Unstructured P2P Systems

Autonomous Replication for High Availability in Unstructured P2P Systems Appears in the 22nd IEEE International Symposium on Reliable Distributed Systems, 23 1 Autonomous Replication for High Availability in Unstructured P2P Systems Francisco Matias Cuenca-Acuna, Richard P.

More information

A New HadoopBased Network Management System with Policy Approach

A New HadoopBased Network Management System with Policy Approach Computer Engineering and Applications Vol. 3, No. 3, September 2014 A New HadoopBased Network Management System with Policy Approach Department of Computer Engineering and IT, Shiraz University of Technology,

More information

ABSTRACT. Web Service Atomic Transaction (WS-AT) is a standard used to implement distributed

ABSTRACT. Web Service Atomic Transaction (WS-AT) is a standard used to implement distributed ABSTRACT Web Service Atomic Transaction (WS-AT) is a standard used to implement distributed processing over the internet. Trustworthy coordination of transactions is essential to ensure proper running

More information

Providing High Reliability in a Minimum Redundancy Archival Storage System

Providing High Reliability in a Minimum Redundancy Archival Storage System Providing High Reliability in a Minimum Redundancy Archival Storage System Deepavali Bhagwat 1 Kristal Pollack 1 Darrell D. E. Long 1 Thomas Schwarz, S.J. 1,2 Ethan L. Miller 1 Jehan-François Pâris 3 1

More information

Coding and Scheduling for Efficient Loss-Resilient Data Broadcasting

Coding and Scheduling for Efficient Loss-Resilient Data Broadcasting Coding and Scheduling for Efficient Loss-Resilient Data Broadcasting Kevin Foltz Lihao Xu Jehoshua Bruck California Institute of Technology Department of Computer Science Department of Electrical Engineering

More information

Building a low-latency, proximity-aware DHT-based P2P network

Building a low-latency, proximity-aware DHT-based P2P network Building a low-latency, proximity-aware DHT-based P2P network Ngoc Ben DANG, Son Tung VU, Hoai Son NGUYEN Department of Computer network College of Technology, Vietnam National University, Hanoi 144 Xuan

More information

Collaborative Multi-Source Scheme for Multimedia Content Distribution

Collaborative Multi-Source Scheme for Multimedia Content Distribution Collaborative Multi-Source Scheme for Multimedia Content Distribution Universidad Autónoma Metropolitana-Cuajimalpa, Departament of Information Technology, Mexico City, Mexico flopez@correo.cua.uam.mx

More information

A Performance Evaluation of Open Source Erasure Codes for Storage Applications

A Performance Evaluation of Open Source Erasure Codes for Storage Applications A Performance Evaluation of Open Source Erasure Codes for Storage Applications James S. Plank Catherine D. Schuman (Tennessee) Jianqiang Luo Lihao Xu (Wayne State) Zooko Wilcox-O'Hearn Usenix FAST February

More information

Availability-based methods for distributed storage systems

Availability-based methods for distributed storage systems 2012 31st International Symposium on Reliable Distributed Systems Availability-based methods for distributed storage systems Anne-Marie Kermarrec INRIA Bretagne Atlantique Erwan Le Merrer Technicolor Gilles

More information

Tree-structured Data Regeneration with Network Coding in Distributed Storage Systems

Tree-structured Data Regeneration with Network Coding in Distributed Storage Systems Tree-structured Data Regeneration with Networ Coding in Distributed Storage Systems Jun Li, Shuang Yang, Xin Wang, Xiangyang Xue School of Computer Science Fudan University, China {57, 6377, xinw, xyxue}@fudaneducn

More information

An Area-Efficient BIRA With 1-D Spare Segments

An Area-Efficient BIRA With 1-D Spare Segments 206 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 26, NO. 1, JANUARY 2018 An Area-Efficient BIRA With 1-D Spare Segments Donghyun Kim, Hayoung Lee, and Sungho Kang Abstract The

More information

DYNAMIC TREE-LIKE STRUCTURES IN P2P-NETWORKS

DYNAMIC TREE-LIKE STRUCTURES IN P2P-NETWORKS DYNAMIC TREE-LIKE STRUCTURES IN P2P-NETWORKS Herwig Unger Markus Wulff Department of Computer Science University of Rostock D-1851 Rostock, Germany {hunger,mwulff}@informatik.uni-rostock.de KEYWORDS P2P,

More information

Design and Implementation of a Storage Repository Using Commonality Factoring

Design and Implementation of a Storage Repository Using Commonality Factoring Design and Implementation of a Storage Repository Using Commonality Factoring Dr. Jim Hamilton Avamar Technologies jhamilton@avamar.com Eric W. Olsen Avamar Technologies ewo@avamar.com Abstract * In this

More information

Parallel Particle Swarm Optimization for Reducing Data Redundancy in Heterogeneous Cloud Storage

Parallel Particle Swarm Optimization for Reducing Data Redundancy in Heterogeneous Cloud Storage Parallel Particle Swarm Optimization for Reducing Data Redundancy in Heterogeneous Cloud Storage M.Vidhya Mr.N.Sadhasivam, Department of Computer Science and Engineering, Department of Computer Science

More information

Codes for distributed storage from 3 regular graphs

Codes for distributed storage from 3 regular graphs Codes for distributed storage from 3 regular graphs Shuhong Gao, Fiona Knoll, Felice Manganiello, and Gretchen Matthews Clemson University arxiv:1610.00043v1 [cs.it] 30 Sep 2016 October 4, 2016 Abstract

More information

Shaking Service Requests in Peer-to-Peer Video Systems

Shaking Service Requests in Peer-to-Peer Video Systems Service in Peer-to-Peer Video Systems Ying Cai Ashwin Natarajan Johnny Wong Department of Computer Science Iowa State University Ames, IA 500, U. S. A. E-mail: {yingcai, ashwin, wong@cs.iastate.edu Abstract

More information

GridBlocks DISK - Distributed Inexpensive Storage with K-availability

GridBlocks DISK - Distributed Inexpensive Storage with K-availability GridBlocks DISK - Distributed Inexpensive Storage with K-availability Mikko Pitkanen Helsinki Institute of Physics Technology Programme mikko.pitkanen@hip.fi Juho Karppinen Helsinki Institute of Physics

More information