A Comprehensive Study on RAID-6 Codes: Horizontal vs. Vertical

2011 Sixth IEEE International Conference on Networking, Architecture, and Storage

Chao Jin, Dan Feng, Hong Jiang, Lei Tian
School of Computer Science & Technology, Huazhong University of Science & Technology
Wuhan National Laboratory for Optoelectronics
University of Nebraska-Lincoln
chjinhust@gmail.com, dfeng@hust.edu.cn, jiang@cse.unl.edu, ltian@hust.edu.cn

Abstract

The RAID-6 architecture is playing an increasingly important role in modern storage systems. There are generally two kinds of RAID-6 codes, horizontal codes and vertical codes. Horizontal codes have been extensively studied and widely implemented, while vertical codes have not gained equal attention. In this paper, we investigate the state-of-the-art horizontal and vertical RAID-6 codes and select two representative ones, RDP for horizontal codes and P-Code for vertical codes, to compare their performance. Since the code lengths of vertical codes are usually restricted, we first provide two efficient code shortening algorithms for vertical codes, by which the length of a vertical code can be extended to an arbitrary given one. In the context of our code shortening algorithms for vertical codes, we compare the theoretical performance of RDP and P-Code at consecutive lengths, and examine their practical behaviors in a real environment. Theoretical analysis and experimental evaluation results demonstrate that vertical codes can provide comparable, and sometimes even better, performance than horizontal codes.

Keywords: RAID-6; horizontal codes; vertical codes; code shortening; performance comparison

I. INTRODUCTION

The RAID (Redundant Array of Independent Disks) architecture [1] has been popular in storage systems for years, due largely to its two major advantages of high performance and fault tolerance.
RAID achieves high performance through parallel IO across an array of component disks, and provides fault tolerance through data redundancy and erasure codes. The latter, addressing the critical issue of data reliability and availability, has been drawing increasing attention from academia and industry lately. There are three major reasons behind this. First, recent findings from real-world studies have reported that partial or complete disk failure rates are actually much higher than previously and commonly estimated [2]. Second, while the number and capacity of disks have been growing exponentially, individual disk failure rates remain largely unchanged [3]. Third, disk failures usually show signs of correlation, e.g., after one disk fails, another disk failure will likely occur soon [2]. All these reveal that in today's data centers, where there are thousands of hard disks and two or more concurrent disk failures are no longer rare, the ability to tolerate multiple disk failures becomes ever more important. Among all the RAID levels, RAID-6 outperforms the others in disk failure tolerance due to its ability to tolerate any two concurrent disk failures in the array. Thus, many storage companies, as well as academic research groups, are conducting active research on RAID-6 codes. RAID-6 codes can be roughly divided into two categories, horizontal codes and vertical codes. Horizontal codes such as Reed-Solomon codes and EVENODD have been studied extensively and implemented widely in storage systems. Several open-source horizontal codes are introduced and evaluated in detail in [6], and their implementations can be obtained through open-source libraries like Jerasure [7]. Vertical codes, on the other hand, have gained less attention than their horizontal counterparts. Few applications of vertical codes have been seen in real environments, and their performance behaviors and properties remain largely unexplored.
In this paper, we examine the key properties of vertical RAID-6 codes and compare them comprehensively with those of horizontal RAID-6 codes. In particular, we choose the two most representative codes, RDP [8] for horizontal codes and P-Code [9] for vertical codes, for detailed and quantitative comparisons. We analyze their computational complexity, update complexity, and storage efficiency at all array sizes within the typical array size range. To reveal their behaviors in a real environment, we implement them on our storage platforms and evaluate their performance with IO benchmark tools. The main contributions of this paper are summarized as follows. We investigate the state-of-the-art horizontal and vertical RAID-6 codes, and select the two most representative ones, namely RDP for horizontal codes and P-Code for vertical codes, to compare them comprehensively. Horizontal codes are easy to extend by code shortening, while vertical codes are not so easy to shorten. We introduce two effective code shortening schemes for vertical codes, taking P-Code as an example. The two shortening schemes provide flexible design choices for P-Code. The first scheme maintains the MDS (i.e., optimal storage efficiency) property of P-Code, but may increase the computational complexity and update complexity. The second scheme loses the optimal storage efficiency, but attains the optimal update complexity and even lower computational complexity.

We analyze theoretically the key performance metrics of computational complexity, update complexity, and storage efficiency of RDP and P-Code at consecutive array sizes within the typical array size range, and show that P-Code provides comparable, and sometimes even better, performance than RDP. We discuss the design and implementation issues of RDP and P-Code in the context of practical implementations. We implement them on our storage platforms, and measure their performance under different design parameters in a real environment. We show that the design and implementation issues may have a significant impact on the performance of a RAID-6 array, and demonstrate that the theoretical performance analysis is in general consistent with the practical measurements.

The rest of the paper is organized as follows. The next section reviews the state-of-the-art horizontal and vertical codes. We discuss code shortening techniques and present two shortening schemes for vertical codes in Section 3. In Section 4 we analyze the key performance metrics of RDP and P-Code. Section 5 addresses the design and implementation issues that arise when implementing a RAID-6 code on real platforms. We measure and evaluate the performance of RDP and P-Code in practical implementations in Section 6 and conclude in Section 7.

II. REPRESENTATIVE RAID-6 CODES

The main difference between horizontal and vertical RAID-6 codes lies in the placement of the parity blocks inside their code structures. In horizontal codes, the last two columns are dedicated parity columns, and the other columns are data columns. The first parity column simply holds row parity blocks, and the second parity column is filled with parity blocks constructed via a code-specific algorithm. In vertical codes, on the other hand, there is no dedicated parity column; the data and parity blocks are spread across all the columns.

A. Horizontal RAID-6 Codes

Reed-Solomon code is a powerful general-purpose horizontal code [10]. It is suitable for any array size and can be generalized to tolerate an arbitrary number of disk failures. Its key shortcoming is that the second parity column is constructed via finite field arithmetic, which is computationally intensive. Many efforts have been made to reduce this computational complexity by proposing new codes on the basis of Reed-Solomon code, such as the Cauchy Reed-Solomon code [11]. Besides these general-purpose codes, there are special-purpose codes, which cannot easily be generalized to tolerate more disk failures. However, these special-purpose RAID-6 codes, such as EVENODD, RDP, and the Liberation codes [12], vastly outperform their general-purpose counterparts. We mainly focus on special-purpose codes in this paper for their ease of implementation and low computational complexity for RAID-6. In the following discussion, p represents a prime number.

EVENODD. A standard EVENODD code [5] has (p+2) columns and (p-1) rows. There are p diagonal parity chains across the data columns inside the code structure, each with a diagonal parity block. One of the diagonal parity blocks is XORed into the other (p-1) blocks as an adjusting factor, and these adjusted diagonal parity blocks are stored in the second parity column. When rebuilding from double disk failures, the original p diagonal parity blocks can be recovered. It must be noted, however, that the adjusting diagonal parity chain aggravates the computational complexity during construction and reconstruction, resulting in poor small-write performance.

RDP. A standard RDP code [8] has (p-1) rows and (p+1) columns. There are (p-1) diagonal parity chains across the data columns and the first parity column (the blocks that do not appear in any diagonal parity chain form the missing diagonal parity chain). The second parity column holds the (p-1) diagonal parity blocks.
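As a concrete illustration of the RDP construction just described, the two parity columns can be sketched as follows. This is our own sketch, not the paper's implementation: blocks are modeled as integers and the block-level XOR as integer XOR, and the function name `rdp_encode` is hypothetical.

```python
def rdp_encode(data, p):
    """Encode a standard RDP stripe of (p-1) rows x (p+1) columns, p prime.
    data: (p-1) x (p-1) matrix of blocks (ints; ^ models the block XOR).
    Column p-1 holds row parity; column p holds diagonal parity."""
    rows = p - 1
    arr = [list(r) + [0, 0] for r in data]
    for i in range(rows):
        # Row parity: XOR of the p-1 data blocks in row i.
        rp = 0
        for j in range(p - 1):
            rp ^= arr[i][j]
        arr[i][p - 1] = rp
    for d in range(p - 1):
        # Diagonal parity chain d: the blocks (i, j) with (i + j) mod p == d,
        # taken over the data columns and the row-parity column. Diagonal
        # p-1 is the "missing" diagonal and gets no parity block.
        dp = 0
        for j in range(p):
            i = (d - j) % p
            if i < rows:
                dp ^= arr[i][j]
        arr[d][p] = dp
    return arr
```

Note that each diagonal chain skips exactly one column, so every chain has p-1 members; the row-parity column participating in the diagonal chains is what makes double-erasure recovery possible.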
RDP is proven to perform the best among the horizontal RAID-6 codes [6].

Liberation code. A standard Liberation code [12] has p rows and (p+2) columns. The last two parity columns are constructed from the multiplication of a specified CDM (Coding Distribution Matrix) and the data vector. In fact, each horizontal RAID-6 code has a corresponding CDM. The CDM of the Liberation code has the minimal number of ones, meaning that it has the lowest update complexity among all the horizontal RAID-6 codes.

Table 1 presents a characteristic comparison of the abovementioned horizontal codes. The standard array sizes of a code refer to the array sizes (i.e., numbers of columns) at which the code is originally defined. A code usually performs the best at its standard array sizes. The brief ratings of computational complexity, update complexity, and storage efficiency in Table 1 all refer to the performance of the codes at their standard array sizes. A detailed comparison of the three codes can be found in [12]. All the horizontal codes are easy to extend (i.e., shorten) to arbitrary array sizes. Generally, RDP performs better than the other horizontal codes. Moreover, RDP has a clear geometrical structure, making it very easy to implement. Thus, we select RDP as the representative horizontal code to compare with vertical codes.

TABLE I. CHARACTERISTIC COMPARISON OF REPRESENTATIVE HORIZONTAL CODES

Horizontal Codes | Standard Array Size | Computational Complexity | Update Complexity | Storage Efficiency | Implementation Complexity | Extendibility
EVENODD          | prime+2             | high                     | high              | optimal            | easy                      | low
RDP              | prime, prime+1      | optimal                  | high              | optimal            | easy                      | low
Liberation Code  | prime+2             | low                      | low               | optimal            | easy                      | medium

B. Vertical RAID-6 Codes

For most vertical RAID-6 codes, each data block participates in the calculation of, and is protected by, exactly two parity blocks. Moreover, all the parity blocks are independent from one another. Thus, these codes attain the optimal update complexity and the optimal computational complexity during construction and reconstruction. Some representative vertical codes are X-Code [13], B-Code [14], and P-Code [9].

X-Code. X-Code has a structure of p rows and p columns. The data blocks, held in the first (p-2) rows, are covered by p diagonal parity chains along slope 1 and another p diagonal parity chains along slope -1. The parity blocks of the parity chains are stored in the last two rows.

B-Code. Xu et al. found an equivalence between the construction of a new kind of RAID-6 code, called B-Code, and the perfect one-factorization of complete graphs. The structure of B-Code consists of N columns, with N/2 rows when N is even or (N-1)/2 rows when N is odd. The existence of perfect one-factorizations for every complete graph with an even number of nodes is a famous conjecture in graph theory [15]. However, the conjecture has not been proved, and the possibility of constructing B-Code at an arbitrary array size N cannot be affirmed. One disadvantage of B-Code is that, inside its code structure, the data-parity patterns are not regular. So in a practical implementation, it might be necessary to use a table-driven technique that requires a large mapping table to store the data-parity information.

P-Code. A standard P-Code has (p-1)/2 rows and (p-1) columns. The first row holds the parity blocks, and the remaining (p-3)/2 rows hold data blocks. P-Code has a structure similar to B-Code's, except that its array size is limited to prime or (prime-1). However, P-Code is easy to extend.

TABLE II. CHARACTERISTIC COMPARISON OF REPRESENTATIVE VERTICAL CODES

Vertical Codes | Standard Array Size | Computational Complexity | Update Complexity | Storage Efficiency | Implementation Complexity | Extendibility
B-Code         | all (not proved)    | optimal                  | optimal           | optimal            | n/a                       | high
X-Code         | prime               | optimal                  | optimal           | optimal            | easy                      | low
P-Code         | prime-1, prime      | optimal                  | optimal           | optimal            | easy                      | low
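The regularity of P-Code's data-parity pattern can be made concrete with a short sketch. The placement rule below (a data block labeled {a, b} lies in column (a + b) mod p and belongs to parity chains P(a) and P(b)) is inferred from the block labels used in this paper's Figure 1 example; the function name `pcode_layout` and the dictionary representation are our own assumptions.

```python
def pcode_layout(p):
    """Lay out a standard P-Code with p-1 columns (p prime).
    Column i (1 <= i <= p-1) holds parity block P(i) in its first row,
    plus the (p-3)/2 data blocks labeled {a, b} with (a + b) mod p == i.
    A data block labeled {a, b} belongs to parity chains P(a) and P(b)."""
    cols = {i: {"parity": i, "data": []} for i in range(1, p)}
    for a in range(1, p):
        for b in range(a + 1, p):
            c = (a + b) % p
            if c != 0:  # pairs with a + b = 0 (mod p) are not used
                cols[c]["data"].append((a, b))
    return cols
```

For p = 7 this reproduces the 6-column structure discussed in Section 3: each column carries one parity block and (7-3)/2 = 2 data blocks (e.g., block (2,6) falls in column d1), and each parity chain P(x) covers p-3 = 4 data blocks.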
In Section 3, we will introduce two code shortening schemes for vertical codes, taking P-Code as an example. One distinctive difference between P-Code and B-Code is that P-Code uses a very simple and clear algorithm to describe the data-parity patterns in its code structure, so a mapping table like B-Code's is unnecessary in an implementation. A characteristic comparison of the abovementioned vertical codes is given in Table 2. All the vertical codes have the optimal computational complexity, update complexity, and storage efficiency at their corresponding standard array sizes. B-Code can support continuously all the array sizes within the typical disk-array-size range; however, it requires a mapping table that may complicate the implementation and adversely influence performance. X-Code and P-Code both have clear geometrical structures and are easy to implement. Moreover, since P-Code has only one parity block in each column, while X-Code has two, P-Code can be extended to an arbitrary array size more easily. We select P-Code as the representative vertical code to compare with horizontal codes.

III. CODE SHORTENING ALGORITHMS FOR VERTICAL RAID-6 CODES

It must be noted that special-purpose RAID-6 codes usually have array-size limitations; namely, the number of columns in the code structure must take certain discrete values (e.g., prime or near-prime). For instance, the array size of the standard RDP code is prime+1, and the array size of the standard P-Code is prime or prime-1. This restriction makes these codes somewhat impractical in a real environment, since an administrator may want to configure a disk array with an arbitrary number of disks. Fortunately, through code shortening, both horizontal and vertical codes can be extended to arbitrary array sizes. Shortening horizontal codes is straightforward.
For a standard horizontal code, we can simply remove some of the data columns from its code structure, by assuming that the removed data columns contain only imaginary zeros. It has been shown in [8] that RDP can be configured for any array size in this way. Vertical codes, on the other hand, are not so easy to shorten. In the structure of a vertical code, each column usually contains not only data blocks but also parity blocks. The problem is that, if we remove a column, the parity blocks in that column are also removed, and the corresponding parity chains may then be left in an inconsistent state. Generally, there are two algorithms to handle this problem. We illustrate them by taking P-Code as an example; note that the algorithms are also applicable to other vertical RAID-6 codes such as X-Code [16].

Figure 1 illustrates the two shortening schemes for P-Code (dn denotes the n-th column).

Figure 1. The two shortening schemes for P-Code.

The first scheme is inspired by the method proposed in [17]. As shown in Figure 1b, in the structure of the standard P-Code with 6 columns (Figure 1a), when the parity block (6) is removed with column d6, we select a data block in the same parity chain (e.g., data block (2,6) in column d1) to be the new parity block. The data-parity pattern of the remaining columns is the same as before, namely, the labels of the remaining blocks stay unchanged. The remaining structure is a shortened P-Code with 5 columns. Obviously, the shortened P-Code can also be rebuilt from any two column erasures, and its reconstruction algorithm is very similar to that of a standard P-Code. We can further shorten P-Code by removing more columns in this way.

The second scheme is to remove the entire parity chain whose parity block has been removed. As shown in Figure 1c, when the parity block (6) is removed with column d6, we remove all the data blocks in parity chain P(6) (i.e., the data blocks whose labels contain the integer 6). Note that there is no data block of P(6) in column d5, so in order to maintain an equal number of rows in each column, we remove block (2,3) additionally. When rebuilding, we simply imagine that the removed blocks still exist but contain zero values.

The resulting structures of the two shortening schemes have different properties. For the first scheme, the number of parity blocks and the number of rows in each column remain unchanged after shortening. A desirable property for a RAID-6 code is the Maximum-Distance-Separable (MDS) property, which assures optimal storage efficiency (see Section 4.3). The necessary and sufficient condition for a RAID-6 code to be MDS is that the number of parity blocks in its code structure equals exactly twice the number of rows in each column. It is easy to see that the shortened P-Code is still an MDS code, namely, it has the optimal storage efficiency. However, the update complexity of the shortened P-Code becomes non-optimal. In Figure 1b, the new parity block of the parity chain P(6), namely block (2,6), also participates in the parity chain P(2).
When a data block in the parity chain P(6), say block (3,6), is updated, the parity block (3) and block (2,6) must be updated, and since (2,6) is updated, the parity block (2) must also be updated. Thus, the update complexity of the data blocks in the parity chain P(6) is 3, above the optimal of 2. At the same time, the computational complexity of the shortened P-Code is no longer optimal; we examine it in detail in Section 4. For the second scheme, the number of parity blocks and the number of rows in each column are reduced after shortening. Generally, each time we remove one more column from the structure of P-Code, the number of parity blocks and the number of rows per column are each reduced by one. P-Code shortened in this way is no longer MDS, since it does not satisfy the aforementioned condition. However, this scheme is still suitable for a practical implementation, for several reasons. First, the standard P-Code covers a major part of the typical array size range (e.g., 4-37 disks), and the interval between two standard P-Code array sizes is relatively small, so the shortening from the nearest standard P-Code will not be significant, and the storage efficiency of the shortened P-Code will be within an acceptable factor of the optimal. Second, the capacity of modern disks keeps growing at a steady but fast rate, making storage efficiency less of a concern for a storage system administrator. Third, the construction and reconstruction computational complexity of this shortened P-Code is even lower than that of an MDS code with the same array size (see Section 4.1). And finally, the shortened P-Code still has the optimal update complexity of 2 (see Section 4.2).

IV. THEORETICAL PERFORMANCE METRICS

In this section, we analyze the key theoretical performance metrics of computational complexity, update complexity, and storage efficiency of RDP and P-Code based on their code structures.
We examine these performance properties at all array sizes within the typical array size range, including those obtained by code shortening.

A. Computational Complexity

It has been proven in [9] that the optimal construction complexity for any MDS RAID-6 code, in terms of the average number of XOR operations per data block, is 2 - 2/(n-2), and the optimal reconstruction complexity, in terms of the average number of XOR operations per lost block regeneration, is (n-3), where n is the array size. Among the aforementioned horizontal codes, all the codes except RDP fail to attain the optimal construction complexity [9]. RDP performs optimally when its array size n is prime or prime+1 [12]. However, shortened RDP loses this optimality, i.e., its construction complexity is above the optimal. Suppose that the standard RDP has p columns (p is prime). From the structure of RDP we can see that, when we shorten RDP by one column, each of the p-1 horizontal parity chains, and p-2 out of the p-1 diagonal parity chains (i.e., all except the missing diagonal parity chain), are shortened by one data block. Thus the total number of XOR operations needed in the construction process is reduced by 2p-3, and the total number of data blocks is reduced by p-1. The construction computational complexity of the shortened RDP is therefore

    [2(p-1)(p-3) - N(2p-3)] / [(p-1)(p-2) - N(p-1)]    (1)

In the above expression, N is the number of shortened columns (i.e., the array size n is p-N). Similarly, the reconstruction computational complexity of the shortened RDP is

    [2(p-1)(p-3) - N(2p-3)] / [2(p-1)]    (2)

On the other hand, each of the aforementioned vertical codes attains its optimal computational complexity at its corresponding standard array sizes. P-Code attains the optimal computational complexity at the array size of prime or prime-1. However, the shortened P-Code behaves differently under the two shortening schemes. Suppose that the standard P-Code has p-1 columns. For the first shortening scheme, when we shorten P-Code by one column, as shown in Figure 1b, p-2 out of the p-1 parity chains are each shortened by one data block, so the total number of XOR operations is reduced by p-2, and the total number of data blocks is reduced by (p-1)/2. The construction computational complexity of the P-Code shortened by the first scheme is thus given in Expression (3) below.

    [(p-1)(p-4) - N(p-2)] / [(p-1)(p-3)/2 - N(p-1)/2]    (3)

As in Expression (1), N is the number of shortened columns (i.e., the array size n is p-1-N). The reconstruction computational complexity of the P-Code shortened by the first scheme is given in Expression (4).

    [(p-1)(p-4) - N(p-2)] / (p-1)    (4)

For the second shortening scheme, when the standard P-Code with p-1 columns is shortened by one column, as shown in Figure 1c, the data blocks in the last column are removed first, and each remaining column removes one data block additionally. Thus, when the standard P-Code is shortened by N columns, the total number of data blocks removed from its code structure, denoted R, is given in Equation (5) below.

    R = N(p-3)/2 + N(p-1-N)    (5)

Since each data block participates in exactly two parity chains, removing one data block saves two XORs in the construction process. Thus the total number of XOR operations needed in the construction process is reduced by 2R, and the construction computational complexity of the P-Code shortened by the second scheme is shown in Expression (6).

    [(p-1)(p-4) - 2R] / [(p-1)(p-3)/2 - R]    (6)

Similarly, the reconstruction computational complexity of the P-Code shortened by the second scheme is given in Expression (7).

    [(p-1)(p-4) - 2R] / (p-1-N)    (7)

The construction and reconstruction computational complexity of RDP and P-Code, normalized to the optimal complexity of MDS codes, at all array sizes within the typical array size range are shown in Figure 2 and Figure 3 respectively. "P-Code first/second" stands for P-Code with the first/second shortening scheme respectively.

Figure 2. Normalized construction computational complexity for RDP and P-Code.

Figure 3. Normalized reconstruction computational complexity for RDP and P-Code.

From the figures we can see that all three codes perform optimally at their corresponding standard array sizes. The computational complexity of RDP and of the shortened P-Code with the first scheme stays very close to the optimal, within only a small factor of it. The computational complexity of the shortened P-Code with the second scheme is below the optimal, and the gap widens as more columns are shortened.
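Expression (1) can be checked numerically against the MDS optimum 2 - 2/(n-2). The sketch below is ours (the helper names and the p = 13 example are not from the paper):

```python
def rdp_construction_xors_per_block(p, N):
    """Average XORs per data block of an RDP array shortened by N columns
    from a standard p-column RDP (p prime), per Expression (1)."""
    return (2 * (p - 1) * (p - 3) - N * (2 * p - 3)) / \
           ((p - 1) * (p - 2) - N * (p - 1))

def optimal_construction(n):
    """Optimal construction complexity of an MDS RAID-6 code of size n."""
    return 2 - 2 / (n - 2)

# Example: shorten a 13-column RDP to a 10-disk array (N = 3) and compare
# its construction cost with the optimum at n = 10.
p, N = 13, 3
ratio = rdp_construction_xors_per_block(p, N) / optimal_construction(p - N)
```

Note that with N = 0 the expression reduces exactly to 2 - 2/(p-2), i.e., the optimum at n = p; shortening is what introduces the (small) gap plotted in Figure 2.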

B. Update Complexity

In an erasure-coded disk array, when a data block is updated, the associated parity blocks must also be updated to maintain parity consistency. The update complexity indicates how many parity blocks are associated with a data block on average. The lower the update complexity, the smaller the write penalty incurred. It is known that the optimal (i.e., lowest) update complexity for a RAID-6 code is 2. However, horizontal codes are unable to attain this optimal bound. The reason is that, in their code structures, either the parity blocks are not independent from one another (e.g., RDP), or a data block is associated with more than two parity blocks on average (e.g., EVENODD). It is shown in [12] that the Liberation codes attain the lowest update complexity among all the horizontal RAID-6 codes, but it is still larger than 2. In the code structure of RDP, the data blocks in the first row parity chain and the missing diagonal parity chain have an update complexity of 2, and the others have an update complexity of 3. It is easy to see that the standard RDP with p+1 columns has the following average update complexity.

    [2(2p-3) + 3((p-1)^2 - (2p-3))] / (p-1)^2    (8)

For the shortened RDP, in each data column, two out of the p-1 data blocks have an update complexity of 2, and the other p-3 have an update complexity of 3. Thus the average update complexity of the shortened RDP can be represented by Expression (9).

    [4 + 3(p-3)] / (p-1)    (9)

Generally, vertical codes outperform their horizontal counterparts in update complexity. All the aforementioned vertical codes have the lowest update complexity of 2 at their corresponding standard array sizes. P-Code attains this optimality at the array sizes of p and p-1, where p is prime. As for the shortened P-Code, it is easy to see that P-Code shortened with the second scheme still has the optimal update complexity of 2.
However, P-Code shortened with the first scheme no longer has the optimal update complexity; we explained the reason with an example in Section 3. It is hard to give a closed-form expression for the update complexity of P-Code shortened with the first scheme, so we have worked it out manually for all the array sizes from 4 to 37 columns. Figure 4 plots the update complexity of RDP and P-Code within the typical array size range. As analyzed above, P-Code with the second shortening scheme always has the optimal update complexity. The update complexity of RDP is the highest of the three; moreover, it increases with the array size, approaching an asymptotic value of 3. The shortened P-Code with the first scheme has the optimal update complexity at its standard array sizes, and its update complexity increases as more columns are shortened.

Figure 4. Update complexity for RDP and P-Code.

Figure 5. Normalized storage efficiency for RDP and P-Code.

C. Storage Efficiency

Storage efficiency measures the percentage of data blocks in the code structure of an erasure code. The Singleton bound [18] gives the optimal storage efficiency of any kind of erasure code. A code that attains the Singleton bound is called a Maximum-Distance-Separable (MDS) code, and its storage efficiency is optimal. As discussed before, there is a simple and convenient way to decide whether a RAID-6 code is MDS: in the structure of an MDS RAID-6 code, the number of parity blocks equals exactly twice the number of rows in each column. Thus, the optimal storage efficiency for a RAID-6 code with array size n is (n-2)/n. It is easy to see that the standard or shortened RDP is always an MDS code. The standard P-Code and the P-Code shortened with the first scheme are also MDS codes. The P-Code shortened with the second scheme is not an MDS code.
Suppose it is shortened from the standard P-Code with p-1 columns by N columns; it is easy to see that there are (p-1)/2 - N rows left in its structure, with the first row holding parity blocks. Thus its storage efficiency is given in Expression (10) below.

    [(p-1)/2 - N - 1] / [(p-1)/2 - N]    (10)

The storage efficiency of RDP and P-Code, normalized to the optimal storage efficiency of MDS codes, is shown in Figure 5. We can see that RDP and P-Code with the first shortening scheme always have the optimal storage efficiency. The storage efficiency of the shortened P-Code with the second scheme is not optimal, and it decreases as more columns are shortened from the standard array size. However, it remains within an acceptable factor (e.g., 96%) of the optimal most of the time.

V. DESIGN AND IMPLEMENTATION ISSUES

So far, we have discussed the theoretical performance metrics of RAID-6 codes based on their code structures. When implementing RAID-6 codes in real storage systems, practical matters, such as the effects of cache and memory, the interactions with the file system, and the characteristics of the IO workload, can influence the performance of a disk array significantly. In this section, we discuss the design issues that must be considered in a practical implementation.

A. Single-Stripe vs. Multiple-Stripe Implementation

Due to the memory management mechanism of operating systems such as Linux, main memory is structured into pages, whose size is usually an integral power of two bytes (e.g., 4KB). A stripe is an IO buffer that caches a codeword (i.e., a parity group of disk blocks) in memory. A stripe is composed of n strips, where n is the array size, with each strip corresponding to a disk/column. In a single-stripe implementation, each strip consists of a single memory page; in a multiple-stripe implementation, each strip consists of multiple memory pages. For a single-stripe implementation of a RAID-6 code, the number of rows in its code structure must be an integral power of two, due to the size of the memory pages. Multiple-stripe implementations do not have this restriction.
For instance, a block in a column of the codeword can be directly mapped to a memory page of a strip in the stripe cache. For Reed-Solomon-like codes, since there is just one row in their code structures, a single-stripe implementation is the natural choice. Special-purpose RAID-6 codes, such as RDP and P-Code, can also be implemented in single-stripe mode at all array sizes. For instance, by selecting p=257, RDP can be implemented in single-stripe mode directly, since it then has 256 rows. Note that shortening RDP does not change the number of rows in its code structure, so all the array sizes below 257 (and above the next smallest such prime) can be implemented on the basis of p=257 through code shortening. P-Code can be implemented in a similar way via the first shortening scheme. Moreover, as a vertical code, P-Code offers another way to implement single-stripe mode: vertical shortening. For instance, when p=11, the number of rows in P-Code's structure is 5, which is not a power of 2 and is hence unsuitable for a single-stripe implementation. However, we can remove the last row by assuming that it contains only zeros, and the remaining structure has 4 rows, satisfying the condition. The advantage of a single-stripe implementation is that, since an individual stripe cache is small (i.e., a single memory page per strip), more stripe caches may be available when the total amount of memory cache is limited. This implementation mode also allows us to reuse the existing software and techniques for reading and writing a single stripe, thus simplifying the implementation [8]. However, nearly all array sizes, except for a few isolated ones, cannot be implemented in single-stripe mode directly, but must be obtained through code shortening. For instance, if we want to implement a 20-disk RDP array, we have no choice but to shorten from a 257-disk RDP array, since 257 is the smallest prime above 20 of the form 2^n + 1.
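The search for a suitable p can be sketched as follows. This is our own helper (the names `is_prime` and `single_stripe_rdp_prime` are assumptions); the constraint it encodes is that RDP's p-1 rows must be a power of two so that each strip maps onto whole memory pages.

```python
def is_prime(n):
    """Trial-division primality test; adequate for the small p used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def single_stripe_rdp_prime(n_disks):
    """Smallest prime p >= n_disks with p - 1 a power of two, i.e., the
    smallest standard RDP one can shorten from in single-stripe mode."""
    k = 1
    while True:
        p = 2 ** k + 1
        if p >= n_disks and is_prime(p):
            return p
        k += 1
```

For a 20-disk array this yields 257, matching the discussion above: 33, 65, and 129 are composite, so 257 = 2^8 + 1 is the first usable prime above 20.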
As shown in Section 4, shortening a large number of columns may severely harm the performance of the disk array. In contrast, the multiple-stripe mode, while a little more complicated to implement, can be applied to a special-purpose RAID-6 code at all array sizes straightforwardly. Moreover, it allows the vertical codes to lay out the data on the disks flexibly. We discuss this issue in the next subsection.

B. Data Layout

A distinctive difference between a horizontal RAID-6 array and a vertical RAID-6 array is the data layout on each component disk. In a horizontal RAID-6 array, the data and parity blocks are placed sequentially on the data and parity disks, respectively, while in a vertical RAID-6 array, the data and parity blocks are placed on each disk in an interleaved manner. Thus, under a sequential-access-dominated workload, a vertical RAID-6 array may perform worse than a horizontal one due to the overhead of frequent head seeks. For a vertical RAID-6 array implemented in single-stripe mode, a read from the disk into the memory cache always contains both data and parity. This may degrade read performance, since the parity data is useless when serving a read request. In a multiple-stripe implementation, on the other hand, the data blocks and the parity blocks can be accessed independently. This feature allows us to flexibly change the data layout on the disks to adapt to the access patterns of the workload. The capacity of a disk array is divided into many chunks, where a chunk is composed of several contiguous parity groups across the component disks. We can further divide a chunk into two regions, the data region and the parity region. The data blocks of all the parity groups in the chunk are placed sequentially in the data region, while the parity blocks are placed in the parity region.
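The difference between the two layouts can be sketched by listing the block sequence that one disk sees within a chunk. The group shape here (G parity groups, each contributing D data blocks and P parity blocks to this disk) is an arbitrary illustration of ours, not the paper's exact geometry:

```python
# Illustrative sketch (assumed geometry): the on-disk block sequence of
# ONE disk of a vertical array, for a chunk of G parity groups, each
# contributing D data blocks and P parity blocks to this disk.

def interleaved_layout(G, D, P):
    """Data and parity blocks of each parity group placed together."""
    seq = []
    for g in range(G):
        seq += [f"d{g}"] * D + [f"p{g}"] * P
    return seq

def region_layout(G, D, P):
    """All data blocks of the chunk first (data region), then all
    parity blocks (parity region): sequential reads touch only data."""
    return [f"d{g}" for g in range(G) for _ in range(D)] + \
           [f"p{g}" for g in range(G) for _ in range(P)]

# e.g. G = 3 groups, 2 data + 1 parity block per group on this disk:
# interleaved: d0 d0 p0 d1 d1 p1 d2 d2 p2
# region:      d0 d0 d1 d1 d2 d2 p0 p1 p2
```

Under the region layout, a long sequential read walks through contiguous data blocks and skips the parity region entirely, which is the benefit examined experimentally in Section 6.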
By tuning the chunk size, we can change the data layout to adapt the array to sequential or random disk accesses. We will examine the impact of the data layout schemes in detail in Section 6.

C. Write Strategy

There are generally two strategies to serve a write request, namely Read-Modify-Write (RMW) and Reconstruction Write (RCW) [20]. Suppose a write request arrives for a data block in a parity group. Under the RMW policy, first, the old content of that data block and the corresponding parity blocks are read into the buffer; second, the corresponding parity blocks are recomputed in the buffer; and last, the new data and parity blocks are written to the disks. Under the RCW policy, on the other hand, all the other data blocks in the same parity group are read into the buffer, then all the parity blocks of the parity group are reconstructed in the buffer, and finally the new data block and all the new parity blocks are written to the disks. RMW works well in a small-random-access environment, where only a minor portion of the data blocks in a parity group is updated at a time. Its performance is directly related to the update complexity of the underlying RAID-6 code. RCW, on the other hand, is suitable for workloads dominated by long sequential accesses. Also, if the update complexity of the RAID-6 code is high, RCW would be the better choice. The computational complexity of the RAID-6 code affects the performance of RCW, since a parity group reconstructs all of its parity blocks whenever a write request arrives for it. In a practical implementation, we can combine the two strategies and dynamically and adaptively select one of them at run time to minimize the write penalty [21].

VI. EXPERIMENTAL EVALUATION

We have implemented the two representative RAID-6 codes, RDP and P-Code, in our practical RAID-6 systems. In this section, we conduct extensive experiments on them to compare their practical performance under different design strategies and access patterns.
A. Experimental Setup

We have implemented RDP and P-Code by embedding them into the Linux Software RAID (MD) as loadable modules. We carry out the experiments on our storage platform of server-class hardware with an Intel Xeon 3.0GHz processor and 1GB DDR memory. We use one HighPoint RocketRAID 2220 SATA card to house 8 Seagate SATA disks. The rotational speed of these disks is 7200 RPM, with a peak transfer rate of 78MB/s. An additional IDE disk is used to hold the operating system (Fedora Core 4 Linux) and other software (MD and mdadm).

B. Construction and Reconstruction Performance Comparison

We configure the RDP and P-Code arrays each with 8 disks, 2 of which are used as spare disks. As a reference, the open-source Reed-Solomon code included in the Linux Software RAID is also evaluated with the same configuration. In this section we examine their construction and reconstruction performance.

Figure 6. Construction speed of RS, RDP, and P-Code.

The disk array starts the construction process when it is created; this process is also known as RAID synchronization. The disk array starts the reconstruction process when disk failures occur inside the array. In our experiment, we use the set-faulty functionality of mdadm to disable two disks in each of the three disk arrays, and start their reconstruction threads. Since the construction and reconstruction processes exhibit the same pattern, we only present the measured construction speed of the three codes in Figure 6. From Figure 6 we can see that RDP and P-Code have comparable construction speeds, and both outperform RS. The reason lies in the fact that RDP and P-Code use only XOR operations and have the same computational complexity, while RS uses much more complicated finite-field operations. However, the gap is not significant, since our server-class CPU is powerful, making the parity computation less of an obvious bottleneck of the entire system.
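The computational contrast between the XOR-only codes and RS can be made concrete with a toy encoder. This is our own sketch, not the MD module's code; the Q coefficients follow the common RAID-6 convention of powers of g = 2 in GF(2^8):

```python
# Toy illustration of why XOR-only codes encode faster than Reed-Solomon:
# XOR parity is one pass of byte-wise XOR, while an RS-style parity
# multiplies each data byte in GF(2^8) first.

def gf256_mul(a, b, poly=0x11d):
    """Shift-and-reduce multiply in GF(2^8) modulo x^8+x^4+x^3+x^2+1,
    the polynomial commonly used by RAID-6 Reed-Solomon codes."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= poly
    return r

def xor_parity(strips):
    """P parity of an XOR-based code: byte-wise XOR across all strips."""
    out = bytearray(len(strips[0]))
    for strip in strips:
        for i, byte in enumerate(strip):
            out[i] ^= byte
    return bytes(out)

def rs_parity(strips):
    """RS-style Q parity: GF(2^8) multiply-accumulate, coefficient
    g^j = 2^j for strip j (the usual RAID-6 Q convention)."""
    out = bytearray(len(strips[0]))
    for j, strip in enumerate(strips):
        coeff = 1
        for _ in range(j):
            coeff = gf256_mul(coeff, 2)
        for i, byte in enumerate(strip):
            out[i] ^= gf256_mul(coeff, byte)
    return bytes(out)
```

XOR parity costs one XOR per data byte, while the RS parity adds a GF(2^8) multiplication per byte; that extra per-byte work is what makes RS encoding slower in Figure 6, though a fast CPU keeps the gap modest.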
C. Impact of Data Layout on P-Code

We have implemented P-Code in the multiple-stripe mode, which allows us to flexibly change the data layout on the disks, as discussed in Section 5.2. In this section, we examine the performance of P-Code under two different data layout designs, where one separates the data blocks and parity blocks into two different regions in a data chunk (denoted "Read-con") while the other does not (denoted "Read-int"). User IO requests with different access patterns are generated by IOmeter [19]. Figure 7 shows the read throughput of P-Code under different access patterns, from small random access (i.e., 0%seq) to large sequential access (i.e., 100%seq). When the workload is dominated by small random accesses, the two schemes perform almost equally poorly. As the workload becomes more sequential, both schemes perform better, but the scheme that concentrates the data blocks inside a chunk (Read-con in Figure 7) gradually outperforms the other, which verifies the superiority of the judicious data layout discussed in Section 5.2.

Figure 7. Read performance comparison for the two different data layout schemes.
Figure 8. Write performance comparison for the two different write strategies.
Figure 9. Read performance comparison of RDP and P-Code.
Figure 10. Write performance comparison of RDP and P-Code.

D. Impact of Write Strategy on P-Code

We have discussed in Section 5.3 that the two different write strategies, namely RCW and RMW, may have different impacts on the write performance of the P-Code array. We have separately implemented an RCW-only version and a version that combines RCW and RMW. Figure 8 shows the write performance of P-Code under the two write strategies. The hybrid strategy always chooses, between the two, the strategy with the smaller write penalty. Thus, as we can see from the figure, the hybrid strategy outperforms RCW under random workloads and performs comparably with RCW under sequential workloads.

E. IO Performance Comparison of RDP and P-Code

In this section, we compare the read and write throughput of RDP and P-Code under different access patterns. Each is configured in the multiple-stripe mode with the hybrid RCW+RMW write strategy. Figure 9 shows the read performance comparison of RDP and P-Code. Under sequential workloads, RDP performs better than P-Code, and the gap narrows and then reverses as the workload becomes less sequential and more random. This is because P-Code always has both data and parity blocks inside a chunk on each disk, which may slow down the transfer rate of large blocks. On the other hand, as Figure 10 shows, P-Code has generally better write performance than RDP, especially under workloads of small random accesses. The reason is straightforward: both codes use RCW under large sequential accesses and RMW under small random accesses, and since P-Code has lower update complexity than RDP, it requires fewer IO operations to update the parity blocks under the RMW strategy.
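The interaction between update complexity and write strategy can be captured with a simplified IO-count model. The model is our own illustration and ignores seek costs and caching; a parity group has k data blocks and m parity blocks, and a request rewrites w of the data blocks:

```python
# Simplified IO-count model (an assumption for illustration; real arrays
# also weigh seek patterns and cache state).

def rmw_ios(k, m, w):
    """Read-Modify-Write: read the old data and parity blocks, then
    write the new data and parity blocks."""
    return 2 * (w + m)

def rcw_ios(k, m, w):
    """Reconstruction Write: read the untouched data blocks, then write
    the new data and all recomputed parity blocks."""
    return (k - w) + (w + m)

def choose_strategy(k, m, w):
    """Hybrid policy: pick whichever strategy incurs fewer disk IOs."""
    return "RMW" if rmw_ios(k, m, w) < rcw_ios(k, m, w) else "RCW"

# A small random write (w = 1) on a 6+2 parity group favors RMW, while
# a near-full-stripe write (w = 5) favors RCW.
```

A code with lower update complexity touches fewer parity blocks per data update (a smaller effective m under RMW), which is consistent with P-Code needing fewer IO operations than RDP for small random writes.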
VII. CONCLUSION

This paper aims to give a comprehensive comparison between horizontal and vertical RAID-6 codes from both the theoretical and the practical perspectives. We proposed two efficient code shortening algorithms for vertical codes, both of which are capable of extending a vertical code to an arbitrary length. In the context of our code shortening algorithms for vertical codes, we compared the theoretical performance of the representative horizontal code RDP and vertical code P-Code at consecutive lengths, and demonstrated that P-Code can provide comparable, and sometimes even better, performance than RDP. We then discussed the design and implementation issues of RDP and P-Code in the context of practical implementations. We also implemented them on our storage platform and measured their performance under different design parameters in a real environment. Experimental results showed that the practical performance behavior is, in general, consistent with the theoretical performance analysis.

As a direction for future work, we plan to apply the erasure codes to solid state disks (SSD) and storage class memory (SCM). Since SSD and SCM have physical characteristics different from those of traditional disks, they may require different strategies to boost performance. On the other hand, fully exploiting the computational ability of modern multicore processors or GPUs would also be valuable future work for high-performance erasure-coded storage systems.

ACKNOWLEDGMENT

This work is supported by the National Basic Research 973 Program of China under Grant No. 2011CB302301; 863 Projects 2009AA01A401 and 2009AA01A402; NSFC No. , , ; the Changjiang innovative group of Education of China No. IRT0725; and US NSF Grants IIS , CCF , CNS .

REFERENCES

[1] Patterson D, Gibson G, Katz R. A Case for Redundant Arrays of Inexpensive Disks (RAID). In: Proceedings of the International Conference on Management of Data (SIGMOD'88), Chicago, IL, 1988.
[2] Schroeder B, Gibson G. Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You? In: Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST'07), San Jose, CA, 2007.
[3] Pinheiro E, Weber W D, Barroso L A. Failure Trends in a Large Disk Drive Population. In: Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST'07), San Jose, CA, 2007.
[4] Plank J S. A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems. Software Practice and Experience, 1997, 27(9).
[5] Blaum M, Brady J, Bruck J. EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures. IEEE Transactions on Computers, 1995, 44(2).
[6] Plank J S, Luo J, Schuman C D, et al. A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries for Storage. In: Proceedings of the 7th USENIX Conference on File and Storage Technologies (FAST'09), San Francisco, CA, 2009.
[7] Plank J S, Simmerman S, Schuman C D. Jerasure: A Library in C/C++ Facilitating Erasure Coding for Storage Applications. Technical Report, Department of Electrical Engineering and Computer Science, University of Tennessee.
[8] Corbett P, English B, Goel A. Row-Diagonal Parity for Double Disk Failure Correction. In: Proceedings of the 3rd USENIX Conference on File and Storage Technologies (FAST'04), San Francisco, CA, 2004.
[9] Jin C, Jiang H, Feng D, et al. P-Code: A New RAID-6 Code with Optimal Properties. In: Proceedings of the 23rd ACM International Conference on Supercomputing (ICS'09), New York, NY, 2009.
[10] Reed I S, Solomon G. Polynomial Codes over Certain Finite Fields. Journal of the Society for Industrial and Applied Mathematics, 1960, 8(2).
[11] Plank J S, Xu L. Optimizing Cauchy Reed-Solomon Codes for Fault-Tolerant Network Storage Applications. In: Proceedings of the 5th IEEE International Symposium on Network Computing and Applications (NCA'06), Cambridge, MA, 2006.
[12] Plank J S. The RAID-6 Liberation Codes. In: Proceedings of the 6th USENIX Conference on File and Storage Technologies (FAST'08), San Jose, CA, 2008.
[13] Xu L, Bruck J. X-Code: MDS Array Codes with Optimal Encoding. IEEE Transactions on Information Theory, 1999, 45(1).
[14] Xu L, Bohossian V, Bruck J, et al. Low-Density MDS Codes and Factors of Complete Graphs. IEEE Transactions on Information Theory, 1999, 45(6).
[15] Wagner D. On the Perfect One-Factorization Conjecture. Discrete Mathematics, 1992, 104(2).
[16] Jin C, Feng D, Liu J. Extending and Analysis of X-Code. Journal of Shanghai University (English Edition), 2011, 15(3).
[17] Bohossian V, Bruck J. Shortening Array Codes and the Perfect 1-Factorization Conjecture. In: Proceedings of the IEEE International Symposium on Information Theory (ISIT'06), Seattle, WA, 2006.
[18] Blaum M, Roth R M. On Lowest Density MDS Codes. IEEE Transactions on Information Theory, 1999, 45(1).
[19] IOmeter.
[20] Jin C, Feng D, Jiang H, et al. TRIP: Temporal Redundancy Integrated Performance Booster for Parity-Based RAID Storage Systems. In: Proceedings of the 16th International Conference on Parallel and Distributed Systems (ICPADS'10), Shanghai, China, 2010.
[21] Jin C, Feng D, Jiang H, et al. RAID6L: A Log-Assisted RAID6 Storage Architecture with Improved Write Performance. In: Proceedings of the 27th IEEE Symposium on Massive Storage Systems and Technologies (MSST'11), Denver, CO, 2011.


A RANDOMLY EXPANDABLE METHOD FOR DATA LAYOUT OF RAID STORAGE SYSTEMS. Received October 2017; revised February 2018 International Journal of Innovative Computing, Information and Control ICIC International c 2018 ISSN 1349-4198 Volume 14, Number 3, June 2018 pp. 1079 1094 A RANDOMLY EXPANDABLE METHOD FOR DATA LAYOUT

More information

I, J A[I][J] / /4 8000/ I, J A(J, I) Chapter 5 Solutions S-3.

I, J A[I][J] / /4 8000/ I, J A(J, I) Chapter 5 Solutions S-3. 5 Solutions Chapter 5 Solutions S-3 5.1 5.1.1 4 5.1.2 I, J 5.1.3 A[I][J] 5.1.4 3596 8 800/4 2 8 8/4 8000/4 5.1.5 I, J 5.1.6 A(J, I) 5.2 5.2.1 Word Address Binary Address Tag Index Hit/Miss 5.2.2 3 0000

More information

6. Results. This section describes the performance that was achieved using the RAMA file system.

6. Results. This section describes the performance that was achieved using the RAMA file system. 6. Results This section describes the performance that was achieved using the RAMA file system. The resulting numbers represent actual file data bytes transferred to/from server disks per second, excluding

More information

Application of the Computer Capacity to the Analysis of Processors Evolution. BORIS RYABKO 1 and ANTON RAKITSKIY 2 April 17, 2018

Application of the Computer Capacity to the Analysis of Processors Evolution. BORIS RYABKO 1 and ANTON RAKITSKIY 2 April 17, 2018 Application of the Computer Capacity to the Analysis of Processors Evolution BORIS RYABKO 1 and ANTON RAKITSKIY 2 April 17, 2018 arxiv:1705.07730v1 [cs.pf] 14 May 2017 Abstract The notion of computer capacity

More information

WHITE PAPER SINGLE & MULTI CORE PERFORMANCE OF AN ERASURE CODING WORKLOAD ON AMD EPYC

WHITE PAPER SINGLE & MULTI CORE PERFORMANCE OF AN ERASURE CODING WORKLOAD ON AMD EPYC WHITE PAPER SINGLE & MULTI CORE PERFORMANCE OF AN ERASURE CODING WORKLOAD ON AMD EPYC INTRODUCTION With the EPYC processor line, AMD is expected to take a strong position in the server market including

More information

International Journal of Innovations in Engineering and Technology (IJIET)

International Journal of Innovations in Engineering and Technology (IJIET) RTL Design and Implementation of Erasure Code for RAID system Chethan.K 1, Dr.Srividya.P 2, Mr.Sivashanmugam Krishnan 3 1 PG Student, Department Of ECE, R. V. College Engineering, Bangalore, India. 2 Associate

More information

Code 5-6: An Efficient MDS Array Coding Scheme to Accelerate Online RAID Level Migration

Code 5-6: An Efficient MDS Array Coding Scheme to Accelerate Online RAID Level Migration 2015 44th International Conference on Parallel Processing Code 5-6: An Efficient MDS Array Coding Scheme to Accelerate Online RAID Level Migration Chentao Wu 1, Xubin He 2, Jie Li 1 and Minyi Guo 1 1 Shanghai

More information

Deduction and Logic Implementation of the Fractal Scan Algorithm

Deduction and Logic Implementation of the Fractal Scan Algorithm Deduction and Logic Implementation of the Fractal Scan Algorithm Zhangjin Chen, Feng Ran, Zheming Jin Microelectronic R&D center, Shanghai University Shanghai, China and Meihua Xu School of Mechatronical

More information

PC-based data acquisition II

PC-based data acquisition II FYS3240 PC-based instrumentation and microcontrollers PC-based data acquisition II Data streaming to a storage device Spring 2015 Lecture 9 Bekkeng, 29.1.2015 Data streaming Data written to or read from

More information

Close-form and Matrix En/Decoding Evaluation on Different Erasure Codes

Close-form and Matrix En/Decoding Evaluation on Different Erasure Codes UNIVERSITY OF MINNESOTA-TWIN CITIES Close-form and Matrix En/Decoding Evaluation on Different Erasure Codes by Zhe Zhang A thesis submitted in partial fulfillment for the degree of Master of Science in

More information

CSE 153 Design of Operating Systems

CSE 153 Design of Operating Systems CSE 153 Design of Operating Systems Winter 2018 Lecture 22: File system optimizations and advanced topics There s more to filesystems J Standard Performance improvement techniques Alternative important

More information

Erasure coding and AONT algorithm selection for Secure Distributed Storage. Alem Abreha Sowmya Shetty

Erasure coding and AONT algorithm selection for Secure Distributed Storage. Alem Abreha Sowmya Shetty Erasure coding and AONT algorithm selection for Secure Distributed Storage Alem Abreha Sowmya Shetty Secure Distributed Storage AONT(All-Or-Nothing Transform) unkeyed transformation φ mapping a sequence

More information

Parallelizing Inline Data Reduction Operations for Primary Storage Systems

Parallelizing Inline Data Reduction Operations for Primary Storage Systems Parallelizing Inline Data Reduction Operations for Primary Storage Systems Jeonghyeon Ma ( ) and Chanik Park Department of Computer Science and Engineering, POSTECH, Pohang, South Korea {doitnow0415,cipark}@postech.ac.kr

More information

Delayed Partial Parity Scheme for Reliable and High-Performance Flash Memory SSD

Delayed Partial Parity Scheme for Reliable and High-Performance Flash Memory SSD Delayed Partial Parity Scheme for Reliable and High-Performance Flash Memory SSD Soojun Im School of ICE Sungkyunkwan University Suwon, Korea Email: lang33@skku.edu Dongkun Shin School of ICE Sungkyunkwan

More information

A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage

A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage James S. Plank Jianqiang Luo Catherine D. Schuman Lihao Xu Zooko Wilcox-O Hearn Appearing in: FAST-9: 7th USENIX

More information

Seek-Efficient I/O Optimization in Single Failure Recovery for XOR-Coded Storage Systems

Seek-Efficient I/O Optimization in Single Failure Recovery for XOR-Coded Storage Systems IEEE th Symposium on Reliable Distributed Systems Seek-Efficient I/O Optimization in Single Failure Recovery for XOR-Coded Storage Systems Zhirong Shen, Jiwu Shu, Yingxun Fu Department of Computer Science

More information

Performance Consistency

Performance Consistency White Paper Performance Consistency SanDIsk Corporation Corporate Headquarters 951 SanDisk Drive, Milpitas, CA 95035, U.S.A. Phone +1.408.801.1000 Fax +1.408.801.8657 www.sandisk.com Performance Consistency

More information

A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage

A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage James S. Plank Jianqiang Luo Catherine D. Schuman Lihao Xu Zooko Wilcox-O Hearn Abstract Over the past five

More information

ECE Enterprise Storage Architecture. Fall 2018

ECE Enterprise Storage Architecture. Fall 2018 ECE590-03 Enterprise Storage Architecture Fall 2018 RAID Tyler Bletsch Duke University Slides include material from Vince Freeh (NCSU) A case for redundant arrays of inexpensive disks Circa late 80s..

More information

SCALING UP OF E-MSR CODES BASED DISTRIBUTED STORAGE SYSTEMS WITH FIXED NUMBER OF REDUNDANCY NODES

SCALING UP OF E-MSR CODES BASED DISTRIBUTED STORAGE SYSTEMS WITH FIXED NUMBER OF REDUNDANCY NODES SCALING UP OF E-MSR CODES BASED DISTRIBUTED STORAGE SYSTEMS WITH FIXED NUMBER OF REDUNDANCY NODES Haotian Zhao, Yinlong Xu and Liping Xiang School of Computer Science and Technology, University of Science

More information

Reducing The De-linearization of Data Placement to Improve Deduplication Performance

Reducing The De-linearization of Data Placement to Improve Deduplication Performance Reducing The De-linearization of Data Placement to Improve Deduplication Performance Yujuan Tan 1, Zhichao Yan 2, Dan Feng 2, E. H.-M. Sha 1,3 1 School of Computer Science & Technology, Chongqing University

More information

Presented by: Nafiseh Mahmoudi Spring 2017

Presented by: Nafiseh Mahmoudi Spring 2017 Presented by: Nafiseh Mahmoudi Spring 2017 Authors: Publication: Type: ACM Transactions on Storage (TOS), 2016 Research Paper 2 High speed data processing demands high storage I/O performance. Flash memory

More information

Computer Science 146. Computer Architecture

Computer Science 146. Computer Architecture Computer Science 46 Computer Architecture Spring 24 Harvard University Instructor: Prof dbrooks@eecsharvardedu Lecture 22: More I/O Computer Science 46 Lecture Outline HW5 and Project Questions? Storage

More information

STORAGE CONFIGURATION GUIDE: CHOOSING THE RIGHT ARCHITECTURE FOR THE APPLICATION AND ENVIRONMENT

STORAGE CONFIGURATION GUIDE: CHOOSING THE RIGHT ARCHITECTURE FOR THE APPLICATION AND ENVIRONMENT WHITEPAPER STORAGE CONFIGURATION GUIDE: CHOOSING THE RIGHT ARCHITECTURE FOR THE APPLICATION AND ENVIRONMENT This document is designed to aid in the configuration and deployment of Nexsan storage solutions

More information

Encoding-Aware Data Placement for Efficient Degraded Reads in XOR-Coded Storage Systems

Encoding-Aware Data Placement for Efficient Degraded Reads in XOR-Coded Storage Systems Encoding-Aware Data Placement for Efficient Degraded Reads in XOR-Coded Storage Systems Zhirong Shen, Patrick P. C. Lee, Jiwu Shu, Wenzhong Guo College of Mathematics and Computer Science, Fuzhou University

More information

DiskReduce: RAID for Data-Intensive Scalable Computing (CMU-PDL )

DiskReduce: RAID for Data-Intensive Scalable Computing (CMU-PDL ) Research Showcase @ CMU Parallel Data Laboratory Research Centers and Institutes 11-2009 DiskReduce: RAID for Data-Intensive Scalable Computing (CMU-PDL-09-112) Bin Fan Wittawat Tantisiriroj Lin Xiao Garth

More information

Database Architecture 2 & Storage. Instructor: Matei Zaharia cs245.stanford.edu

Database Architecture 2 & Storage. Instructor: Matei Zaharia cs245.stanford.edu Database Architecture 2 & Storage Instructor: Matei Zaharia cs245.stanford.edu Summary from Last Time System R mostly matched the architecture of a modern RDBMS» SQL» Many storage & access methods» Cost-based

More information

RAID: The Innovative Data Storage Manager

RAID: The Innovative Data Storage Manager RAID: The Innovative Data Storage Manager Amit Tyagi IIMT College of Engineering, Greater Noida, UP, India Abstract-RAID is a technology that is used to increase the performance and/or reliability of data

More information

hot plug RAID memory technology for fault tolerance and scalability

hot plug RAID memory technology for fault tolerance and scalability hp industry standard servers april 2003 technology brief TC030412TB hot plug RAID memory technology for fault tolerance and scalability table of contents abstract... 2 introduction... 2 memory reliability...

More information

Appendix D: Storage Systems

Appendix D: Storage Systems Appendix D: Storage Systems Instructor: Josep Torrellas CS433 Copyright Josep Torrellas 1999, 2001, 2002, 2013 1 Storage Systems : Disks Used for long term storage of files temporarily store parts of pgm

More information

VERITAS Storage Foundation 4.0 for Oracle

VERITAS Storage Foundation 4.0 for Oracle J U N E 2 0 0 4 VERITAS Storage Foundation 4.0 for Oracle Performance Brief OLTP Solaris Oracle 9iR2 VERITAS Storage Foundation for Oracle Abstract This document details the high performance characteristics

More information

Stupid File Systems Are Better

Stupid File Systems Are Better Stupid File Systems Are Better Lex Stein Harvard University Abstract File systems were originally designed for hosts with only one disk. Over the past 2 years, a number of increasingly complicated changes

More information

Storage. Hwansoo Han

Storage. Hwansoo Han Storage Hwansoo Han I/O Devices I/O devices can be characterized by Behavior: input, out, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections 2 I/O System Characteristics

More information

Pyramid Codes: Flexible Schemes to Trade Space for Access Efficiency in Reliable Data Storage Systems

Pyramid Codes: Flexible Schemes to Trade Space for Access Efficiency in Reliable Data Storage Systems Pyramid Codes: Flexible Schemes to Trade Space for Access Efficiency in Reliable Data Storage Systems Cheng Huang, Minghua Chen, and Jin Li Microsoft Research, Redmond, WA 98052 Abstract To flexibly explore

More information

High Performance Computing Course Notes High Performance Storage

High Performance Computing Course Notes High Performance Storage High Performance Computing Course Notes 2008-2009 2009 High Performance Storage Storage devices Primary storage: register (1 CPU cycle, a few ns) Cache (10-200 cycles, 0.02-0.5us) Main memory Local main

More information

PANASAS TIERED PARITY ARCHITECTURE

PANASAS TIERED PARITY ARCHITECTURE PANASAS TIERED PARITY ARCHITECTURE Larry Jones, Matt Reid, Marc Unangst, Garth Gibson, and Brent Welch White Paper May 2010 Abstract Disk drives are approximately 250 times denser today than a decade ago.

More information

OS and Hardware Tuning

OS and Hardware Tuning OS and Hardware Tuning Tuning Considerations OS Threads Thread Switching Priorities Virtual Memory DB buffer size File System Disk layout and access Hardware Storage subsystem Configuring the disk array

More information

Efficient Load Balancing and Disk Failure Avoidance Approach Using Restful Web Services

Efficient Load Balancing and Disk Failure Avoidance Approach Using Restful Web Services Efficient Load Balancing and Disk Failure Avoidance Approach Using Restful Web Services Neha Shiraz, Dr. Parikshit N. Mahalle Persuing M.E, Department of Computer Engineering, Smt. Kashibai Navale College

More information

File systems CS 241. May 2, University of Illinois

File systems CS 241. May 2, University of Illinois File systems CS 241 May 2, 2014 University of Illinois 1 Announcements Finals approaching, know your times and conflicts Ours: Friday May 16, 8-11 am Inform us by Wed May 7 if you have to take a conflict

More information

OS and HW Tuning Considerations!

OS and HW Tuning Considerations! Administração e Optimização de Bases de Dados 2012/2013 Hardware and OS Tuning Bruno Martins DEI@Técnico e DMIR@INESC-ID OS and HW Tuning Considerations OS " Threads Thread Switching Priorities " Virtual

More information

Database Systems II. Secondary Storage

Database Systems II. Secondary Storage Database Systems II Secondary Storage CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 29 The Memory Hierarchy Swapping, Main-memory DBMS s Tertiary Storage: Tape, Network Backup 3,200 MB/s (DDR-SDRAM

More information

In the late 1980s, rapid adoption of computers

In the late 1980s, rapid adoption of computers hapter 3 ata Protection: RI In the late 1980s, rapid adoption of computers for business processes stimulated the KY ONPTS Hardware and Software RI growth of new applications and databases, significantly

More information

CS252 S05. Main memory management. Memory hardware. The scale of things. Memory hardware (cont.) Bottleneck

CS252 S05. Main memory management. Memory hardware. The scale of things. Memory hardware (cont.) Bottleneck Main memory management CMSC 411 Computer Systems Architecture Lecture 16 Memory Hierarchy 3 (Main Memory & Memory) Questions: How big should main memory be? How to handle reads and writes? How to find

More information

Lecture 23: Storage Systems. Topics: disk access, bus design, evaluation metrics, RAID (Sections )

Lecture 23: Storage Systems. Topics: disk access, bus design, evaluation metrics, RAID (Sections ) Lecture 23: Storage Systems Topics: disk access, bus design, evaluation metrics, RAID (Sections 7.1-7.9) 1 Role of I/O Activities external to the CPU are typically orders of magnitude slower Example: while

More information

Differential RAID: Rethinking RAID for SSD Reliability

Differential RAID: Rethinking RAID for SSD Reliability Differential RAID: Rethinking RAID for SSD Reliability Mahesh Balakrishnan Asim Kadav 1, Vijayan Prabhakaran, Dahlia Malkhi Microsoft Research Silicon Valley 1 The University of Wisconsin-Madison Solid

More information

On Data Parallelism of Erasure Coding in Distributed Storage Systems

On Data Parallelism of Erasure Coding in Distributed Storage Systems On Data Parallelism of Erasure Coding in Distributed Storage Systems Jun Li, Baochun Li Department of Electrical and Computer Engineering, University of Toronto, Canada {junli, bli}@ece.toronto.edu Abstract

More information

1 of 6 4/8/2011 4:08 PM Electronic Hardware Information, Guides and Tools search newsletter subscribe Home Utilities Downloads Links Info Ads by Google Raid Hard Drives Raid Raid Data Recovery SSD in Raid

More information