
ANALYTICAL EVALUATION OF THE RAID 5 DISK ARRAY

by Anand Kuratti

A Thesis Submitted to the Faculty of the DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING In Partial Fulfillment of the Requirements For the Degree of MASTER OF SCIENCE WITH A MAJOR IN ELECTRICAL ENGINEERING In the Graduate College

THE UNIVERSITY OF ARIZONA

1994

STATEMENT BY AUTHOR

This thesis has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library. Brief quotations from this thesis are allowable without special permission, provided that accurate acknowledgment of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College when in his or her judgment the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.

SIGNED:

APPROVAL BY THESIS DIRECTOR

This thesis has been approved on the date shown below:

William H. Sanders
Associate Professor of Electrical and Computer Engineering

Date

ACKNOWLEDGMENTS

There are many people I would like to acknowledge for helping me throughout the development of this thesis. I would like to thank my thesis committee members: Dr. Pamela Delaney, Dr. Bernard Zeigler, and Dr. William Sanders. I would especially like to thank Bill for his constant support and valuable ideas that helped keep me on track in the face of numerous failed approaches to this problem. I would like to thank everyone in the PMRL lab, past and present: John Diener, Bruce McLeod, Lorenz Lercher, Akber Qureshi, Luai Malhis, Fransiskus Widjanarko, Latha Kant, Bhavan Shah, Doug Obal, and Aad van Moorsel, all of whom listened patiently to my constant ramblings.

To my parents, for their love and support
To my brother, for his long distance enthusiasm

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
ABSTRACT
1. INTRODUCTION
2. RAID 5 ARCHITECTURE
   2.1 Data and Parity Placement
   2.2 I/O Methods
       2.2.1 Normal I/O Methods
       2.2.2 Reconstruction I/O Methods
   2.3 System Workload
   2.4 Model Assumptions
3. PERFORMANCE MODEL
   3.1 Disk Model
       3.1.1 Seek Time
       3.1.2 Disk Access Time
       3.1.3 Disk Service Time
   3.2 Disk Arrival Process
   3.3 Disk Access Probabilities
   3.4 Response Time
4. PERFORMABILITY MODEL
   4.1 Decomposition of Reconstruction Interval
   4.2 Response Time
   4.3 Optimal Reconstruction Rate
5. CONCLUSIONS
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
Appendix G
Appendix H
REFERENCES

LIST OF FIGURES

2.1. Relationship between Data Mapping Entities
2.2. Read Request
2.3. Full Stripe Write Request
2.4. Partial Stripe Write Request
2.5. Read Reconstruction Request
2.6. Full Stripe Write Reconstruction Request
2.7. Partial Stripe Write Reconstruction Request
3.1. Disk Profile
3.2. Disk Service Time Density for Different Values of p_s
3.3. Erlang Approximation for Disk Service Time Density
3.4. Poisson Characteristic of Individual Disk Accesses
Possible Accesses of Disk 2 - Read 2 Data Stripe Units
Analytical Disk Access Probabilities
Simulated Disk Access Probabilities
Mean Total Response Time - Analytical
Mean Total Response Time - Simulation
Mean Total Response Time - Percent Difference
Mean Read Response Time - Analytical
Mean Read Response Time - Simulation
Mean Read Response Time - Percent Difference
Mean Write Response Time - Analytical
Mean Write Response Time - Simulation
Mean Write Response Time - Percent Difference
RAID 5 Data Reconstruction
Mean Batch Queue Length - 50th Reconstruction Request
Mean Batch Queue Length - 100th Reconstruction Request
Mean Batch Queue Length - 200th Reconstruction Request
Mean Batch Queue Length - 400th Reconstruction Request
Mean Batch Queue Length - Steady State
Percent Difference - 50th Reconstruction Request
Percent Difference - 100th Reconstruction Request
Percent Difference - 200th Reconstruction Request
Percent Difference - 400th Reconstruction Request
Mean Total Response Time - Analytical, percentage = 0%
Mean Total Response Time - Simulation, percentage = 0%
Mean Total Response Time - Percent Difference, percentage = 0%
Mean Total Response Time - Analytical, percentage = 20%
Mean Total Response Time - Simulation, percentage = 20%
Mean Total Response Time - Percent Difference, percentage = 20%
Mean Total Response Time - Analytical, percentage = 40%
Mean Total Response Time - Simulation, percentage = 40%
Mean Total Response Time - Percent Difference, percentage = 40%
Mean Total Response Time - Analytical, percentage = 60%
Mean Total Response Time - Simulation, percentage = 60%
Mean Total Response Time - Percent Difference, percentage = 60%
Mean Total Response Time - Analytical, percentage = 80%
Mean Total Response Time - Simulation, percentage = 80%
Mean Total Response Time - Percent Difference, percentage = 80%
RAID 5 Data Reconstruction With Additional Reconstruction Rate
Determination of Optimal - Multiple Objective Problem
Determination of Optimal - Single Objective Problem
Optimal Additional Reconstruction Rate
G.1. Mean Read Response Time - Analytical, percentage = 0%
G.2. Mean Read Response Time - Simulation, percentage = 0%
G.3. Mean Read Response Time - Percent Difference, percentage = 0%
G.4. Mean Read Response Time - Analytical, percentage = 20%
G.5. Mean Read Response Time - Simulation, percentage = 20%
G.6. Mean Read Response Time - Percent Difference, percentage = 20%
G.7. Mean Read Response Time - Analytical, percentage = 40%
G.8. Mean Read Response Time - Simulation, percentage = 40%
G.9. Mean Read Response Time - Percent Difference, percentage = 40%
G.10. Mean Read Response Time - Analytical, percentage = 60%
G.11. Mean Read Response Time - Simulation, percentage = 60%
G.12. Mean Read Response Time - Percent Difference, percentage = 60%
G.13. Mean Read Response Time - Analytical, percentage = 80%
G.14. Mean Read Response Time - Simulation, percentage = 80%
G.15. Mean Read Response Time - Percent Difference, percentage = 80%
H.1. Mean Write Response Time - Analytical, percentage = 0%
H.2. Mean Write Response Time - Simulation, percentage = 0%
H.3. Mean Write Response Time - Percent Difference, percentage = 0%
H.4. Mean Write Response Time - Analytical, percentage = 20%
H.5. Mean Write Response Time - Simulation, percentage = 20%
H.6. Mean Write Response Time - Percent Difference, percentage = 20%
H.7. Mean Write Response Time - Analytical, percentage = 40%
H.8. Mean Write Response Time - Simulation, percentage = 40%
H.9. Mean Write Response Time - Percent Difference, percentage = 40%
H.10. Mean Write Response Time - Analytical, percentage = 60%
H.11. Mean Write Response Time - Simulation, percentage = 60%
H.12. Mean Write Response Time - Percent Difference, percentage = 60%
H.13. Mean Write Response Time - Analytical, percentage = 80%
H.14. Mean Write Response Time - Simulation, percentage = 80%
H.15. Mean Write Response Time - Percent Difference, percentage = 80%

LIST OF TABLES

2.1. Assumed Disk Parameters
Read Request for 2 Data Stripe Units - Possible Disk Accesses

ABSTRACT

As processor and memory performance continue to dramatically increase, the bottleneck in modern computers has shifted to the I/O subsystem. As a result, strategies to provide better performance than current disk systems have been investigated. One effort is the RAID (Redundant Arrays of Inexpensive Disks) Level 5 disk array. The RAID 5 disk array offers increased parallelism of I/O requests through the disk array architecture and fault tolerance through rotated parity. Although analytical models of disk array performance have been developed, they often rely on simplifying assumptions or bounds which cause results to be accurate for a restricted set of the possible workload parameters. This thesis presents analytical performance and performability models to compute the mean steady state response time for a RAID 5 I/O request under a transaction-processing workload. It is shown that these models are accurate for a wider range of the workload parameters than previous studies. Using an observation of how data is reconstructed when a single disk in a row has failed, the analytical models are extended to investigate an optimal rate for data reconstruction.

CHAPTER 1

INTRODUCTION

Over the past decade, processor speed, memory speed, memory capacity, and disk capacity of computers have dramatically improved [1]. Single chip processor speeds have increased at a rate of 4%-1% per year. Access times for main memory have decreased 4%-1% per year. Main memory capacity has quadrupled every two to three years. In contrast, disk I/O performance has shown only modest gains over the same period of time. Disk seek times have improved at a rate of 7% per year. Transfer times from disk to main memory have remained at least an order of magnitude slower than transfer times from main memory to processor. This imbalanced system growth illustrates that a traditional computer organization consisting of a CPU, memory, and a single large capacity disk for mass storage is inadequate for the next generation of computers. As a result, if the imbalance in I/O performance provided by current disk systems is not remedied, future improvements in processor and

memory design will be wasted. Continued improvement in system performance depends on I/O subsystems with higher data and I/O rates. A way to increase I/O performance is to use an array of disks [2, 3]. By interleaving data across many disks, both throughput (measured in megabytes (MB) per second) and I/O rate (measured in I/O requests per second) are improved. Throughput is increased by having many disks cooperate in transferring a block of information; the I/O rate is increased by having multiple disks service multiple independent disk requests. Although disk arrays can achieve better performance, an important consequence is that the reliability of multiple disks is lower than that of a single disk. For example, if disk failure times are exponentially distributed, 100 disks have a combined failure rate 100 times larger than a single disk [4]. More importantly, if every disk failure caused data loss, a 100 disk array would lose data every few hundred hours. To protect against data loss due to disk failures, redundancy schemes have been incorporated into disk arrays. Redundancy schemes are designed to allow a disk array to continue operation when one or more disks have failed and data on failed disks becomes unavailable. Because disk arrays combined with data redundancy hold the promise of improved performance and availability over single disks, researchers have investigated different ways to design and organize disk array architectures. One effort is Redundant Arrays of Inexpensive Disks (RAID). In [5], Patterson, Gibson, and Katz present five ways to introduce redundancy into an array of disks: RAID Level 1 to RAID Level 5. For each level, data is interleaved across multiple disks, but the type of redundancy ranges from traditional mirroring to rotated parity. Using a simple formula to estimate maximum throughput, the

authors conclude that RAID 5 with rotated parity offers the best performance potential of the organizations considered. Although RAID 5 offers improved performance and availability, techniques for modeling and analyzing I/O performance are important to be able to compare RAID 5 and current disk systems. In particular, analytical models combined with a realistic assessment of workload allow for accurate design and performance prediction. However, like many parallel systems, disk arrays are difficult to model because of queuing and fork-join synchronization. Since data is placed on multiple disks, an I/O request to the disk array breaks up into several disk requests. Each disk request may wait for service, then waits for the other disk requests to complete. Under general conditions, queuing or fork-join synchronization alone is tractable, but the combination is unsolvable. Analysis is highly dependent on the characteristics of the particular system and requires careful use of approximations and simplifying assumptions. Previous work in the analytical modeling of disk arrays falls into three categories:

1. models that ignore queuing

2. models that ignore fork-join synchronization

3. models that consider queuing and fork-join synchronization using approximate techniques.

Models that ignore queuing are useful in computing minimum response time or maximum throughput. Although useful in estimating the limits of system performance, such

bounds are only accurate when the system load is extremely light or heavy. Salem and Garcia-Molina [6] derive the expected minimum response time to study the benefits of data striping in synchronized non-redundant disk arrays and show the effects of several low-level disk optimizations on response times at individual disks. Bitton and Gray [7] calculate expected disk seek times for unsynchronized mirrored disk arrays. Kim and Tantawi [8] derive service time distributions for unsynchronized, bit-interleaved, non-redundant disk arrays. Patterson, Gibson, and Katz [5, 9] compute maximum throughput estimates for several RAID levels. Models that ignore fork-join synchronization are frequently used to model bit-interleaved disk arrays. Kim [10] models synchronized bit-interleaved arrays as an M/G/1 queue and shows that such arrays provide lower service times and better load balancing, but decrease the number of concurrent requests. Chen and Towsley [11] analytically model RAID Levels 1 through 5 using bounds based on the request workload. Overhead for fork-join synchronization is ignored for small write requests, resulting in an optimistic model; large requests are modeled using a single queue for all disks in the array. Since data is placed on multiple disks, an I/O request requires requests to individual disks in the array. The I/O request is complete when all disk requests finish. This behavior is similar to a fork-join queue in which a task forks into several subtasks, each of which is sent to a different server. When a subtask completes service, it enters a join node and waits for the remaining subtasks to finish service. After all subtasks are complete, the task is complete. Because of the similarity of requests in a disk array to tasks in a

fork-join queue, results from analyses of fork-join queues have been used to model queuing and fork-join synchronization of disk requests in disk arrays. Although exact results are available for the task response time of a two server fork-join queue [12, 13], general systems that exhibit both queuing and synchronization are not tractable. As a result, attention has shifted to computation of upper and lower bounds. Menon and Mattson [14] formulate an approximate model for RAID 5 under transaction processing workloads, based on a scaling approximation for the M/M/1 queue developed by Nelson and Tantawi [15]. However, the work does not justify the approximation for more than two disks or show that exponential service is an appropriate model for disk accesses. Baccelli, Makowski, and Schwartz [16] derive bounds for response time in fork-join queues under general arrival and service patterns using stochastic ordering and associated random variables. Most previous studies of disk arrays, including RAID 5, often rely on assumptions or bounds which limit the accuracy of results when compared to system measurements or detailed simulation models. When simplifying assumptions are used, the model is developed with regard to certain operating conditions. For example, models which ignore queuing of disk requests are only accurate when the rate of I/O requests is low and the probability that a disk request waits for service is small. Models that compute bounds are usually accurate for restricted regions of the workload. For example, Chen and Towsley [11] calculate mean response time for read and write requests given different request rates and sizes. The percent difference for their analytical calculations of the I/O request response time for single stripe unit requests is less than 10% when compared to detailed

simulation. However, for multiple stripe unit requests, especially write I/O requests, the difference is greater than 10% and as high as 50%. This thesis presents analytical models to calculate the steady state average, mean read, and mean write response time of RAID 5 I/O requests under a transaction-processing workload. It is shown that these models are accurate for a wider range of system workload than previous studies. By systematically deriving the distribution of the time to access and transfer data during a disk request, the arrival process of requests to individual disks in the array, and the time for all dependent disk requests in an I/O request to complete, a more precise model which considers both queuing and fork-join synchronization is developed. To validate the analytical results, values for a wide range of I/O request sizes and arrival rates are computed and compared to results from a detailed simulation model. Finally, the optimal rate for data reconstruction is determined by formulating data reconstruction as a single objective mathematical programming problem. The organization of this thesis is as follows: Chapter 2 will briefly describe the RAID 5 architecture, including data and parity assignment, I/O methods, and components of system workload. Using this description, assumptions used in developing the analytical models are presented. Chapter 3 will discuss the performance model, as well as derivations for the time to service a disk request, the arrival process of requests to individual disks, and the time for all disk requests of an I/O request to complete. Chapter 4 demonstrates that results from the performance model can be extended to analysis of single disk failures and determination of an optimal rebuild rate. Chapter 5 will give conclusions and directions for future research.

CHAPTER 2

RAID 5 ARCHITECTURE

Redundant Arrays of Inexpensive Disks employ two concepts for improved performance and data availability: striping and data redundancy. Striping data across multiple disks provides higher performance than single disks by increasing parallelism and load balancing of requests. Redundancy improves data availability by allowing RAID to operate in the face of single disk failures without data loss. Although striping and data redundancy are simple concepts, the design of a disk array involves complex tradeoffs between availability, performance, and cost. This chapter describes how RAID 5 addresses these issues through data and parity placement and I/O methods. A more complete reference on RAID architectures can be found in [5, 9]. Using a description of the system operation, the workload and assumptions used to develop the analytical models are presented.

2.1 Data and Parity Placement

A RAID 5 disk array consists of N identical disks on which data is interleaved. The unit of data interleaving, or the amount of data that is placed on one disk before data is placed on the next disk, is a stripe unit. Since disks are organized into rows and columns, the set of stripe units with the same physical location on each disk in a row is a stripe.

The number of disks in a stripe is defined as the stripe width, W_s. The data redundancy scheme used in RAID 5 is parity-based. Each stripe contains a parity stripe unit, which is the exclusive-or (XOR) of all data stripe units within the stripe. When a single disk in a row fails, data can be reconstructed by reading the corresponding data and parity stripe units from the other disks. To illustrate the relationships between a stripe unit, stripe, parity stripe unit, and disk, consider the array of 20 disks of 4 columns and 5 rows in Figure 2.1. Stripe 11 contains data stripe units 55, 56, 57, 59 and parity unit 58 on disks 15, 16, 17, 19, and 18. Since each stripe/row contains 5 stripe units/disks, the stripe width equals 5. In contrast to redundancy schemes with dedicated parity disks, parity is distributed uniformly across all disks in a RAID 5 disk array. Because parity stripe units are rotated, I/O requests which must update parity are more evenly balanced across all disks in the array. Another advantage of rotated parity is that data is also distributed more evenly, which allows more disks to participate in I/O operations and increases throughput and I/O rate. Although there are numerous ways to encode parity relative to data, a standard policy is right asymmetric. For the right asymmetric parity placement shown in Figure 2.1, parity stripe units are laid out in a diagonal pattern starting from the top rightmost disk. Given how data and parity units are placed on a RAID 5 disk array, several functions can be defined to relate the relative locations of stripe units, stripes, and disks:

1. stripe unit number → disk: SUN % ND

Figure 2.1: Relationship between Data Mapping Entities (the figure labels a column, a disk, a stripe, a row, and a parity stripe unit, and gives the number of disks in the array, the stripe width, and the number of disk groups (rows))

2. stripe unit number → stripe number: ⌈SUN/W_s − 1⌉

3. stripe number → parity stripe unit: ⌊SN/W_s⌋(W_s)^2 + NR + (SN % W_s)(NR)

where SUN is the stripe unit number, SN is the stripe number, W_s is the stripe width, NR is the number of rows, and ND is the number of disks in the array. For instance, given stripe unit 31 in Figure 2.1, the corresponding disk, stripe number, and parity unit can be computed using the above functions (a short code sketch of these mappings is given below):

1. disk = 31 % 20 = 11

2. stripe number = ⌈31/5 − 1⌉ = 6

3. parity unit = ⌊6/5⌋(25) + 4 + (6 % 5)(4) = 33

2.2 I/O Methods

Using the above description of how data and parity are placed on a RAID 5 disk array, methods to read from and write to the array can be defined. Depending on whether disks have failed, a RAID 5 disk array operates in one of two modes. When all disks are functioning, the array is in normal mode. In reconstruction mode, one or more disks have failed and the array must reconstruct data for I/O requests which access the failed disk(s). I/O methods to read and write data for each mode are described in the following sections.
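As a concrete companion to the mapping functions and the stripe unit 31 example above, here is a short Python sketch. The helper names are hypothetical, and the parameter values (20 disks, stripe width 5, 4 rows) are taken from the Figure 2.1 example.

    # Sketch of the Section 2.1 data-mapping functions (hypothetical names).
    # ND, W_S, and NR follow the Figure 2.1 example: 20 disks, stripe width 5, 4 rows.
    import math

    ND = 20   # number of disks in the array
    W_S = 5   # stripe width (stripe units per stripe)
    NR = 4    # number of disk groups (rows)

    def disk_of(sun: int) -> int:
        """Stripe unit number -> disk: SUN % ND."""
        return sun % ND

    def stripe_of(sun: int) -> int:
        """Stripe unit number -> stripe number: ceil(SUN / W_s - 1)."""
        return math.ceil(sun / W_S - 1)

    def parity_unit_of(sn: int) -> int:
        """Stripe number -> parity stripe unit:
        floor(SN / W_s) * W_s**2 + NR + (SN % W_s) * NR."""
        return (sn // W_S) * W_S ** 2 + NR + (sn % W_S) * NR

    sun = 31
    sn = stripe_of(sun)
    print(disk_of(sun), sn, parity_unit_of(sn))   # 11 6 33

Running the sketch reproduces the worked example: stripe unit 31 lives on disk 11, in stripe 6, whose parity stripe unit is 33.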

2.2.1 Normal I/O Methods

Figure 2.2: Read Request (step 1: read data; legend: data stripe unit, requested data stripe unit, parity stripe unit)

Because data is placed on multiple disks, an I/O request to the disk array to read or write data results in requests for data stripe units at individual disks. If M bytes are requested and the stripe unit size is b bytes, n = ⌈M/b⌉ data stripe units are requested. If the request is a read, as shown in Figure 2.2, the request is complete when all n disk requests complete. For a write request, data must not only be written, but the corresponding parity stripe units must also be updated. Depending on how much of a stripe is written, three cases arise.

1. The request starts at the first data disk in a stripe and the request size is n = W_s. In this case, all data stripe units in a stripe are written, or a full stripe write as shown in Figure 2.3. Since all data stripe units are overwritten, the new parity is

generated entirely from new data. The request is complete when the n data and parity stripe units are written.

Figure 2.3: Full Stripe Write Request (step 1: write new data and parity; legend: data stripe unit, requested data stripe unit, parity stripe unit)

2. The request accesses a single partial stripe (n < W_s) and all n data stripe units requested belong to the same stripe, as illustrated in Figure 2.4. In this case, the parity stripe unit must be updated. This is accomplished by first reading the n old data and parity stripe units. Second, the new parity stripe unit is computed by XORing the old and new data stripe units. The request completes after the n data stripe units and the new parity stripe unit have been written.

3. If the request accesses two or more partial stripes, i.e. the n data stripe units are allocated across stripe boundaries, two or more parity stripe units must be updated. Since stripe units in one stripe do not depend upon stripe units in another stripe, the

operation is divided into multiple partial stripe operations. The request completes when all partial stripe operations finish.

Figure 2.4: Partial Stripe Write Request (steps: 1. read old data and parity; 2. compute new parity (XOR); 3. write new data and parity; legend: data stripe unit, requested data stripe unit, parity stripe unit)

2.2.2 Reconstruction I/O Methods

Because a parity stripe unit is associated with each stripe, data can be reconstructed when a single disk in a row fails. When a disk has failed and a requested data stripe unit cannot be accessed, it can be rebuilt through an XOR of the remaining data and parity stripe units in the stripe. Although a data stripe unit from a failed disk can be reconstructed each time it is needed, an important question is where new and reconstructed data should be stored. To address this issue, most RAID systems contain a pool of spare

disks.¹ When a stripe unit is reconstructed or overwritten by new data, it is also written to a spare disk. By writing data to a spare disk, unnecessary reads to other disks in the stripe are prevented when the same data stripe unit is requested again. As more new and reconstructed data is written to a spare disk, the spare disk eventually replaces the failed disk. When all data from the failed disk(s) have been written to the corresponding spare disk(s), operation returns to normal mode. Yet, because requested data may not always be available at the spare disk, the I/O methods described for normal operation are modified during data reconstruction. If a failed disk is not accessed during a read request, the requested data stripe units are read from each of the corresponding disks as described for a normal read request. However, if a failed disk is accessed as shown in Figure 2.5 and the needed stripe unit is available from the spare disk, the stripe unit is read from the spare disk. If the stripe unit has not been reconstructed, the other data stripe units and parity units in the stripe are read. Then the requested stripe unit is reconstructed through an XOR of the remaining data and parity stripe units. Finally, the reconstructed stripe unit is written to the spare disk to complete the I/O request. As described for normal operation, a write request can access a full stripe, a single partial stripe, or multiple partial stripes. In Figure 2.6, when data is written to a full stripe, all disks in the stripe, including the spare disk, are written with the new data and parity.

¹An important factor in the design of disk arrays is whether spare disks are hot or cold. Hot disks are on line, which allows for immediate switching, but are subject to the same failure conditions as data disks. Cold disks are powered when needed, but require a start up period, which may impact the response time of requests. In this work, a fully functioning spare disk is assumed to be available when a data disk has failed.
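As a small, self-contained illustration of the XOR relationship that reconstruction relies on (a sketch with hypothetical names, not the thesis's simulator), the following fragment rebuilds a missing 4 KB stripe unit from the surviving data and parity stripe units of its stripe.

    # Minimal sketch of stripe-unit reconstruction by XOR (hypothetical names).
    # parity = d0 ^ d1 ^ ... ^ d(k-1), so any single missing unit equals the XOR
    # of the parity unit and the remaining data units.
    from functools import reduce
    import os

    STRIPE_UNIT = 4 * 1024  # 4 KB stripe unit size, as assumed in Section 2.4

    def xor_blocks(blocks):
        """XOR a list of equal-length byte blocks."""
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    def reconstruct(surviving_data_units, parity_unit):
        """Rebuild the stripe unit of the failed disk: XOR of survivors and parity."""
        return xor_blocks(list(surviving_data_units) + [parity_unit])

    # Example: a stripe with 4 data units and 1 parity unit (stripe width 5)
    data = [os.urandom(STRIPE_UNIT) for _ in range(4)]
    parity = xor_blocks(data)                    # parity = d0 ^ d1 ^ d2 ^ d3
    lost = data[2]                               # pretend the disk holding unit 2 failed
    survivors = data[:2] + data[3:]
    assert reconstruct(survivors, parity) == lost

The same identity underlies both the read reconstruction of Figure 2.5 and the parity update of the partial stripe write in Figure 2.4.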

When a single partial stripe is written as in Figure 2.7, old data and parity must be read to compute the new parity. This is the same as a read request from a failed disk described above. After the new parity has been computed, the data and parity disks are written. New data or parity which would have been written to the failed disk is instead written to the spare disk. A write to multiple partial stripes is considered as a series of multiple single partial stripe writes.

2.3 System Workload

In order to assess how well a system performs, the conditions under which a system operates must also be considered. Although the description of the architecture provides details of how a RAID 5 disk array operates, the RAID 5 performance depends on the inputs which drive the system. These inputs are defined as the workload. The workload for RAID 5, and disk arrays in general, is composed of the frequency and pattern of I/O request arrivals and the size of an I/O request. Since the arrival of I/O requests to the disk array depends on the characteristics of the application(s) which read from and write data to the array, it is impossible to give a general model for the arrival of I/O requests. However, for many applications, the arrival of I/O requests can be approximated by a Poisson process. In this thesis, it is assumed that the arrival of I/O requests is Poisson with rate λ.

Figure 2.5: Read Reconstruction Request (steps: 1. read data; 2. check if the stripe unit is at the spare disk; 3a. if not, reconstruct the stripe unit by reading the remaining stripe units; 3b. otherwise, the stripe unit is available; 3c. reconstruct the stripe unit (XOR); 3d. write the stripe unit to the spare disk; legend: data stripe unit from failed disk, requested data stripe unit, parity stripe unit)

Figure 2.6: Full Stripe Write Reconstruction Request (steps: 1. write new data and parity; 2. write new data to the spare disk; legend: data stripe unit requested from failed disk, requested data stripe unit, data stripe unit, parity stripe unit)

Figure 2.7: Partial Stripe Write Reconstruction Request (steps: 1. read old data and parity; 2. check if the stripe unit is at the spare disk; 3a. if not, reconstruct the stripe unit by reading the remaining stripe units; 3b. otherwise, the stripe unit is available; 3c. reconstruct the stripe unit (XOR); 3d. write the stripe unit to the spare disk; 4. old data and parity read; 5. compute new parity (XOR); 6a. write new data and parity; 6b. write data to the spare disk; legend: data stripe unit from failed disk, requested data stripe unit, parity stripe unit)

The second component of the system workload is the size of an I/O request. For many applications, request sizes can be classified as either supercomputer-based, where requests are large and infrequent, or transaction-based, where small amounts of data are frequently accessed [2, 7]. For this work, it is assumed that requests are transaction-based, where the number of data stripe units requested is less than or equal to the number of data stripe units in a stripe, W_s − 1. A distribution which reflects this type of workload is a quasi-geometric function [11]

    P{N = n} = α                                                          n = 1
    P{N = n} = (1 − α) β (1 − β)^(n−1) / [(1 − β) − (1 − β)^(W_s − 1)]    n = 2, ..., W_s − 1

where N is the request size and 1 ≤ n ≤ W_s − 1. Since the maximum number of data stripe units in an I/O request is W_s − 1 and a request for data can overlap stripe boundaries, at most two stripes can be accessed during an I/O request. Given this description of RAID 5 operation and workload, a set of assumptions used to construct models of the I/O request response time is presented.

2.4 Model Assumptions

Using a description of a RAID 5 disk array, including data and parity mapping, I/O methods, and system workload, the goal of this thesis is to develop models to accurately compute the mean response time of RAID 5 I/O requests. In doing so, the following assumptions are made:

1. Current RAID systems are typically constructed with 10 to 100 disks. To obtain numerical results without loss of generality, the array is assumed to contain 20 disks

and a stripe width of 5 disks. Each disk has the parameters shown in Table 2.1, which reflect current disk technology.

Table 2.1: Assumed Disk Parameters

    Time for full disk rotation (R_max)    16.7 ms
    Number of disk cylinders (C)           12000
    Total usable disk storage              5000 MB
    Arm acceleration time (a)              3 ms
    Seek factor (b)                        0.5
    Transfer rate                          3 MB/s

2. The stripe unit size equals 4 KB.

3. For many transaction-processing workloads, such as scientific databases, a majority of requests are queries to read data. The ratio of reads to writes for such systems is usually 2 or 3 to 1. In this thesis, the probabilities of read and write requests are assumed to be 0.7 and 0.3.

4. Each disk can service only one request at a time. Other requests wait and are serviced in first come-first served (FCFS) order.

5. The arrival of I/O requests to the system is assumed to be a Poisson process with rate λ.

6. It is assumed that I/O requests access data throughout the disk array in a uniform pattern. Since an I/O request requires multiple stripe units, this means that the starting stripe unit is random and that each disk in the array is equally likely to contain the starting stripe unit of a request.

7. Parity placement is right asymmetric.

8. Since the focus of this work is the performance of the disk array, it is assumed that the disk subsystem is disk limited, i.e. the memory and data paths are fast enough to have little relative effect on the I/O request response time.
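The workload and assumptions of Sections 2.3 and 2.4 can be summarized in a short generator sketch. Only the general quasi-geometric shape of the request-size distribution is specified above, so the pmf is passed in by the caller; the concrete pmf values, rate, and names below are illustrative assumptions.

    # Sketch of the assumed transaction-processing workload (Sections 2.3-2.4):
    # Poisson arrivals, a 0.7/0.3 read/write mix, sizes of 1 .. W_s - 1 stripe
    # units drawn from a caller-supplied pmf, and a uniform starting stripe unit.
    import random

    W_S = 5          # stripe width
    N_DISKS = 20     # disks in the array (assumption 1)
    P_READ = 0.7     # read probability; writes occur with probability 0.3 (assumption 3)

    def generate_requests(rate, horizon, size_pmf, seed=0):
        """Yield (arrival_time, kind, start_stripe_unit, size) tuples.

        rate     -- Poisson arrival rate lambda, in requests per second (assumption 5)
        horizon  -- length of the generated interval, in seconds
        size_pmf -- dict {n: P(N = n)} on n = 1 .. W_S - 1 (Section 2.3 workload)
        """
        rng = random.Random(seed)
        sizes, probs = zip(*sorted(size_pmf.items()))
        t = 0.0
        while True:
            t += rng.expovariate(rate)               # exponential inter-arrival times
            if t > horizon:
                return
            kind = "read" if rng.random() < P_READ else "write"
            size = rng.choices(sizes, weights=probs)[0]
            start = rng.randrange(N_DISKS * 1000)    # uniform starting stripe unit (toy range, assumption 6)
            yield (t, kind, start, size)

    # Example use with an arbitrary pmf on sizes 1 .. W_S - 1
    example_pmf = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
    for request in generate_requests(rate=50.0, horizon=0.1, size_pmf=example_pmf):
        print(request)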

CHAPTER 3

PERFORMANCE MODEL

An important metric for disk systems is response time, or the time for an I/O request to finish after data has been requested. Since data is interleaved over several disks, an I/O request to a RAID 5 disk array results in multiple disk requests for stripe units. The time for all disk requests to complete is defined as the response time of the I/O request. This chapter will analyze and model the operations that occur during an I/O request when all disks are functioning. First, using previous work for how a disk locates and transfers data, the distribution of the time for a disk to service a request is derived. Second, the arrival process of requests from I/O requests to individual disks is considered. Third, a method for computing the mean time needed for all disk requests in an I/O request to finish is investigated.

3.1 Disk Model

To develop a model for the response time of RAID 5 I/O requests, individual disk accesses must be analyzed. Although disk behavior involves many complex electrical and mechanical interactions, three components dominate the time for a disk access [17]: seek time, rotational latency, and transfer time. Seek time is defined as the time required

for the disk arm to move to the correct cylinder. Rotational latency is the time for the required data sector to spin under the read/write head(s). Transfer time is the time to transfer the data to memory. The probability distribution (PDF) and density functions (pdf) for the time to complete a disk request can be derived based on previous results for seek time, rotational latency, and transfer time.

3.1.1 Seek Time

In considering a model for seek time, Lynch [17] observes that there is a non-negative probability that the disk arm does not move during a disk access, the sequential access probability p_s. When disk requests are scheduled on a first-come, first-served basis, he shows through empirical measurement of several disk systems that when the disk arm does move, it tends to move to any other cylinder with equal probability. Using these observations, he expresses the probability density for seek distance as the probability that the disk arm moves i cylinders. This is written as

    P{D = i} = p_s                                   i = 0
    P{D = i} = (1 − p_s) 2(C − i) / (C(C − 1))       i = 1, 2, ..., C − 1

where D is a discrete random variable representing seek distance and C is the total number of disk cylinders. To determine seek time, which is the amount of time needed for the disk arm to move i cylinders, a relationship between seek distance and seek time must be determined. Using trace data measurements of several disks, Chen and Katz [?] empirically determine a formula for seek time that is a function of seek distance and disk specifications

    s = 0               d = 0
    s = a + b√d         d > 0

where s is the seek time, d is the seek distance in cylinders, a is the arm acceleration time, and b is the seek factor of the disk. Note that if the number of disk cylinders is C, the maximum number of cylinders that the disk arm can move during a request is C − 1 and the maximum seek time (S_max) is a + b√(C − 1). To illustrate the behavior of seek time versus seek distance, the profile of a disk with the parameters in Table 2.1 is shown in Figure 3.1.

Figure 3.1: Disk Profile (seek time in seconds versus seek distance in cylinders)
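As a numerical companion to the seek model, the following sketch samples Lynch's seek distance distribution and applies s = a + b√d. The parameter values follow the Table 2.1 assumptions, and the fragment is only an illustration, not part of the thesis's model code.

    # Sketch of the seek model: Lynch's seek-distance pmf plus s = a + b*sqrt(d).
    # Parameter values are assumptions in the spirit of Table 2.1.
    import math
    import random
    from itertools import accumulate

    C = 12000       # number of disk cylinders (assumed)
    A = 0.003       # arm acceleration time a, in seconds (3 ms)
    B = 0.0005      # seek factor b (0.5 ms per square-root cylinder)
    P_S = 1.0 / C   # sequential access probability used in Section 3.1.3

    # Cumulative weights for P{D = i} proportional to (C - i), i = 1 .. C-1
    CUM_WEIGHTS = list(accumulate(C - i for i in range(1, C)))

    def sample_seek_distance(rng):
        """P{D = 0} = p_s; P{D = i} = (1 - p_s) * 2(C - i) / (C(C - 1))."""
        if rng.random() < P_S:
            return 0
        return rng.choices(range(1, C), cum_weights=CUM_WEIGHTS)[0]

    def seek_time(d):
        """s = 0 if d = 0, otherwise s = a + b * sqrt(d)."""
        return 0.0 if d == 0 else A + B * math.sqrt(d)

    rng = random.Random(1)
    samples = [seek_time(sample_seek_distance(rng)) for _ in range(10000)]
    print(sum(samples) / len(samples))       # mean seek time, in seconds
    print(A + B * math.sqrt(C - 1))          # S_max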

Using previous work for the probability density of seek distance and the relationship of seek time to seek distance, the distribution and density of seek time can be written in terms of seek distance. Since seek time is a function of the seek distance random variable, the seek time pdf is a transformation of the seek distance pdf [18]. For the general case, where Y is a function g(X) of a random variable X, the density of Y can be written as

    f_Y(y) = f_X(x_1)/|g'(x_1)| + f_X(x_2)/|g'(x_2)| + ... + f_X(x_n)/|g'(x_n)|

where x_1, x_2, ..., x_n are the real roots of y = g(x) and g'(x) is the derivative of g(x). Using this rule, the inverse of the seek time function and its derivative are

    d = ((s − a)/b)^2,    dd/ds = 2(s − a)/b^2,    d > 0

and the density of seek time S is

    f_S(s) = (1 − p_s) [2(C − ((s − a)/b)^2) / (C(C − 1))] [2(s − a)/b^2].

Since C − ((s − a)/b)^2 = (Cb^2 − s^2 + 2as − a^2)/b^2, f_S(s) can be simplified to

    f_S(s) = [4(1 − p_s) / (C(C − 1)b^4)] (Cb^2 − s^2 + 2as − a^2)(s − a)
           = [4(1 − p_s) / (C(C − 1)b^4)] (Cb^2 s − Cab^2 − s^3 + 3as^2 − 3a^2 s + a^3)

for a < s ≤ S_max, together with the atom P{S = 0} = p_s.

3.1.2 Disk Access Time

The second component in determining the amount of time to service a disk request is the rotational latency. Rotational latency is defined as the time for the disk to rotate to

the starting sector of the data requested. Under a variety of workloads and disk scheduling policies, rotational latency is commonly observed to be uniformly distributed in [0, R_max], where R_max is the time for a full disk rotation [7, 14, 19]. The pdf of the rotational latency is written as

    f_R(r) = 1/R_max,    0 ≤ r ≤ R_max.

Because the time for a disk request depends on the amount of time to locate the needed data on the disk, the pdf of the disk access time, defined as the total time to move to the starting cylinder and track of the data requested, must be determined. If the random variable X denotes disk access time, then X can be expressed as S + R, where S represents seek time and R represents rotational latency. Since the time for the disk arm to move to the correct cylinder (S) is independent of the time for the disk to spin to the correct sector (R), the probability density of X is the convolution of S and R:

    f_X(x) = f_R(r) * f_S(s) = ∫_0^{S_max} f_R(x − s) dF_S(s).

Since seek time is based on the number of cylinders that the disk arm moves, S is not a continuous random variable. Due to the discrete nature of seek time, the regions of integration for f_X(x) depend on a, S_max, and R_max. For the case where R_max ≤ b√(C − 1), which corresponds to the parameters in Table 2.1, the density of X is written as

    f_X(x) = f_R(x) p_s + ∫_{a+}^{x} f_R(x − s) dF_S(s)        0 ≤ x ≤ R_max
    f_X(x) = ∫_{a+}^{x} f_R(x − s) dF_S(s)                     R_max < x ≤ R_max + a
    f_X(x) = ∫_{x − R_max}^{x} f_R(x − s) dF_S(s)              R_max + a < x ≤ S_max
    f_X(x) = ∫_{x − R_max}^{S_max} f_R(x − s) dF_S(s)          S_max < x ≤ S_max + R_max
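The piecewise expression above can be checked numerically. The following sketch evaluates the convolution f_X = f_R * f_S on a time grid by binning the discrete seek-time distribution and summing it against the uniform rotational-latency density; it is an illustration under the Table 2.1 assumptions, not the exact evaluation carried out in Appendix A.

    # Numerical sketch of the convolution f_X = f_R * f_S of Section 3.1.2.
    # Parameters follow Table 2.1 and are treated as assumptions.
    import math

    C, A, B = 12000, 0.003, 0.0005   # cylinders, arm acceleration (s), seek factor
    R_MAX = 0.0167                   # time for a full disk rotation, in seconds
    P_S = 1.0 / C                    # sequential access probability
    DT = 1e-4                        # grid resolution, in seconds

    # Bin the seek-time distribution onto the grid:
    # P{S = 0} = p_s, P{S = a + b*sqrt(i)} = (1 - p_s) * 2(C - i) / (C(C - 1))
    s_max = A + B * math.sqrt(C - 1)
    n_bins = int((s_max + R_MAX) / DT) + 2
    seek_bins = [0.0] * n_bins
    seek_bins[0] = P_S
    for i in range(1, C):
        s = A + B * math.sqrt(i)
        seek_bins[int(s / DT)] += (1 - P_S) * 2 * (C - i) / (C * (C - 1))

    # Convolve with the uniform rotational latency: f_X(x) = sum_s P{S=s} * f_R(x - s)
    rot_bins = int(R_MAX / DT)
    f_x = [sum(seek_bins[j] for j in range(max(0, k - rot_bins), k + 1)) / R_MAX
           for k in range(n_bins)]

    print(sum(f * DT for f in f_x))   # the density should integrate to approximately 1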

Depending on the region to which a particular value of the disk access time belongs, the corresponding density value can be computed. Evaluation of each integral expression is shown in Appendix A.

3.1.3 Disk Service Time

Once the track and sector containing the data have been located on the disk, the final component of a disk request is the time to transfer the data from disk to main memory. The transfer time T for a single block of data equals the block size in bytes divided by the transfer rate in bytes per second. Thus, the time for a disk to locate and transfer a single block of data, or disk service time, is Y = X + T. Since each RAID 5 disk request consists of transferring a stripe unit of fixed size, the transfer time for each disk request is constant. Because the transfer time for each disk request is constant, the transfer time shifts the disk service time pdf but does not change the shape of the density. Figure 3.2 shows the pdf of the disk service time for a 4 KB stripe unit given different values of p_s and the parameters listed in Table 2.1. Note that seek time and rotational latency are much greater than the transfer time for a disk request. As p_s increases, the rotational latency dominates the disk service time of the request. If the disk arm does not move, rotational latency is effectively the only component of the time for a disk request, and the density of disk service time equals the uniform density of rotational latency. This is illustrated in Figure 3.2. When p_s = 1/C, where C is the number of disk cylinders, the probability that the disk arm does not move equals the probability of moving to any other cylinder. For this case,

the function is continuous. The graph of the disk service time density in Figure 3.2 is similar to trace data measurements for several disk systems observed in [8].

Figure 3.2: Disk Service Time Density for Different Values of p_s (curves for p_s = 1/C, 0.1, 0.3, 0.5, 0.7, and 0.9, with R_max = 0.0167 seconds; x-axis: disk service time in seconds, y-axis: pdf of disk service time)

To determine the disk service time density for a disk in a RAID 5 system under a transaction-processing workload, the locations of data between successive requests must be considered. Because of the assumptions that the starting stripe unit of an I/O request is random and that requests for a disk are serviced in first come, first served order, the data accessed between successive requests will tend to be scattered across the disk. Therefore, during a request, the disk arm tends to move to any other cylinder, or not move at all, with equal probability. This is equivalent to the case where p_s = 1/C. When the sequential access probability p_s equals 1/C, the pdf of disk service time can be approximated by an Erlang density of order k and mean μ. Figure 3.3 shows an

optimized fit of the Erlang pdf to the actual pdf using the least squares curve fitting method. By shifting the mean slightly, the peak can be more closely matched (peak fit), while sacrificing the error on the right side of the curve. The parameters of the Erlang density for this case are an order (k) of 8 and a mean (μ) of 0.042 seconds. This pdf will be used in the analytical models developed in the following sections.

Figure 3.3: Erlang Approximation for Disk Service Time Density (actual density for p_s = 1/C compared with the least squares and peak fit Erlang densities; x-axis: disk service time in seconds, y-axis: pdf of disk service time)

In contrast, the simulator described in Appendix F models the actual behavior of a disk during a request. First, the time to move to the cylinder where the needed stripe unit is located is computed. Second, the time for the disk to spin to the correct data sector from the current position is calculated. To determine the actual disk service time, both of these quantities are added to the fixed time to transfer the stripe unit. In this manner, the actual disk behavior is modeled and can provide an accurate comparison to the probabilistic expressions developed above.
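The Erlang approximation can be illustrated with a small curve-fitting sketch: simulated service times (seek with p_s = 1/C, plus rotational latency and the fixed 4 KB transfer) are histogrammed and an Erlang order is chosen by a crude least squares search. All parameter values and names are assumptions; this is not the fit actually performed in the thesis.

    # Sketch of fitting an Erlang density to simulated disk service times (illustrative).
    import math
    import random
    from itertools import accumulate

    C, A, B = 12000, 0.003, 0.0005     # Table 2.1 parameters (assumed)
    R_MAX, RATE = 0.0167, 3e6          # rotation time (s) and transfer rate (bytes/s)
    TRANSFER = 4096 / RATE             # fixed 4 KB stripe unit transfer time
    CUM = list(accumulate(C - i for i in range(1, C)))

    def sample_service_time(rng):
        """Disk service time Y = seek + rotational latency + transfer, with p_s = 1/C."""
        if rng.random() < 1.0 / C:
            seek = 0.0
        else:
            d = rng.choices(range(1, C), cum_weights=CUM)[0]
            seek = A + B * math.sqrt(d)
        return seek + rng.uniform(0.0, R_MAX) + TRANSFER

    def erlang_pdf(y, k, mean):
        lam = k / mean
        return lam ** k * y ** (k - 1) * math.exp(-lam * y) / math.factorial(k - 1)

    rng = random.Random(3)
    ys = [sample_service_time(rng) for _ in range(20000)]
    mean = sum(ys) / len(ys)

    # Empirical histogram of the service time density
    bins, lo, hi = 40, min(ys), max(ys)
    width = (hi - lo) / bins
    counts = [0] * bins
    for y in ys:
        counts[min(int((y - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    density = [c / (len(ys) * width) for c in counts]

    # Least squares search over the Erlang order k, holding the mean at the sample mean
    errors = {k: sum((erlang_pdf(y, k, mean) - d) ** 2 for y, d in zip(centers, density))
              for k in range(2, 21)}
    best_k = min(errors, key=errors.get)
    print("fitted Erlang order k =", best_k, " sample mean =", round(mean, 4))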

With an understanding of the time to service a single disk request, the arrival process of requests resulting from an I/O request to individual disks in the array is investigated in the next section.

3.2 Disk Arrival Process

Given the assumption that the arrival of I/O requests is a Poisson process, it is important to characterize the arrival process of the resulting disk requests at individual disks in the array. This section will illustrate how groups of disks and individual disks are accessed during an I/O request. Let {N(t) | t ≥ 0} be the Poisson arrival process of I/O requests to the disk array and {N_k(t) | t ≥ 0}, 1 ≤ k ≤ n, be n output processes, where each output process counts the I/O requests that access the k-th group of disks; p_k is the probability that the k-th group of disks is accessed during an I/O request. Because only one of the n possible groups of disks may be accessed for each I/O request, the groups of disks accessed during successive I/O requests form a sequence of generalized Bernoulli trials. The conditional distribution that m_k is the number of I/O requests that access group k, given that there are m I/O requests in the time interval (0, t], is described by the multinomial distribution

    P{N_1(t) = m_1, N_2(t) = m_2, ..., N_n(t) = m_n | N(t) = m}
        = [m! / (m_1! m_2! ... m_n!)] p_1^{m_1} p_2^{m_2} ... p_n^{m_n}

where Σ_{k=1}^{n} p_k = 1 and Σ_{k=1}^{n} m_k = m. Multiplying by the probability of m I/O requests in (0, t], the probability mass function that m_1 requests access group 1, m_2 requests access group 2, ..., m_n requests access group n in (0, t] is

    P{N_1(t) = m_1, N_2(t) = m_2, ..., N_n(t) = m_n}
        = [m! / (m_1! m_2! ... m_n!)] p_1^{m_1} p_2^{m_2} ... p_n^{m_n} e^{−λt} (λt)^m / m!
        = ∏_{k=1}^{n} e^{−p_k λ t} (p_k λ t)^{m_k} / m_k!

This result shows that the arrivals of requests to groups of disks, N_1(t), N_2(t), ..., N_n(t), are mutually independent and are Poisson with parameters p_1 λ, p_2 λ, ..., p_n λ. Therefore, when the arrival of I/O requests to a RAID 5 disk array is Poisson with rate λ, arrivals to groups of disks are Poisson with rates p_k λ, where p_k is the probability that group k of disks is accessed during an I/O request. Furthermore, because a disk is part of different groups that can be accessed during a request, the superposition of Poisson group arrivals results in Poisson arrivals at each individual disk. The arrival rate of disk requests at disk j is p_j λ, where p_j is the probability that disk j, 1 ≤ j ≤ N, is accessed during an I/O request and N is the number of disks in the array. The following example illustrates how Poisson I/O requests result in Poisson requests to groups and individual disks. Consider an array of 4 disks and an I/O request rate of λ, as shown in Figure 3.4. If requests access disks 1 and 2, disks 2 and 3, and disks 3 and 4 with probabilities p_1 = 0.25, p_2 = 0.5, and p_3 = 0.25 (Σ_{i=1}^{3} p_i = 1), arrivals to disks 1 and 2, 2 and 3, and 3 and 4 are Poisson with rates 0.25λ, 0.5λ, and 0.25λ. Since arrivals to groups of disks are Poisson, arrivals to disks 1, 2, 3, and 4 are Poisson with rates 0.25λ (= p_1 λ), 0.75λ (= (p_1 + p_2)λ), 0.75λ (= (p_2 + p_3)λ), and 0.25λ (= p_3 λ) by superposition.
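The splitting and superposition argument can be checked with a short simulation of the 4-disk example: thin a Poisson stream of I/O requests into the three groups and count arrivals per disk. The observed per-disk rates should approach 0.25λ, 0.75λ, 0.75λ, and 0.25λ; the specific λ and the names used here are assumptions.

    # Simulation sketch of Poisson splitting and superposition for the 4-disk example.
    import random

    LAMBDA = 100.0      # I/O request arrival rate, requests per second (assumed)
    HORIZON = 200.0     # simulated interval, in seconds
    GROUPS = [((1, 2), 0.25), ((2, 3), 0.50), ((3, 4), 0.25)]   # groups and probabilities p_k

    def simulate(seed=0):
        rng = random.Random(seed)
        counts = {disk: 0 for disk in (1, 2, 3, 4)}
        t = 0.0
        while True:
            t += rng.expovariate(LAMBDA)            # Poisson arrivals: exponential gaps
            if t > HORIZON:
                break
            group = rng.choices([g for g, _ in GROUPS],
                                weights=[p for _, p in GROUPS])[0]
            for disk in group:                      # the I/O request forks to every disk in the group
                counts[disk] += 1
        return {disk: count / HORIZON for disk, count in counts.items()}

    # Expected per-disk rates: 0.25*lambda for disks 1 and 4, 0.75*lambda for disks 2 and 3
    print(simulate())    # approximately {1: 25.0, 2: 75.0, 3: 75.0, 4: 25.0}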


More information

5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks 485.e1

5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks 485.e1 5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks 485.e1 5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks Amdahl s law in Chapter 1 reminds us that

More information

Database Systems II. Secondary Storage

Database Systems II. Secondary Storage Database Systems II Secondary Storage CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 29 The Memory Hierarchy Swapping, Main-memory DBMS s Tertiary Storage: Tape, Network Backup 3,200 MB/s (DDR-SDRAM

More information

CS3600 SYSTEMS AND NETWORKS

CS3600 SYSTEMS AND NETWORKS CS3600 SYSTEMS AND NETWORKS NORTHEASTERN UNIVERSITY Lecture 9: Mass Storage Structure Prof. Alan Mislove (amislove@ccs.neu.edu) Moving-head Disk Mechanism 2 Overview of Mass Storage Structure Magnetic

More information

Chapter 12: Mass-Storage

Chapter 12: Mass-Storage Chapter 12: Mass-Storage Systems Chapter 12: Mass-Storage Systems Revised 2010. Tao Yang Overview of Mass Storage Structure Disk Structure Disk Attachment Disk Scheduling Disk Management Swap-Space Management

More information

CISC 7310X. C11: Mass Storage. Hui Chen Department of Computer & Information Science CUNY Brooklyn College. 4/19/2018 CUNY Brooklyn College

CISC 7310X. C11: Mass Storage. Hui Chen Department of Computer & Information Science CUNY Brooklyn College. 4/19/2018 CUNY Brooklyn College CISC 7310X C11: Mass Storage Hui Chen Department of Computer & Information Science CUNY Brooklyn College 4/19/2018 CUNY Brooklyn College 1 Outline Review of memory hierarchy Mass storage devices Reliability

More information

Lecture 25: Interconnection Networks, Disks. Topics: flow control, router microarchitecture, RAID

Lecture 25: Interconnection Networks, Disks. Topics: flow control, router microarchitecture, RAID Lecture 25: Interconnection Networks, Disks Topics: flow control, router microarchitecture, RAID 1 Virtual Channel Flow Control Each switch has multiple virtual channels per phys. channel Each virtual

More information

File. File System Implementation. Operations. Permissions and Data Layout. Storing and Accessing File Data. Opening a File

File. File System Implementation. Operations. Permissions and Data Layout. Storing and Accessing File Data. Opening a File File File System Implementation Operating Systems Hebrew University Spring 2007 Sequence of bytes, with no structure as far as the operating system is concerned. The only operations are to read and write

More information

Lecture 21: Reliable, High Performance Storage. CSC 469H1F Fall 2006 Angela Demke Brown

Lecture 21: Reliable, High Performance Storage. CSC 469H1F Fall 2006 Angela Demke Brown Lecture 21: Reliable, High Performance Storage CSC 469H1F Fall 2006 Angela Demke Brown 1 Review We ve looked at fault tolerance via server replication Continue operating with up to f failures Recovery

More information

COMP283-Lecture 3 Applied Database Management

COMP283-Lecture 3 Applied Database Management COMP283-Lecture 3 Applied Database Management Introduction DB Design Continued Disk Sizing Disk Types & Controllers DB Capacity 1 COMP283-Lecture 3 DB Storage: Linear Growth Disk space requirements increases

More information

Database Systems. November 2, 2011 Lecture #7. topobo (mit)

Database Systems. November 2, 2011 Lecture #7. topobo (mit) Database Systems November 2, 2011 Lecture #7 1 topobo (mit) 1 Announcement Assignment #2 due today Assignment #3 out today & due on 11/16. Midterm exam in class next week. Cover Chapters 1, 2,

More information

V. Mass Storage Systems

V. Mass Storage Systems TDIU25: Operating Systems V. Mass Storage Systems SGG9: chapter 12 o Mass storage: Hard disks, structure, scheduling, RAID Copyright Notice: The lecture notes are mainly based on modifications of the slides

More information

Lecture 9. I/O Management and Disk Scheduling Algorithms

Lecture 9. I/O Management and Disk Scheduling Algorithms Lecture 9 I/O Management and Disk Scheduling Algorithms 1 Lecture Contents 1. I/O Devices 2. Operating System Design Issues 3. Disk Scheduling Algorithms 4. RAID (Redundant Array of Independent Disks)

More information

Storing Data: Disks and Files

Storing Data: Disks and Files Storing Data: Disks and Files Chapter 7 (2 nd edition) Chapter 9 (3 rd edition) Yea, from the table of my memory I ll wipe away all trivial fond records. -- Shakespeare, Hamlet Database Management Systems,

More information

An Introduction to RAID

An Introduction to RAID Intro An Introduction to RAID Gursimtan Singh Dept. of CS & IT Doaba College RAID stands for Redundant Array of Inexpensive Disks. RAID is the organization of multiple disks into a large, high performance

More information

Chapter 10: Mass-Storage Systems

Chapter 10: Mass-Storage Systems Chapter 10: Mass-Storage Systems Silberschatz, Galvin and Gagne Overview of Mass Storage Structure Magnetic disks provide bulk of secondary storage of modern computers Drives rotate at 60 to 200 times

More information

The term "physical drive" refers to a single hard disk module. Figure 1. Physical Drive

The term physical drive refers to a single hard disk module. Figure 1. Physical Drive HP NetRAID Tutorial RAID Overview HP NetRAID Series adapters let you link multiple hard disk drives together and write data across them as if they were one large drive. With the HP NetRAID Series adapter,

More information

Chapter 14: Mass-Storage Systems. Disk Structure

Chapter 14: Mass-Storage Systems. Disk Structure 1 Chapter 14: Mass-Storage Systems Disk Structure Disk Scheduling Disk Management Swap-Space Management RAID Structure Disk Attachment Stable-Storage Implementation Tertiary Storage Devices Operating System

More information

Chapter 13: Mass-Storage Systems. Disk Scheduling. Disk Scheduling (Cont.) Disk Structure FCFS. Moving-Head Disk Mechanism

Chapter 13: Mass-Storage Systems. Disk Scheduling. Disk Scheduling (Cont.) Disk Structure FCFS. Moving-Head Disk Mechanism Chapter 13: Mass-Storage Systems Disk Scheduling Disk Structure Disk Scheduling Disk Management Swap-Space Management RAID Structure Disk Attachment Stable-Storage Implementation Tertiary Storage Devices

More information

Chapter 13: Mass-Storage Systems. Disk Structure

Chapter 13: Mass-Storage Systems. Disk Structure Chapter 13: Mass-Storage Systems Disk Structure Disk Scheduling Disk Management Swap-Space Management RAID Structure Disk Attachment Stable-Storage Implementation Tertiary Storage Devices Operating System

More information

CSCI-GA Database Systems Lecture 8: Physical Schema: Storage

CSCI-GA Database Systems Lecture 8: Physical Schema: Storage CSCI-GA.2433-001 Database Systems Lecture 8: Physical Schema: Storage Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com View 1 View 2 View 3 Conceptual Schema Physical Schema 1. Create a

More information

CSE 380 Computer Operating Systems

CSE 380 Computer Operating Systems CSE 380 Computer Operating Systems Instructor: Insup Lee University of Pennsylvania Fall 2003 Lecture Note on Disk I/O 1 I/O Devices Storage devices Floppy, Magnetic disk, Magnetic tape, CD-ROM, DVD User

More information

Mladen Stefanov F48235 R.A.I.D

Mladen Stefanov F48235 R.A.I.D R.A.I.D Data is the most valuable asset of any business today. Lost data, in most cases, means lost business. Even if you backup regularly, you need a fail-safe way to ensure that your data is protected

More information

Module 13: Secondary-Storage Structure

Module 13: Secondary-Storage Structure Module 13: Secondary-Storage Structure Disk Structure Disk Scheduling Disk Management Swap-Space Management Disk Reliability Stable-Storage Implementation Operating System Concepts 13.1 Silberschatz and

More information

Distributed Video Systems Chapter 5 Issues in Video Storage and Retrieval Part 2 - Disk Array and RAID

Distributed Video Systems Chapter 5 Issues in Video Storage and Retrieval Part 2 - Disk Array and RAID Distributed Video ystems Chapter 5 Issues in Video torage and Retrieval art 2 - Disk Array and RAID Jack Yiu-bun Lee Department of Information Engineering The Chinese University of Hong Kong Contents 5.1

More information

Lecture 23: I/O Redundant Arrays of Inexpensive Disks Professor Randy H. Katz Computer Science 252 Spring 1996

Lecture 23: I/O Redundant Arrays of Inexpensive Disks Professor Randy H. Katz Computer Science 252 Spring 1996 Lecture 23: I/O Redundant Arrays of Inexpensive Disks Professor Randy H Katz Computer Science 252 Spring 996 RHKS96 Review: Storage System Issues Historical Context of Storage I/O Storage I/O Performance

More information

Storage. Hwansoo Han

Storage. Hwansoo Han Storage Hwansoo Han I/O Devices I/O devices can be characterized by Behavior: input, out, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections 2 I/O System Characteristics

More information

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition Chapter 10: Mass-Storage Systems Silberschatz, Galvin and Gagne 2013 Objectives To describe the physical structure of secondary storage devices and its effects on the uses of the devices To explain the

More information

IBM i Version 7.3. Systems management Disk management IBM

IBM i Version 7.3. Systems management Disk management IBM IBM i Version 7.3 Systems management Disk management IBM IBM i Version 7.3 Systems management Disk management IBM Note Before using this information and the product it supports, read the information in

More information

Computer Organization and Structure. Bing-Yu Chen National Taiwan University

Computer Organization and Structure. Bing-Yu Chen National Taiwan University Computer Organization and Structure Bing-Yu Chen National Taiwan University Storage and Other I/O Topics I/O Performance Measures Types and Characteristics of I/O Devices Buses Interfacing I/O Devices

More information

Performance analysis of disk mirroring techniques

Performance analysis of disk mirroring techniques Florida International University FIU Digital Commons FIU Electronic Theses and Dissertations University Graduate School 3-28-1994 Performance analysis of disk mirroring techniques Taysir Abdalla Florida

More information

Disk Scheduling. Based on the slides supporting the text

Disk Scheduling. Based on the slides supporting the text Disk Scheduling Based on the slides supporting the text 1 User-Space I/O Software Layers of the I/O system and the main functions of each layer 2 Disk Structure Disk drives are addressed as large 1-dimensional

More information

Disks and I/O Hakan Uraz - File Organization 1

Disks and I/O Hakan Uraz - File Organization 1 Disks and I/O 2006 Hakan Uraz - File Organization 1 Disk Drive 2006 Hakan Uraz - File Organization 2 Tracks and Sectors on Disk Surface 2006 Hakan Uraz - File Organization 3 A Set of Cylinders on Disk

More information

Principles of Data Management. Lecture #2 (Storing Data: Disks and Files)

Principles of Data Management. Lecture #2 (Storing Data: Disks and Files) Principles of Data Management Lecture #2 (Storing Data: Disks and Files) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Topics v Today

More information

CSE 153 Design of Operating Systems

CSE 153 Design of Operating Systems CSE 153 Design of Operating Systems Winter 2018 Lecture 22: File system optimizations and advanced topics There s more to filesystems J Standard Performance improvement techniques Alternative important

More information

Database Management Systems, 2nd edition, Raghu Ramakrishnan, Johannes Gehrke, McGraw-Hill

Database Management Systems, 2nd edition, Raghu Ramakrishnan, Johannes Gehrke, McGraw-Hill Lecture Handout Database Management System Lecture No. 34 Reading Material Database Management Systems, 2nd edition, Raghu Ramakrishnan, Johannes Gehrke, McGraw-Hill Modern Database Management, Fred McFadden,

More information

Chapter 6. Storage and Other I/O Topics

Chapter 6. Storage and Other I/O Topics Chapter 6 Storage and Other I/O Topics Introduction I/O devices can be characterized by Behaviour: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections

More information

Silberschatz, et al. Topics based on Chapter 13

Silberschatz, et al. Topics based on Chapter 13 Silberschatz, et al. Topics based on Chapter 13 Mass Storage Structure CPSC 410--Richard Furuta 3/23/00 1 Mass Storage Topics Secondary storage structure Disk Structure Disk Scheduling Disk Management

More information

College of Computer & Information Science Spring 2010 Northeastern University 12 March 2010

College of Computer & Information Science Spring 2010 Northeastern University 12 March 2010 College of Computer & Information Science Spring 21 Northeastern University 12 March 21 CS 76: Intensive Computer Systems Scribe: Dimitrios Kanoulas Lecture Outline: Disk Scheduling NAND Flash Memory RAID:

More information

Lecture: Storage, GPUs. Topics: disks, RAID, reliability, GPUs (Appendix D, Ch 4)

Lecture: Storage, GPUs. Topics: disks, RAID, reliability, GPUs (Appendix D, Ch 4) Lecture: Storage, GPUs Topics: disks, RAID, reliability, GPUs (Appendix D, Ch 4) 1 Magnetic Disks A magnetic disk consists of 1-12 platters (metal or glass disk covered with magnetic recording material

More information

Rowena Cole and Luigi Barone. Department of Computer Science, The University of Western Australia, Western Australia, 6907

Rowena Cole and Luigi Barone. Department of Computer Science, The University of Western Australia, Western Australia, 6907 The Game of Clustering Rowena Cole and Luigi Barone Department of Computer Science, The University of Western Australia, Western Australia, 697 frowena, luigig@cs.uwa.edu.au Abstract Clustering is a technique

More information

I/O Systems and Storage Devices

I/O Systems and Storage Devices CSC 256/456: Operating Systems I/O Systems and Storage Devices John Criswell! University of Rochester 1 I/O Device Controllers I/O devices have both mechanical component & electronic component! The electronic

More information

Network. Department of Statistics. University of California, Berkeley. January, Abstract

Network. Department of Statistics. University of California, Berkeley. January, Abstract Parallelizing CART Using a Workstation Network Phil Spector Leo Breiman Department of Statistics University of California, Berkeley January, 1995 Abstract The CART (Classication and Regression Trees) program,

More information

Chapter 14: Mass-Storage Systems

Chapter 14: Mass-Storage Systems Chapter 14: Mass-Storage Systems Disk Structure Disk Scheduling Disk Management Swap-Space Management RAID Structure Disk Attachment Stable-Storage Implementation Tertiary Storage Devices Operating System

More information

NAS System. User s Manual. Revision 1.0

NAS System. User s Manual. Revision 1.0 User s Manual Revision 1.0 Before You Begin efore going through with this manual, you should read and focus on the following safety guidelines. Information about the NAS system s packaging and delivery

More information

Chapter-6. SUBJECT:- Operating System TOPICS:- I/O Management. Created by : - Sanjay Patel

Chapter-6. SUBJECT:- Operating System TOPICS:- I/O Management. Created by : - Sanjay Patel Chapter-6 SUBJECT:- Operating System TOPICS:- I/O Management Created by : - Sanjay Patel Disk Scheduling Algorithm 1) First-In-First-Out (FIFO) 2) Shortest Service Time First (SSTF) 3) SCAN 4) Circular-SCAN

More information

Readings. Storage Hierarchy III: I/O System. I/O (Disk) Performance. I/O Device Characteristics. often boring, but still quite important

Readings. Storage Hierarchy III: I/O System. I/O (Disk) Performance. I/O Device Characteristics. often boring, but still quite important Storage Hierarchy III: I/O System Readings reg I$ D$ L2 L3 memory disk (swap) often boring, but still quite important ostensibly about general I/O, mainly about disks performance: latency & throughput

More information

CHAPTER 4 AN INTEGRATED APPROACH OF PERFORMANCE PREDICTION ON NETWORKS OF WORKSTATIONS. Xiaodong Zhang and Yongsheng Song

CHAPTER 4 AN INTEGRATED APPROACH OF PERFORMANCE PREDICTION ON NETWORKS OF WORKSTATIONS. Xiaodong Zhang and Yongsheng Song CHAPTER 4 AN INTEGRATED APPROACH OF PERFORMANCE PREDICTION ON NETWORKS OF WORKSTATIONS Xiaodong Zhang and Yongsheng Song 1. INTRODUCTION Networks of Workstations (NOW) have become important distributed

More information

CSCI-GA Operating Systems. I/O : Disk Scheduling and RAID. Hubertus Franke

CSCI-GA Operating Systems. I/O : Disk Scheduling and RAID. Hubertus Franke CSCI-GA.2250-001 Operating Systems I/O : Disk Scheduling and RAID Hubertus Franke frankeh@cs.nyu.edu Disks Scheduling Abstracted by OS as files A Conventional Hard Disk (Magnetic) Structure Hard Disk

More information

IBM. Systems management Disk management. IBM i 7.1

IBM. Systems management Disk management. IBM i 7.1 IBM IBM i Systems management Disk management 7.1 IBM IBM i Systems management Disk management 7.1 Note Before using this information and the product it supports, read the information in Notices, on page

More information

Chapter 12: Mass-Storage

Chapter 12: Mass-Storage hapter 12: Mass-Storage Systems hapter 12: Mass-Storage Systems To explain the performance characteristics of mass-storage devices To evaluate disk scheduling algorithms To discuss operating-system services

More information

Clustering and Reclustering HEP Data in Object Databases

Clustering and Reclustering HEP Data in Object Databases Clustering and Reclustering HEP Data in Object Databases Koen Holtman CERN EP division CH - Geneva 3, Switzerland We formulate principles for the clustering of data, applicable to both sequential HEP applications

More information

Chapter 12: Mass-Storage

Chapter 12: Mass-Storage hapter 12: Mass-Storage Systems hapter 12: Mass-Storage Systems Overview of Mass Storage Structure Disk Structure Disk Attachment Disk Scheduling Disk Management RAID Structure Objectives Moving-head Disk

More information

CSE380 - Operating Systems. Communicating with Devices

CSE380 - Operating Systems. Communicating with Devices CSE380 - Operating Systems Notes for Lecture 15-11/4/04 Matt Blaze (some examples by Insup Lee) Communicating with Devices Modern architectures support convenient communication with devices memory mapped

More information

Lecture 23. Finish-up buses Storage

Lecture 23. Finish-up buses Storage Lecture 23 Finish-up buses Storage 1 Example Bus Problems, cont. 2) Assume the following system: A CPU and memory share a 32-bit bus running at 100MHz. The memory needs 50ns to access a 64-bit value from

More information

Analysis of Striping Techniques in Robotic. Leana Golubchik y Boelter Hall, Graduate Student Oce

Analysis of Striping Techniques in Robotic. Leana Golubchik y Boelter Hall, Graduate Student Oce Analysis of Striping Techniques in Robotic Storage Libraries Abstract Leana Golubchik y 3436 Boelter Hall, Graduate Student Oce UCLA Computer Science Department Los Angeles, CA 90024-1596 (310) 206-1803,

More information

Performance Evaluation of Two New Disk Scheduling Algorithms. for Real-Time Systems. Department of Computer & Information Science

Performance Evaluation of Two New Disk Scheduling Algorithms. for Real-Time Systems. Department of Computer & Information Science Performance Evaluation of Two New Disk Scheduling Algorithms for Real-Time Systems Shenze Chen James F. Kurose John A. Stankovic Don Towsley Department of Computer & Information Science University of Massachusetts

More information

Disk Scheduling. Chapter 14 Based on the slides supporting the text and B.Ramamurthy s slides from Spring 2001

Disk Scheduling. Chapter 14 Based on the slides supporting the text and B.Ramamurthy s slides from Spring 2001 Disk Scheduling Chapter 14 Based on the slides supporting the text and B.Ramamurthy s slides from Spring 2001 1 User-Space I/O Software Layers of the I/O system and the main functions of each layer 2 Disks

More information

Operating Systems 2010/2011

Operating Systems 2010/2011 Operating Systems 2010/2011 Input/Output Systems part 2 (ch13, ch12) Shudong Chen 1 Recap Discuss the principles of I/O hardware and its complexity Explore the structure of an operating system s I/O subsystem

More information

Disk scheduling Disk reliability Tertiary storage Swap space management Linux swap space management

Disk scheduling Disk reliability Tertiary storage Swap space management Linux swap space management Lecture Overview Mass storage devices Disk scheduling Disk reliability Tertiary storage Swap space management Linux swap space management Operating Systems - June 28, 2001 Disk Structure Disk drives are

More information

CONFIGURING ftscalable STORAGE ARRAYS ON OpenVOS SYSTEMS

CONFIGURING ftscalable STORAGE ARRAYS ON OpenVOS SYSTEMS Best Practices CONFIGURING ftscalable STORAGE ARRAYS ON OpenVOS SYSTEMS Best Practices 2 Abstract ftscalable TM Storage G1, G2 and G3 arrays are highly flexible, scalable hardware storage subsystems that

More information

Technical Note P/N REV A01 March 29, 2007

Technical Note P/N REV A01 March 29, 2007 EMC Symmetrix DMX-3 Best Practices Technical Note P/N 300-004-800 REV A01 March 29, 2007 This technical note contains information on these topics: Executive summary... 2 Introduction... 2 Tiered storage...

More information

Tape Group Parity Protection

Tape Group Parity Protection Tape Group Parity Protection Theodore Johnson johnsont@research.att.com AT&T Laboratories Florham Park, NJ Sunil Prabhakar sunil@cs.purdue.edu Purdue University West Lafayette, IN Abstract We propose a

More information

Tape pictures. CSE 30341: Operating Systems Principles

Tape pictures. CSE 30341: Operating Systems Principles Tape pictures 4/11/07 CSE 30341: Operating Systems Principles page 1 Tape Drives The basic operations for a tape drive differ from those of a disk drive. locate positions the tape to a specific logical

More information

Module 13: Secondary-Storage

Module 13: Secondary-Storage Module 13: Secondary-Storage Disk Structure Disk Scheduling Disk Management Swap-Space Management Disk Reliability Stable-Storage Implementation Tertiary Storage Devices Operating System Issues Performance

More information

Chapter 6 External Memory

Chapter 6 External Memory Chapter 6 External Memory Magnetic Disk Removable RAID Disk substrate coated with magnetizable material (iron oxide rust) Substrate used to be aluminium Now glass Improved surface uniformity Increases

More information

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 13

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2017 Lecture 13 CS24: INTRODUCTION TO COMPUTING SYSTEMS Spring 2017 Lecture 13 COMPUTER MEMORY So far, have viewed computer memory in a very simple way Two memory areas in our computer: The register file Small number

More information

A Disk Head Scheduling Simulator

A Disk Head Scheduling Simulator A Disk Head Scheduling Simulator Steven Robbins Department of Computer Science University of Texas at San Antonio srobbins@cs.utsa.edu Abstract Disk head scheduling is a standard topic in undergraduate

More information

CMSC 424 Database design Lecture 12 Storage. Mihai Pop

CMSC 424 Database design Lecture 12 Storage. Mihai Pop CMSC 424 Database design Lecture 12 Storage Mihai Pop Administrative Office hours tomorrow @ 10 Midterms are in solutions for part C will be posted later this week Project partners I have an odd number

More information

1. Introduction. Traditionally, a high bandwidth file system comprises a supercomputer with disks connected

1. Introduction. Traditionally, a high bandwidth file system comprises a supercomputer with disks connected 1. Introduction Traditionally, a high bandwidth file system comprises a supercomputer with disks connected by a high speed backplane bus such as SCSI [3][4] or Fibre Channel [2][67][71]. These systems

More information

Main Points of the Computer Organization and System Software Module

Main Points of the Computer Organization and System Software Module Main Points of the Computer Organization and System Software Module You can find below the topics we have covered during the COSS module. Reading the relevant parts of the textbooks is essential for a

More information

CS-736 Midterm: Beyond Compare (Spring 2008)

CS-736 Midterm: Beyond Compare (Spring 2008) CS-736 Midterm: Beyond Compare (Spring 2008) An Arpaci-Dusseau Exam Please Read All Questions Carefully! There are eight (8) total numbered pages Please put your NAME ONLY on this page, and your STUDENT

More information