PERFORMANCE ANALYSIS OF TAPE LIBRARIES FOR SUPERCOMPUTING ENVIRONMENTS

Ilker Hamzaoglu and Huseyin Simitci
Department of Computer Science, University of Illinois at Urbana-Champaign

Abstract: In this paper, we analyzed the performance of various robotically accessed tape libraries for supercomputing workloads. We used an event-driven simulator to conduct the performance analysis, and we considered the workloads at the National Center for Atmospheric Research (NCAR) and the National Center for Supercomputing Applications (NCSA). The simulation results showed that tape libraries with adequate performance parameters perform quite well on the supercomputing workloads, and are therefore promising to be an important part of the mass storage systems used in supercomputing environments. The results also showed that the tape drive forward search and rewind rate is the bottleneck in tape libraries used under supercomputing workloads.

Keywords: tertiary storage, tape library, supercomputing, performance analysis, simulation

1.1 INTRODUCTION

Over the last decade, as the performance of processors has increased significantly, applications that involve huge amounts of data, e.g. computational science (2, 16, 17) and data mining (12, 13), have become commonplace. It is not cost effective to use secondary storage devices as the only storage devices to cope with this much data. Instead, tertiary storage devices (3, 11) should be used to store much of this data.

This research was supported in part by the Department of Computer Science, University of Illinois at Urbana-Champaign.

[Figure 1.1: Storage System Hierarchy -- Random Access Memory (RAM) as primary storage, Magnetic Disk as secondary storage, Magnetic Tape and Optical Disk as tertiary storage.]

Figure 1.1 shows a typical storage system hierarchy. At the top of the hierarchy is primary storage, which usually consists of semiconductor random access memory (RAM) used for caches and main memory. The next level is secondary storage, which usually consists of magnetic disk devices. Tertiary storage, which usually consists of magnetic tapes and optical disks, lies at the bottom of the storage hierarchy. The top level of the hierarchy has the highest cost per byte, and it is the smallest and the fastest of the three levels. As we go down the hierarchy, the storage cost decreases significantly, whereas the size and the access latency increase significantly.

Although tertiary storage includes optical tapes and holographic storage as well as magnetic tapes and optical disks, magnetic tapes and optical disks are the two major types of commonly used tertiary storage devices. Tapes have higher latency than optical disks, but their bandwidth is much higher. Since the files in supercomputing applications tend to be large, the difference in transfer time between tape and optical disk is substantial. Therefore, for supercomputing applications, magnetic tapes are more suitable.

In supercomputing environments such as the National Center for Atmospheric Research (NCAR) and the National Center for Supercomputing Applications (NCSA), sustaining a good response time at a reasonable cost has become a severe problem as data grows at a rate of several terabytes (TB) per year (9, 14). To store this much data cost effectively, an appropriate mass storage system that can handle the workloads of these computing centers should be used. In this paper, we use the terms mass storage system and archive system interchangeably. The structure of a typical mass storage system is shown in Figure 1.2. Although early mass storage systems used in supercomputing environments contained only manually loaded tape cartridge systems, this is no longer cost effective for obtaining reasonable performance when dealing with terabytes of data.

[Figure 1.2: Mass Storage System Structure -- Computing Environment, I/O Processors, Disk Storage Unit, Robot-controlled Tape Library, and Manually Loaded Tape Drives.]

Robot-controlled tape libraries, however, promise to be a cost-effective solution for handling huge amounts of data (1, 5). A robot-controlled tape library consists of a storage rack for tape cartridges, a number of tape drives, and a robot arm that moves the cartridges between the storage rack and the tape drives. Since these tape libraries can handle massive amounts of data without user intervention, they reduce the data management cost and the response time, and they increase data reliability.

In this paper, we analyzed the performance of various robotically accessed tape library configurations for supercomputing workloads. We also assessed the impact of the performance of various tape library components on the overall performance of the tape libraries for these workloads. We carried out the performance analysis using simulation, because it is more accurate than analytical modeling and it allows easier trade-off evaluation than measurement, i.e. comparing the performance of alternative tape library configurations under a variety of workloads (8). We used an event-driven simulator (3) to simulate the behavior of various tape library configurations under three different supercomputing workloads: the NCAR Mass Storage System workload (14), the NCSA Common File System workload (9), and the NCSA UniTree File Archive System workload (7, 18).

The simulation results showed that tape libraries with adequate performance parameters perform quite well on the supercomputing workloads, and are therefore promising to be an important part of the mass storage systems used in supercomputing environments. The results also showed that parallelism in tape libraries, i.e. having more than one tape drive, increases performance significantly. The robot arm speed has less impact on overall tape library performance than the tape drive speed. The tape drive search and rewind rate has more impact on overall tape library performance than the tape drive transfer rate. In other words, the tape drive forward search and rewind rate is the bottleneck in tape libraries used under supercomputing workloads.
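As a way of making this component model concrete, the following is a minimal sketch of a robot-controlled tape library as a plain data structure: a rack of cartridges, a set of drives, and a single robot arm. The class and field names are our own illustration, not taken from the simulator used in this paper; the timing fields anticipate the parameters listed later in Tables 1.2 and 1.3.

from dataclasses import dataclass
from typing import List

@dataclass
class TapeDrive:
    # Per-drive timing parameters (cf. Table 1.2); times in seconds, rates in MB/s
    load_time: float
    eject_time: float
    rewind_startup: float
    rewind_rate: float
    search_startup: float
    search_rate: float
    transfer_rate: float
    loaded_tape: int = -1      # id of the mounted cartridge, -1 if the drive is empty

@dataclass
class RobotArm:
    move_time: float           # seconds to move between the rack and a drive
    pick_time: float           # seconds to grab a cartridge
    place_time: float          # seconds to put a cartridge down

@dataclass
class TapeLibrary:
    drives: List[TapeDrive]
    arm: RobotArm
    num_tapes: int

    def exchange_time(self) -> float:
        # Time for the robot to fetch one cartridge: pick, move, place
        return self.arm.pick_time + self.arm.move_time + self.arm.place_time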

The rest of the paper is organized as follows. The previous work in this area is presented in Section 1.2. The simulation framework, including the supercomputing workloads, the tape libraries, and the tape library simulator used in the simulations, is described in Section 1.3. The simulation results are presented in Section 1.4. Finally, Section 1.5 presents the conclusions.

1.2 PREVIOUS WORK

In this section, we present the previous work on the performance analysis of tape libraries and on characterizing the workloads at supercomputing centers.

Chervenak's work on the performance analysis of tertiary storage devices presented an overview of the current tertiary storage technology, including magnetic tapes, holographic storage, and optical tapes (3, 4). She defined a simulation model and developed an event-driven simulator for simulating tertiary storage devices. She simulated the behavior of various tertiary storage systems under three different workloads: a sequential workload, a video server workload, and a digital library workload. She found that, for tertiary storage libraries to be effective for these workloads, libraries that contain a large number of drives with satisfactory bandwidth should be used. She also performed a simulation study to assess the benefits of tape library striping for various workloads, and she found that although striping is quite effective for sequential workloads, it performs poorly on workloads that are not strictly sequential because it creates contention for a small number of tape drives.

Johnson developed an analytical performance model of a robot-controlled tape library, and he analyzed the performance of various tape library configurations using this model (10). Johnson and Miller measured the performance of several low-end and high-end tape drives and developed an analytical model of a tape drive using the performance values obtained from these measurements (11). Pentakalos et al. developed an approximate closed queuing network model of a hierarchical mass storage system and used it to analyze the effect of workload intensity increases, the use of file compression, and the use of file abstractions on mass storage system performance (15).

Jensen and Reed presented empirical data on the usage of the Common File System (CFS), the previous mass storage system used at NCSA (9). They analyzed the file transactions on the CFS based on 3 years of transaction records, and they quantified the file access patterns in terms of request sizes, transaction rates, and file interreference times. They found that small disks in the archival storage received most of the transactions, but these constituted only a small minority of the data, and the cartridge tapes received the fewest transactions and the largest volume of data. 1% of all CFS transactions accounted for over 75% of the CFS data volume. The mean time between transactions was computed as 39 seconds, though there were many bursts of CFS requests with much shorter interarrival times.

Finestead and Yeager explained the structure of the UniTree File Archive System (UFAS), the current mass storage system used at NCSA (7).

UFAS is an archival storage system that automatically migrates files from local disks to tapes when disk space becomes low or when files have not been accessed for a certain amount of time. UFAS currently uses only manually loaded tape drives (6).

Miller and Katz examined a different supercomputing environment, the NCAR Mass Storage System (MSS) (14). They analyzed 24 months of trace data gathered from system logs. They found that although 66% of all transactions were received by the archive disks, these transactions accounted for only 1% of the data traffic. On the other hand, even though only 2% of the transactions were received by the manually loaded tape cartridge system, these transactions constituted 66% of the data traffic in the archive system. They found that the average interval between mass storage system requests was 18 seconds. However, 9% of all references were followed by another one in less than 1 second, which shows that I/Os were clustered.

1.3 SIMULATION FRAMEWORK

We simulated the behavior of two different tape libraries under the supercomputing workloads using an event-driven tape library simulator. In this section, we describe the supercomputing workloads, the tape libraries, and the tape library simulator used in this work.

1.3.1 Supercomputing Workloads

We analyzed the performance of various tape library configurations for the NCAR Mass Storage System (MSS), NCSA Common File System (CFS), and NCSA UniTree File Archive System (UFAS) workloads. All three workloads are determined by the file access requests received by the tape archive part of the corresponding mass storage system.

We determined the NCSA CFS and NCAR MSS workloads using the file access patterns presented in (9) and (14) respectively. The file access pattern parameters characterizing these workloads are presented in Table 1.1.

Table 1.1  File Access Pattern Parameters for NCAR MSS and NCSA CFS Workloads

  Parameter                         NCAR MSS    NCSA CFS
  File interreference time          161 sec     329 sec
  Request placement distribution    Zipf        Zipf
  Percentage write accesses         33%         34%
  Request size distribution         Uniform     Uniform
  Request size mean                 8 MB        17 MB

For both systems, we assumed a Uniform request size distribution and a Zipf request placement distribution. Since, in both systems, the input/outputs are clustered and, for re-referenced files, the second access came soon after the first one, we used the Zipf distribution to imitate this behavior.

The Zipf distribution provides a workload with access locality that follows Zipf's Law (19), i.e. the nth most popular movie of N movies is requested with probability C/n, where C = 1 / (1 + 1/2 + 1/3 + ... + 1/N). The request arrivals are generated using a Normal file interreference time distribution with the given mean values. The standard deviation of the distribution is taken to be the same as its mean value in order to provide an irregular request arrival pattern.

We determined the NCSA UFAS workload using one week of transaction records logged by UniTree (6). Request arrival times, request sizes, and request types (read or write) are taken directly from the log files. Request placement information, i.e. the request start block, is adapted to the simulated tape library configuration. The start block of a file in a secondary or tertiary storage system depends on many factors, e.g. the file allocation policy and the storage system capacity. Therefore, it is not possible to use this information directly from the log files, and it must be adapted to the capacity of the simulated storage system.

1.3.2 Tape Libraries

We simulated the behavior of two different tape libraries, the Exabyte EXB12 and the Ampex DST712, under the supercomputing workloads. The Exabyte EXB12 is a small low-end tape library produced by Exabyte Corporation (5). It contains four Exabyte EXB85 8mm helical scan tape drives. It can store up to 1.2 terabytes using 116 tapes. The Ampex DST712 is a large high-performance tape library produced by Ampex Corporation (1). It contains two Ampex DST65 19mm helical scan tape drives. It can store up to 5.8 terabytes using 116 tapes.

The tape drive and tape library models used for simulating the EXB85 and DST65 tape drives and the EXB12 and DST712 tape libraries are based on the performance parameters presented in Tables 1.2 and 1.3 respectively. Load and eject times, robot arm movement, and pick-up and place times are modeled as constant values. Search and rewind times are calculated using a model consisting of a constant startup overhead followed by a linear positioning rate. We also made the same optimistic assumption as Chervenak (3) that the devices operate at streaming rates, which holds for supercomputing environments.

Table 1.2  Performance Parameters for EXB85 and DST65 Tape Drives

  Parameter                            EXB85       DST65
  Tape Eject Time                      15 sec      5 sec
  Tape Load Time                       5 sec       5 sec
  Rewind Startup Time                  2 sec       5 sec
  Rewind Rate (after startup)          5 MB/sec    8 MB/sec
  Forward Search Startup Time          15 sec      5 sec
  Forward Search Rate (after startup)  5 MB/sec    8 MB/sec
  Read Transfer Rate                   5 KB/sec    15 MB/sec
  Write Transfer Rate                  5 KB/sec    15 MB/sec
  Tape Capacity                        5 GB        5 GB
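To illustrate how the synthetic workloads and the positioning model described above can be realized, the sketch below samples request placements from the Zipf distribution, draws interarrival times from a Normal distribution whose standard deviation equals its mean, and computes search and rewind times as a constant startup overhead plus a linear positioning term. This is our own minimal sketch, not code from the simulator; the file count and the uniform size range (chosen only to match the stated mean of 8 MB) are assumptions.

import random

def zipf_placement(n_files: int) -> int:
    """Pick a file rank in [1, n_files]: the nth most popular file is
    requested with probability C/n, where C = 1/(1 + 1/2 + ... + 1/N)."""
    harmonic = sum(1.0 / n for n in range(1, n_files + 1))
    u = random.random() * harmonic
    acc = 0.0
    for n in range(1, n_files + 1):
        acc += 1.0 / n
        if u <= acc:
            return n
    return n_files

def next_interarrival(mean_sec: float) -> float:
    """Normal interreference time with standard deviation equal to the mean,
    truncated at zero so arrival times stay monotone."""
    return max(0.0, random.gauss(mean_sec, mean_sec))

def positioning_time(distance_mb: float, startup_sec: float, rate_mb_s: float) -> float:
    """Search or rewind time: constant startup overhead followed by a
    linear positioning rate, as in the drive model above."""
    if distance_mb <= 0.0:
        return 0.0
    return startup_sec + distance_mb / rate_mb_s

# Example: one synthetic NCAR MSS request (Table 1.1 parameters)
gap = next_interarrival(161.0)            # mean interreference time, seconds
rank = zipf_placement(10000)              # hypothetical number of files
size_mb = random.uniform(0.0, 16.0)       # Uniform request size with mean 8 MB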

Table 1.3  Performance Parameters for EXB12 and DST712 Tape Libraries

  Parameter                EXB12      DST712
  Robot movement time      2 sec      1 sec
  Robot pick (grab) time   2 sec      2 sec
  Robot place (put) time   2 sec      2 sec
  Number of tape drives    4          2
  Number of robot arms     1          1
  Number of tapes          116        116
  List price               $1,        $5,

1.3.3 Tape Library Simulator

We used the event-driven tape library simulator developed by Chervenak (3). The original simulator was a closed simulator. A closed simulator keeps the number of outstanding requests (the concurrency) in the system constant throughout a simulation run by issuing a new request as soon as one request completes service. This gave Chervenak control over the workload concurrency and allowed her to use the number of concurrent requests that can be sustained for a given response time limit as the performance metric. However, in order to simulate the tape libraries using the file access traces of applications in an actual system, there should be no restrictions on the workload concurrency or on the response time of the system. Therefore, in order to simulate the workloads accurately and to report the performance values in terms of response time without putting any restrictions on the response time itself, we modified the simulator so that it works as an open simulator. In an open simulator, requests arrive at the system without any restrictions and leave the system after they are serviced. The new simulator either receives requests according to an arrival distribution, as in the case of the NCSA CFS and NCAR MSS workloads, or gets them directly from a given trace file, as in the case of the NCSA UFAS workload. We implemented a preprocessor to convert the NCSA UniTree file archive system traces into a format that can be processed by the simulator. We also modified the simulator to accept this trace file directly, instead of generating requests according to an arrival distribution.

The simulator is event-driven. A queue of relevant events is maintained and processed in time order. The events considered during a simulation run are arrival (a request enters the system), device free (a drive finishes processing a request), arm free (the robot arm finishes moving a tape), and rewind and eject done (a drive finishes rewinding and ejecting a loaded tape). In a simulation run, statistics are gathered only after a certain number of requests has been processed, in order to avoid the initial transient tape library state. This improves the reliability of the results. We used a threshold value of 5 requests in our simulations. The simulator reports the tape library performance in terms of the mean response time (latency).
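The open, event-driven structure described above can be summarized by the skeleton below. It is a simplified sketch of our own, not the modified Chervenak simulator: robot arm contention and the separate rewind-and-eject-done event are folded into a caller-supplied service time, requests wait in FCFS order for a free drive, and statistics are gathered only after a warmup threshold of completed requests (the default value here is a placeholder).

import heapq
import itertools
from collections import deque

def simulate(arrivals, service_time, num_drives, warmup=500):
    """arrivals: iterable of (arrival_time, request) pairs.
    service_time(request): total time a drive is busy with the request
    (exchange, load, search, transfer, rewind, eject).
    Returns the mean response time over requests after the warmup period."""
    seq = itertools.count()
    events = []                                   # (time, seq, kind, payload)
    for t, req in arrivals:
        heapq.heappush(events, (t, next(seq), "arrival", req))

    free_drives = num_drives
    waiting = deque()                             # requests seen while all drives were busy
    done = counted = 0
    response_sum = 0.0

    def start(now, t_arrival, req):
        nonlocal free_drives
        free_drives -= 1
        finish = now + service_time(req)
        heapq.heappush(events, (finish, next(seq), "device_free", (t_arrival, req)))

    while events:
        now, _, kind, payload = heapq.heappop(events)
        if kind == "arrival":
            if free_drives > 0:
                start(now, now, payload)
            else:
                waiting.append((now, payload))
        else:                                     # a drive finished a request
            free_drives += 1
            t_arrival, _req = payload
            done += 1
            if done > warmup:                     # skip the initial transient state
                response_sum += now - t_arrival
                counted += 1
            if waiting:
                t0, req = waiting.popleft()
                start(now, t0, req)
    return response_sum / counted if counted else 0.0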

1.4 SIMULATION RESULTS

We first simulated the behavior of the EXB12 and DST712 tape libraries under each workload. We simulated 1 request arrivals for the NCAR MSS and NCSA CFS workloads, and 75 request arrivals for the NCSA UFAS workload. Figure 1.3 shows the mean response times of both tape libraries for each workload.

Jensen and Reed reported the response time of the NCSA CFS tape archive as 546 seconds (9). Only manually loaded tape drives are used in the NCSA CFS tape archive. The response time of the EXB12 and DST712, even with 4 tape drives, is better than the NCSA CFS tape archive performance.

Miller and Katz reported the response time of the NCAR MSS tape archive as 14 seconds (14). A StorageTek 44 tape library with a tape load time of 6 seconds and a transfer rate of 6 MB/sec is used in the NCAR tape archive. The EXB12 tape library, even with 16 tape drives, performed worse than the current tape archive system at NCAR. However, the DST712 tape library performed better than the current tape archive system at NCAR.

Finestead reported the average response time of the current NCSA UniTree tape archive as 27 seconds (6). 12 STK348 tape drives with a transfer rate of 3 MB/sec and 8 Metrum tape drives with a transfer rate of 2 MB/sec, all of which are manually loaded, are used in the UniTree tape archive. As in the case of the NCAR MSS workload, the EXB12 tape library, even with 16 tape drives, performed worse than the current tape archive system at NCSA. However, the DST712 tape library performed better than the current tape archive system at NCSA.

These results show that tape libraries with adequate performance parameters perform quite well under the supercomputing workloads. The decrease in the response times of both tape libraries when going from a single tape drive to 4 tape drives demonstrates the importance of input/output parallelism. However, tape drive parallelism without adequate tape drive and robot arm performance has only a limited effect on the ability of a tape library to handle supercomputing workloads.

1.4.1 Improving Overall Performance of Tape Libraries

In this section, we examine the effect of improving overall tape library performance by certain factors on the performance of tape libraries for supercomputing workloads. We simulated the behavior of two tape libraries that are twice as fast and ten times as fast as the EXB12 tape library. The performance parameters for these tape libraries are shown in Table 1.4. We simulated three different tape library configurations for each speedup factor; e.g. for the speedup factor of 2, we simulated a tape library with a tape drive twice as fast as the original and an unchanged robot arm, a tape library with a robot arm twice as fast as the original and an unchanged tape drive, and a tape library twice as fast as the original EXB12 in both respects. We simulated each tape library configuration with both 4 and 16 tape drives.
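The configuration sweep just described can be expressed as a simple parameter transformation: times are divided by the speedup factor and rates are multiplied by it, applied independently to the robot arm and to the tape drive, and each resulting library is simulated with 4 and 16 drives. The sketch below is our own illustration; the parameter dictionaries and names are assumptions, the base values are the EXB12/EXB85 figures as printed in Tables 1.2 and 1.3, and the configuration labels anticipate the R/T notation used with Figure 1.4.

# Times in seconds, rates as printed in Table 1.2 (transfer rate in KB/sec as listed)
EXB85_DRIVE = {"load": 5, "eject": 15, "rewind_startup": 2, "rewind_rate": 5,
               "search_startup": 15, "search_rate": 5, "transfer_rate": 5}
EXB12_ROBOT = {"move": 2, "pick": 2, "place": 2}

def speed_up(params: dict, factor: float) -> dict:
    """Scale one component: divide times by the factor, multiply rates by it."""
    return {k: (v * factor if k.endswith("rate") else v / factor)
            for k, v in params.items()}

# R-T is the original library; fR / fT denote a robot arm or tape drive
# that is f times faster.  Each configuration is run with 4 and 16 drives.
configs = [("R-T", n, EXB12_ROBOT, EXB85_DRIVE) for n in (4, 16)]
for f in (2, 10):
    for robot_f, drive_f, name in [(1, f, f"R-{f}T"),
                                   (f, 1, f"{f}R-T"),
                                   (f, f, f"{f}R-{f}T")]:
        for n_drives in (4, 16):
            configs.append((name, n_drives,
                            speed_up(EXB12_ROBOT, robot_f),
                            speed_up(EXB85_DRIVE, drive_f)))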

[Figure 1.3: Performance of EXB12 and DST712 Tape Libraries. Panels 3a (NCSA CFS Workload), 3b (NCAR MSS Workload), and 3c (NCSA UFAS Workload) plot mean response time against the number of tape drives for the EXB12 and DST712.]

Table 1.4  Performance Parameters for EXB12 Tape Library at Various Speedup Factors (Original EXB12, 2X, 10X)

  Robot movement time (sec)
  Robot pick (grab) time (sec)
  Robot place (put) time (sec)
  Tape Eject Time (sec)
  Tape Load Time (sec)
  Rewind Startup Time (sec)
  Rewind Rate (after startup) (MB/sec)
  Forward Search Startup Time (sec)
  Forward Search Rate (after startup) (MB/sec)
  Read Transfer Rate (KB/sec)
  Write Transfer Rate (KB/sec)

Figure 1.4 presents the mean response times of all the tape library configurations for each supercomputing workload. In this figure, R-T (robot arm - tape drive) represents the original EXB12 tape library. 2R represents a robot arm that is two times faster, and 2T represents a tape drive that is two times faster. Similarly, 10R represents a ten times faster robot arm, and 10T represents a ten times faster tape drive. Each combination of R and T represents a corresponding tape library configuration. For example, R-10T represents a tape library with an unchanged robot arm speed, the same as the EXB12 robot arm speed, and a tape drive ten times as fast as the original EXB85.

The graphs in the figure show that improving the overall performance of the tape libraries had a similar effect on tape library performance for all three workloads. The simulation results show that improving the performance of the robot arm only slightly improved the performance of the tape libraries for the supercomputing workloads. However, improving the performance of the tape drives significantly decreased the tape library response times. This indicates that, in order to improve the effectiveness of a tape library for a supercomputing workload, improving the performance of the tape drives is more cost effective than improving the performance of the robot arms.

1.4.2 Improving Individual Properties of Tape Drives

In this section, we examine the effect of improving individual tape drive performance parameters on tape library performance, in particular the tape drive transfer rate and the tape drive search and rewind rate. We simulated the behavior of the EXB12 tape library with four EXB85 tape drives under the supercomputing workloads by varying a single parameter at a time and keeping the others constant at their original values. This way, we assessed the effect of improving a single performance parameter on the overall tape library performance.

[Figure 1.4: Performance of EXB12 Tape Library at Various Speedup Factors. Panels 4a (NCSA CFS Workload), 4b (NCAR MSS Workload), and 4c (NCSA UFAS Workload) plot mean response time for the configurations R-T, R-2T, 2R-T, 2R-2T, R-10T, 10R-T, and 10R-10T, each with 4 and 16 tape drives.]

Figure 1.5 presents the mean response time of the EXB12 tape library under the supercomputing workloads for each individual parameter improvement. The graphs in this figure show that improving individual properties of the tape drives had a similar effect on tape library performance for all three workloads. The simulation results show that improving the tape drive transfer rate only slightly decreased the tape library response times for the supercomputing workloads. Improving the tape drive search and rewind rate, however, decreased the tape library response times significantly. This indicates that, in order to improve the effectiveness of a tape library for a supercomputing workload, improving the tape drive search and rewind rate is more cost effective than improving the tape drive transfer rate.

1.4.3 Ejecting and Loading Tapes at Arbitrary Positions

In this section, we examine the effect of ejecting and loading tapes at arbitrary positions, rather than fully rewinding the tapes before ejecting them, on the performance of tape libraries for supercomputing workloads. This is becoming a practical way of using tape cartridges, as several tape drive manufacturers are reducing rewind and search times by implementing periodic zones on the tapes where eject and load operations are allowed (1, 5). Since Chervenak's tape library simulator (3) requires that a tape be fully rewound before an eject operation can be performed, we modified the simulator to allow this behavior. This is achieved by keeping track of the position of each tape cartridge, so that when a tape is reloaded into a tape drive, the tape block number under the tape head is known.

We simulated the behavior of the EXB12 and DST712 tape libraries for the NCAR MSS and NCSA CFS workloads using the new simulator. Figure 1.6 shows the mean response times of both the original and the improved tape libraries for the two workloads. The graphs in this figure show that this technique improves the tape library performance by 1% for both workloads.
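The simulator modification described above boils down to remembering, for every cartridge, the block under the head when the tape was ejected, and charging only the incremental positioning cost on the next load. The class below is our own sketch of that bookkeeping under the startup-plus-linear-rate model; it is not code from the modified simulator, and the periodic load/eject zones are ignored for simplicity.

class PositionTracker:
    """Remembers the block under the head for each cartridge, so that a
    reloaded tape need not be searched from the beginning of the tape."""

    def __init__(self):
        self.position = {}                     # cartridge id -> head position (MB offset)

    def eject(self, cart_id, head_pos_mb, eject_time):
        # Eject at the current position instead of first rewinding to block 0.
        self.position[cart_id] = head_pos_mb
        return eject_time

    def load_and_seek(self, cart_id, target_mb, load_time,
                      search_startup, search_rate, rewind_startup, rewind_rate):
        # The position under the head is known from the last eject (0 for a fresh tape).
        start = self.position.get(cart_id, 0.0)
        distance = target_mb - start
        if distance == 0:
            return load_time
        if distance > 0:                       # forward search
            return load_time + search_startup + distance / search_rate
        return load_time + rewind_startup + (-distance) / rewind_rate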

[Figure 1.5: Effect of Varying Tape Drive Transfer Rate and Forward Search/Rewind Rate on the Performance of EXB12 Tape Library. Panels 5a (NCSA CFS Workload), 5b (NCAR MSS Workload), and 5c (NCSA UFAS Workload) plot mean response time against the transfer rate (MB/sec) and the forward search/rewind rate (MB/sec).]

[Figure 1.6: Performance of Tape Libraries that Eject and Load Tapes at Arbitrary Positions. Panels 6a (NCSA CFS Workload) and 6b (NCAR MSS Workload) plot mean response time against the number of tape drives for the EXB12 and DST712, with and without arbitrary-position eject.]

1.5 CONCLUSIONS

In this paper, we analyzed the performance of various robotically accessed tape libraries for supercomputing workloads. We also assessed the impact of the performance of various tape library components on the overall performance of the tape libraries for these workloads. We used an event-driven simulator to conduct the performance analysis, and we considered the workloads at the National Center for Atmospheric Research (NCAR) and the National Center for Supercomputing Applications (NCSA).

The simulation results showed that although the performance of the EXB12 tape library for the NCAR MSS and NCSA CFS workloads is reasonable, it behaves poorly for the NCSA UFAS workload. However, the Ampex DST712 performs quite well on all three workloads. This shows that robot-controlled tape libraries with adequate performance parameters are promising to be an important part of the mass storage systems used in supercomputing environments.

The results also showed that parallelism in tape libraries, i.e. having more than one tape drive, increases performance significantly. The robot arm speed has less impact on overall tape library performance than the tape drive speed. The tape drive search and rewind rate has more impact on overall tape library performance than the tape drive transfer rate. In other words, the tape drive forward search and rewind rate is the bottleneck in tape libraries used under supercomputing workloads. Ejecting and loading tapes at arbitrary positions, rather than fully rewinding the tapes before ejecting them, improves the performance of the tape libraries by 1%.

Acknowledgments

We thank Ann Chervenak for providing us with her tape library simulator. Thanks also to Arlan Finestead for providing the NCSA UniTree log files.

References

[1] Ampex Corporation,
[2] C. Baillie, J. Michalakes, and R. Skalin, "Regional Weather Modeling on Parallel Computers", Parallel Computing, vol. 23, December.
[3] A. L. Chervenak, "Tertiary Storage: An Evaluation of New Applications", PhD Thesis, University of California at Berkeley.
[4] A. L. Chervenak, "Challenges for Tertiary Storage in Multimedia Servers", Parallel Computing, vol. 24, January.
[5] Exabyte Corporation,
[6] A. Finestead, Personal Communication.
[7] A. Finestead and N. Yeager, "Performance of a Distributed Superscalar Storage Server", Technical Report, National Center for Supercomputing Applications, October.
[8] R. Jain, "The Art of Computer Systems Performance Analysis", John Wiley & Sons.
[9] D. Jensen and D. Reed, "File Archive Activity in a Supercomputing Environment", ACM Int. Conf. on Supercomputing, July.
[10] T. Johnson, "An Analytical Performance Model of Robotic Storage Libraries", Int. Conf. on Performance Theory, Measurement and Evaluation of Computer and Communication Systems, October.
[11] T. Johnson and E. L. Miller, "Performance Measurements and Models of Tertiary Storage Devices", Int. Conf. on Very Large Databases, pp. 5-61, August.
[12] H. Kargupta, I. Hamzaoglu, B. Stafford, V. Hanagandi, and K. Buescher, "PADMA: PArallel Data Mining Agents For Scalable Text Classification", High Performance Computing Conference, April.
[13] H. Kargupta, I. Hamzaoglu, and B. Stafford, "Scalable, Distributed Data Mining Using An Agent Based Architecture", Int. Conf. on Knowledge Discovery and Data Mining, August.
[14] E. L. Miller and R. Katz, "An Analysis of File Migration in a Unix Supercomputing Environment", USENIX Conference, January.
[15] O. Pentakalos, D. Menasce, M. Halem, and Y. Yesha, "Analytical Performance Modeling of Hierarchical Mass Storage Systems", IEEE Transactions on Computers, vol. 46, October.
[16] H. Simitci, "Parallel Solution Alternatives for Sparse Triangular Systems in Interior Point Methods", High Performance Computing Systems and Applications, edited by J. Schaeffer, Kluwer Academic Publishers, October.
[17] H. Simitci and D. A. Reed, "A Comparison of Logical and Physical Parallel I/O Patterns", Int. Journal of Supercomputer Applications and High Performance Computing, May.
[18] User's Guide to the NCSA UniTree Archival System, National Center for Supercomputing Applications, May.
[19] G. K. Zipf, "Human Behavior and Principle of Least Effort: An Introduction to Human Ecology", Addison Wesley, 1949.


More information

RESPONSIVENESS IN A VIDEO. College Station, TX In this paper, we will address the problem of designing an interactive video server

RESPONSIVENESS IN A VIDEO. College Station, TX In this paper, we will address the problem of designing an interactive video server 1 IMPROVING THE INTERACTIVE RESPONSIVENESS IN A VIDEO SERVER A. L. Narasimha Reddy ABSTRACT Dept. of Elec. Engg. 214 Zachry Texas A & M University College Station, TX 77843-3128 reddy@ee.tamu.edu In this

More information

Nowadays data-intensive applications play a

Nowadays data-intensive applications play a Journal of Advances in Computer Engineering and Technology, 3(2) 2017 Data Replication-Based Scheduling in Cloud Computing Environment Bahareh Rahmati 1, Amir Masoud Rahmani 2 Received (2016-02-02) Accepted

More information

Implementing and Evaluating Jukebox Schedulers Using JukeTools

Implementing and Evaluating Jukebox Schedulers Using JukeTools Implementing and Evaluating Schedulers Using JukeTools Maria Eva Lijding Sape Mullender Pierre Jansen Fac. of Computer Science, University of Twente P.O.Box 217, 7500AE Enschede, The Netherlands E-mail:

More information

CS252 S05. CMSC 411 Computer Systems Architecture Lecture 18 Storage Systems 2. I/O performance measures. I/O performance measures

CS252 S05. CMSC 411 Computer Systems Architecture Lecture 18 Storage Systems 2. I/O performance measures. I/O performance measures CMSC 411 Computer Systems Architecture Lecture 18 Storage Systems 2 I/O performance measures I/O performance measures diversity: which I/O devices can connect to the system? capacity: how many I/O devices

More information

Google File System. Arun Sundaram Operating Systems

Google File System. Arun Sundaram Operating Systems Arun Sundaram Operating Systems 1 Assumptions GFS built with commodity hardware GFS stores a modest number of large files A few million files, each typically 100MB or larger (Multi-GB files are common)

More information

Operating system Dr. Shroouq J.

Operating system Dr. Shroouq J. 2.2.2 DMA Structure In a simple terminal-input driver, when a line is to be read from the terminal, the first character typed is sent to the computer. When that character is received, the asynchronous-communication

More information

Chapter 12: Mass-Storage

Chapter 12: Mass-Storage hapter 12: Mass-Storage Systems hapter 12: Mass-Storage Systems To explain the performance characteristics of mass-storage devices To evaluate disk scheduling algorithms To discuss operating-system services

More information

Distributed Video Systems Chapter 5 Issues in Video Storage and Retrieval Part I - The Single-Disk Case

Distributed Video Systems Chapter 5 Issues in Video Storage and Retrieval Part I - The Single-Disk Case Distributed Video Systems Chapter 5 Issues in Video Storage and Retrieval Part I - he Single-Disk Case Jack Yiu-bun Lee Department of Information Engineering he Chinese University of Hong Kong Contents

More information

Application DBMS. Media Server

Application DBMS. Media Server Scheduling and Optimization of the Delivery of Multimedia Streams Using Query Scripts Scott T. Campbell (scott@cc-campbell.com) Department of Computer Science and Systems Analysis, Miami University, Oxford,

More information

Chapter 12: Mass-Storage

Chapter 12: Mass-Storage hapter 12: Mass-Storage Systems hapter 12: Mass-Storage Systems Overview of Mass Storage Structure Disk Structure Disk Attachment Disk Scheduling Disk Management RAID Structure Objectives Moving-head Disk

More information

Table 9. ASCI Data Storage Requirements

Table 9. ASCI Data Storage Requirements Table 9. ASCI Data Storage Requirements 1998 1999 2000 2001 2002 2003 2004 ASCI memory (TB) Storage Growth / Year (PB) Total Storage Capacity (PB) Single File Xfr Rate (GB/sec).44 4 1.5 4.5 8.9 15. 8 28

More information

Automated Clustering-Based Workload Characterization

Automated Clustering-Based Workload Characterization Automated Clustering-Based Worload Characterization Odysseas I. Pentaalos Daniel A. MenascŽ Yelena Yesha Code 930.5 Dept. of CS Dept. of EE and CS NASA GSFC Greenbelt MD 2077 George Mason University Fairfax

More information

Eect of fan-out on the Performance of a. Single-message cancellation scheme. Atul Prakash (Contact Author) Gwo-baw Wu. Seema Jetli

Eect of fan-out on the Performance of a. Single-message cancellation scheme. Atul Prakash (Contact Author) Gwo-baw Wu. Seema Jetli Eect of fan-out on the Performance of a Single-message cancellation scheme Atul Prakash (Contact Author) Gwo-baw Wu Seema Jetli Department of Electrical Engineering and Computer Science University of Michigan,

More information

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili Virtual Memory Lecture notes from MKP and S. Yalamanchili Sections 5.4, 5.5, 5.6, 5.8, 5.10 Reading (2) 1 The Memory Hierarchy ALU registers Cache Memory Memory Memory Managed by the compiler Memory Managed

More information

Distributed File Systems II

Distributed File Systems II Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation

More information

Implementing a Digital Video Archive Based on the Sony PetaSite and XenData Software

Implementing a Digital Video Archive Based on the Sony PetaSite and XenData Software Based on the Sony PetaSite and XenData Software The Video Edition of XenData Archive Series software manages a Sony PetaSite tape library on a Windows Server 2003 platform to create a digital video archive

More information

Consistent Logical Checkpointing. Nitin H. Vaidya. Texas A&M University. Phone: Fax:

Consistent Logical Checkpointing. Nitin H. Vaidya. Texas A&M University. Phone: Fax: Consistent Logical Checkpointing Nitin H. Vaidya Department of Computer Science Texas A&M University College Station, TX 77843-3112 hone: 409-845-0512 Fax: 409-847-8578 E-mail: vaidya@cs.tamu.edu Technical

More information