Adaptive disk scheduling for overload management


Alma Riska, Erik Riedel, Sami Iren
Seagate Research, 1251 Waterfront Place, Pittsburgh, PA

Abstract

Most computer systems today are lightly loaded in normal operation. The real performance problems occur during burst periods, when the system becomes overloaded. We evaluate how the choice of scheduling algorithm can assist a system in maintaining stable performance while operating under transient overloads. We propose a new disk scheduling algorithm that efficiently handles overload by dynamically adjusting its parameters. The algorithm adapts its operation to the current load conditions and achieves good overall performance, while maintaining minimal variability in request response time. We evaluate the robustness of the algorithm against different disks and against synthetic and realistic traces measured in benchmarked systems.

1. Introduction

Today's computer systems, in particular those supporting Internet applications, are characterized by swift and sharp fluctuations in load [15]. Although designers aim to provide systems with enough resources to sustain the worst-case load, the dynamics of applications and the event-driven nature of the request intensity make the worst-case scenario difficult to predict. The burstiness in request arrivals propagates through all layers of the system, including the network subsystem, memory and caches at multiple levels, and finally the storage subsystem. As an example of burstiness in storage subsystems, previous measurements indicate that load fluctuates severely, reaching as many as outstanding requests even when traditional (non-Internet) applications generate the storage subsystem workload [9]. Clearly, the long-term solution for handling persistent overload conditions, i.e., long request queues, is to increase the system resources.
However, if the system experiences unexpected transient overload conditions, then better management of the available resources and adjustment of system operation can avoid system collapse and allow for graceful degradation of performance. Given the sudden nature of transient overload conditions, we focus on ways that allow the system to adapt its operation to the current load conditions without human intervention. Because our goal is to handle sharp and transient increases in load intensity, we focus on ways to adjust system operation at fine time scales, i.e., time scales comparable to the average request service time. At the disk level, we achieve this goal by proposing an adaptive disk scheduling algorithm based on disk-level characteristics. The algorithm adapts its operation on-the-fly based on the disk load. By disk-level characteristics, we mean the disk properties that are used for efficient scheduling, such as the relative seek and rotational latency of every request.

In addition to building self-adjusting storage subsystems via load-based adaptive disk scheduling algorithms, effort has also been made to adapt the disk layout to the disk access pattern and achieve better locality and sequentiality in the stream of requests [1]. In contrast to our approach, that work focuses on coarser time scales (i.e., days) rather than the fine time scales (i.e., milliseconds) that we consider. Various model-based approaches have been proposed for adaptive resource management in storage subsystems [1] and in higher layers of the system hierarchy [3].

The remainder of the paper is organized as follows. Section 2 presents an overview of disk scheduling algorithms. We describe the synthetic workload that we use in our simulation-based analysis in Section 3. The evaluation and analysis that leads to our adaptive algorithm is presented in Section 4. We introduce our algorithm and analyze its performance in Section 5.
We continue with the performance analysis of our adaptive algorithm under a real workload in Section 6. We summarize our results and conclude in Section 7.

Proceedings of the First International Conference on the Quantitative Evaluation of Systems (QEST'04), © 2004 IEEE.

2. Background

Apart from FCFS, there are two major categories of disk scheduling algorithms, namely seek-based and position-based. Seek-based disk algorithms, such as SCAN, LOOK, and

Shortest Seek Time First (SSTF), schedule first the request with the shortest seek time, where seek time is the time the disk head needs to move from one track to another(1). Position-based algorithms, such as Shortest Positioning Time First (SPTF), schedule first the request with the shortest seek plus rotational latency, where rotational latency is the time the platter has to rotate until the requested sector reaches the head. Numerous papers provide comprehensive analyses of these disk scheduling algorithms [11, 5, 16].

Position-based disk scheduling algorithms achieve the best overall performance [5, 2]. The optimal disk scheduling algorithm generates the schedule such that the time to serve all outstanding requests is minimal [2], rather than selecting the request with the shortest positioning time as SPTF does. The optimal disk scheduling algorithm is computationally expensive and, in practice, disk drives use SPTF to schedule requests. As such, we use SPTF as the base case in our evaluation. Although position- and seek-based disk scheduling algorithms might introduce high variation in request response times [16], their performance benefits outweigh this drawback [8].

A variant of the SPTF algorithm that focuses on reducing the variation in request response time is the Batched SPTF (B-SPTF) algorithm [5]. This algorithm partitions requests into batches and applies SPTF only over the requests of the first (oldest) batch. Upon completion of the requests in the first batch, the algorithm continues with the requests in the second batch. Another variation of the B-SPTF scheduling algorithm, Leaky B-SPTF, allows new requests to be admitted into the batch of requests in service, if a schedule can be found that does not violate the deadline to completely serve the current batch [5]. We propose a very similar algorithm called Window-Based SPTF (WB-SPTF). This algorithm applies SPTF over requests that fall within a sliding time window rather than over batches of requests.
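As a concrete illustration, the selection rules of these schedulers can be sketched in a few lines of Python. The geometry and timing constants below are toy values, not those of any real drive, and WB-SPTF is shown with the window anchored at the oldest outstanding arrival, as described below.

```python
ROTATION_MS = 8.33           # one revolution at 7,200 rpm (toy value)
SEEK_MS_PER_TRACK = 0.01     # linear seek-time model (toy value)
SECTORS_PER_TRACK = 64

def positioning_time(head_track, head_angle_ms, track, sector):
    """Seek time plus rotational latency to reach (track, sector)."""
    seek = abs(track - head_track) * SEEK_MS_PER_TRACK
    target = (sector / SECTORS_PER_TRACK) * ROTATION_MS
    angle_after_seek = (head_angle_ms + seek) % ROTATION_MS
    return seek + (target - angle_after_seek) % ROTATION_MS

def sstf_pick(queue, head_track):
    """Seek-based: shortest seek distance first."""
    return min(queue, key=lambda r: abs(r["track"] - head_track))

def sptf_pick(queue, head_track, head_angle_ms):
    """Position-based: shortest seek + rotational latency first."""
    return min(queue, key=lambda r: positioning_time(
        head_track, head_angle_ms, r["track"], r["sector"]))

def wb_sptf_pick(queue, head_track, head_angle_ms, window_ms):
    """SPTF restricted to requests that arrived within window_ms
    of the oldest outstanding request."""
    oldest = min(r["arrival"] for r in queue)
    eligible = [r for r in queue if r["arrival"] <= oldest + window_ms]
    return sptf_pick(eligible, head_track, head_angle_ms)
```

For example, a request on a nearby track but at an awkward rotational offset wins under SSTF yet loses under SPTF, which is exactly the distinction between the two categories.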
The arrival time of the oldest request in the queue serves as the starting reference for the time window in the WB-SPTF scheduling algorithm. The performance of batch-based disk scheduling algorithms, which define the batches either by number of requests (B-SPTF) or by time window (WB-SPTF), depends on a single parameter, namely the size of the batch or the size of the time window. The focus and contribution of this paper is the analysis and evaluation of the effect of the window size on WB-SPTF performance, as well as the new adaptive algorithm, Dynamic WB-SPTF, that we propose based on this analysis. The Dynamic WB-SPTF algorithm adjusts its window size according to the load in the system. The algorithm aims to find the right window size for a given system load and applies the SPTF algorithm only over the requests that fall within the current window. The algorithm is self-adjusting because it adapts its parameters to the load conditions. Dynamic WB-SPTF performance is better than or similar to that of SPTF, depending on workload characteristics, and is similar to that of WB-SPTF with the optimal window size. Our analysis shows that Dynamic WB-SPTF is robust and performs well for different types of workloads and hardware (i.e., disk) characteristics.

(1) On a disk, data is stored in circular tracks, which are logically partitioned into sectors.

3. Synthetic Workload Characterization

Our objective is to use request scheduling to improve disk performance under fluctuating request arrival intensities. Bursty arrivals might result in long disk queues (i.e., higher than 1), causing overload periods. An overload condition at the disk depends on two factors, namely the request arrival intensity and the workload characteristics, represented by the amount of locality and sequentiality in the disk access pattern.
Our initial analysis is based on trace-driven simulation using a synthetic trace characterized by fluctuations in both arrival intensity and workload characteristics. The synthetic trace consists of 64,726 requests and spans a time interval of 400 seconds. The arrival process of the trace, depicted in Figure 1(a), has three intervals of high arrival intensity that cause overload at the disk. These intervals are (100, 140), (200, 240), and (325, 365) seconds, which we denote by B, D, and F, respectively. Each interval of overload is followed by an interval of low load in the system, i.e., intervals A, C, E, and G in Figure 1, so that the effect of the overload on system performance is not carried over from one interval to the next.

Our synthetic trace is a mix of random and sequential disk accesses, as shown in Figure 1(b). The synthetic trace has no locality in the disk access pattern. The sequential accesses are obtained by assuming several simultaneous video streams. With our synthetic trace, we aim to analyze system behavior under overload in three different scenarios: (1) under a random workload, as in interval B; (2) under a mix of random and sequential workload, as in interval D; and (3) under only sequential streams, as in interval F. Table 1 highlights the load and workload characteristics in each of the intervals that we define in Figure 1.

In our analysis, we use DiskSim 2.0 [4] as the disk-level simulator. The simulator is appropriately modified to accommodate the WB-SPTF and Dynamic WB-SPTF scheduling algorithms. The disk that we simulate in all our experiments with the synthetic trace described in Figure 1 is the Seagate Barracuda with 7,200 rpm and a total of 4,11, blocks. In Section 6, we present another set of experiments with measured data on the Seagate Cheetah with 10,000 rpm and a total of 17,783,240 blocks.
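A trace with this alternating low-load/overload structure can be sketched by switching the rate of a Poisson arrival process between phases. The rates and durations below are illustrative, not the parameters of the paper's actual trace.

```python
import random

def bursty_arrivals(phases, seed=42):
    """Generate arrival times (seconds) from a Poisson process whose
    rate switches between phases, mimicking alternating low-load and
    overload intervals such as A-G in Figure 1. `phases` is a list of
    (duration_s, rate_per_s) pairs.
    """
    rng = random.Random(seed)
    start, arrivals = 0.0, []
    for duration, rate in phases:
        t = start
        while True:
            t += rng.expovariate(rate)   # exponential inter-arrival times
            if t >= start + duration:
                break
            arrivals.append(t)
        start += duration
    return arrivals

# Low load (10 req/s), an overload burst (500 req/s), then low load again:
trace = bursty_arrivals([(40, 10), (40, 500), (40, 10)])
```

Feeding such a trace to a simulator reproduces the qualitative shape of Figure 1(a): quiet phases separated by a sharp burst during which the queue builds up.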
Note that Barracuda is an ATA disk drive, while Cheetah is a SCSI disk drive. By selecting different hardware to test our algorithm, we aim to evaluate its robustness in different environments.

Figure 1. Characteristics of the synthetic trace used in our analysis: (a) arrival rate and (b) disk access pattern.

Table 1. Characteristics of the synthetic trace per interval of time

  Interval:  A       B       C       D                  E                  F           G
  Load:      Low     High    Low     High               Low                High        Low
  Workload:  Random  Random  Random  Random+Sequential  Random+Sequential  Sequential  Sequential

4. Performance of FCFS, SPTF, and WB-SPTF under Overload

First, we evaluate the impact that transient overloads, such as those depicted in Figure 1, have on disk performance. Initially, we compare the performance of the FCFS and SPTF scheduling algorithms. Figure 2 illustrates the individual response time of each request in the trace as a function of its arrival time. Observe that the performance of FCFS is poor (average response time is ms) compared to the performance of SPTF (average response time is 1565 ms). In addition, we compute the standard deviation of response time as an important metric that measures variability in disk performance. The variability introduced by SPTF in request response time, i.e., the standard deviation, is 3293 ms. Under transient overload, the SPTF response time standard deviation (3293 ms), although high (more than twice the average response time), is less severe than the FCFS response time standard deviation, which is in the same range as the FCFS average response time and is therefore very high. Note that in Figure 2, the FCFS data is incomplete because the simulation could not finish the entire trace due to limited resources, i.e., queues longer than the available buffer space. Disk performance improves further by applying the WB-SPTF scheduling algorithm instead of pure SPTF.
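A trace-driven comparison of this kind can be mimicked with a small event loop. The sketch below is a self-contained toy (constant service time, pluggable scheduler), standing in for a real disk simulator such as DiskSim; it reports exactly the two metrics used above, the mean and the standard deviation of response time.

```python
import statistics

def simulate(trace, service_time_ms, scheduler):
    """Toy trace-driven queueing loop. `trace` is a time-sorted list of
    (arrival_ms, request) pairs; `scheduler(queue)` returns the index
    of the next request to serve. Returns (mean, stdev) of the
    per-request response times (completion minus arrival).
    """
    queue, resp, now, i = [], [], 0.0, 0
    while i < len(trace) or queue:
        if not queue:                          # idle: jump to next arrival
            now = max(now, trace[i][0])
        while i < len(trace) and trace[i][0] <= now:
            queue.append(trace[i])             # admit arrivals up to `now`
            i += 1
        arrival, _req = queue.pop(scheduler(queue))
        now += service_time_ms                 # serve the chosen request
        resp.append(now - arrival)
    return statistics.mean(resp), statistics.pstdev(resp)

fcfs = lambda queue: 0                         # oldest request first
```

A position-based policy is plugged in by replacing `fcfs` with a function that ranks the queued requests by estimated positioning time; the constant service time here is only a placeholder for that model.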
Figure 2 plots the response time of individual requests as a function of arrival time for the SPTF and WB-SPTF() scheduling algorithms, where WB-SPTF() indicates the Window-Based SPTF scheduling algorithm with a window size of milliseconds. Note that, in cases of overload, the response time under WB-SPTF() has a lower standard deviation (only 1657 ms) compared to the SPTF response time standard deviation (3293 ms). The average response times of the two scheduling algorithms are in the same range. For medium or light load, WB-SPTF() and SPTF perform essentially the same (i.e., the window size is large enough to allow all outstanding requests to be included in the batch of requests over which the SPTF scheduling algorithm applies).

The window size of milliseconds for the WB-SPTF scheduling algorithm is picked to demonstrate the benefits of applying SPTF only over a portion of the outstanding requests once the system operates in overload. Our experiments show that other window sizes generate better results than WB-SPTF(), not only overall, but particularly for individual overload time intervals. We evaluate the performance of WB-SPTF for six different window sizes (in milliseconds) and present the average response times and the standard deviations of response time in Figures 3 and 4, respectively. Specifically, we present the overall performance of each WB-SPTF and of pure SPTF in Figures 3(a) and 4(a), and the performance for each overload interval B, D, and F in Figures 3(b), (c), and (d) and 4(b), (c), and (d), respectively. The highlighted bar in each of the graphs of Figures 3 and 4 corresponds to the value of

the window size for which WB-SPTF performs best, while the first bar in each graph corresponds to the SPTF scheduling algorithm.

Figure 2. Individual request response time under (a) FCFS and SPTF and (b) SPTF and WB-SPTF() scheduling algorithms.

The results of Figures 3 and 4 indicate the following:

- The performance of SPTF and WB-SPTF depends on both the load in the system and the workload mix (i.e., random, sequential, or their mix).

- The performance of WB-SPTF depends on the window size. If the window size is not selected carefully, disk performance is quite poor. For example, Figure 3 illustrates how poorly WB-SPTF() performs and how the performance improves for the window size of milliseconds.

- The performance of the WB-SPTF algorithm, as of other disk scheduling algorithms, depends on the characteristics of the workload. In the case of interval B, i.e., a fully random workload, only a very large window size, which allows WB-SPTF to behave as pure SPTF, yields comparable performance between WB-SPTF and SPTF, i.e., the best performance for the interval. The trend of the best-performing window size changes as we move from random toward more sequential workloads, i.e., intervals D and F defined in Figure 1. Performance results for intervals D and F are presented in Figures 3(c) and 3(d), respectively. Observe that the more sequential the workload becomes, the better performing are the WB-SPTFs with smaller window sizes. Additionally, the performance improvement gained by using WB-SPTF instead of pure SPTF increases as the workload becomes more sequential.

- Most importantly, note that for different load and workload conditions, different window sizes have to be selected to achieve the best performance from the WB-SPTF algorithm. For example, the optimal window sizes for intervals B, D, and F differ.
- Consistently, the standard deviation of response times under the WB-SPTF scheduling algorithm is lower than under the SPTF scheduling algorithm (see all graphs in Figure 4). In particular, the gap increases when moving from a completely random workload toward a more sequential one, since the optimal window size for WB-SPTF decreases when the workload is more sequential.

The results presented in this section illustrate how request scheduling assists the disk in maintaining good overall performance under fluctuations in the arrival intensity. In cases of overload and non-random workloads, WB-SPTF yields better performance than pure SPTF. However, the performance of the WB-SPTF scheduling algorithm depends on the correct selection of its window size. Over time, different window sizes yield different performance results depending on the load and workload characteristics at the disk. In the following section, we discuss how we can dynamically change the window size of the WB-SPTF scheduling algorithm to achieve high performance for different system load and workload characteristics.

5. Dynamic WB-SPTF Algorithm

In this section, we propose a new variation of the WB-SPTF scheduling algorithm that adapts the window size on-the-fly in response to fluctuations in the disk load. The results of Figure 3 indicate that, generally, the higher (lower) the load in the system, the larger (smaller) the optimal window size for the WB-SPTF scheduling algorithm. Nevertheless, it is not trivial to dynamically update the window size as the load in the system changes. In the previous section, we showed that both load and workload characteristics affect the behavior of the scheduling algorithm. In our approach, we modify the window size of WB-SPTF based only on the system load. The workload characteristics are taken into account by the algorithm indirectly, because we measure the system load not by the number of new arrivals but by the number of outstanding requests in the system.
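This load measurement, outstanding requests rather than new arrivals, together with the in-window fraction the next section relies on, can be sketched as a small helper. The trace representation below (arrival time plus request id) is an assumption for illustration.

```python
def load_snapshot(trace, completed, now, window_ms):
    """Outstanding requests and the in-window ratio at time `now`.

    `trace` is a list of (arrival_ms, request_id) pairs and `completed`
    is the set of ids already served. The returned ratio (requests
    inside the window divided by all outstanding requests) is the
    quantity a load-adaptive scheduler can monitor.
    """
    pending = [(arr, rid) for arr, rid in trace
               if arr <= now and rid not in completed]
    if not pending:
        return 0, 1.0
    oldest = min(arr for arr, _ in pending)     # window's starting reference
    in_window = sum(1 for arr, _ in pending if arr <= oldest + window_ms)
    return len(pending), in_window / len(pending)
```

Because the outstanding count grows whenever service falls behind the arrivals, it reflects both the arrival intensity and how well the current schedule copes with the workload mix.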
For example, in the case of random accesses (see Figure 3), small window sizes for WB-SPTF cause the number of outstanding requests to increase, and the latter triggers a window size increase to improve performance. This trend continues until the window size is large enough to include all requests in the queue and the performance of WB-SPTF

nears the performance of SPTF, which in the case of fully random workloads is the best one (see Figure 3).

Figure 3. Average response time under SPTF and WB-SPTF with six window sizes (in milliseconds): (a) overall and for time intervals (b) B, (c) D, and (d) F.

Figure 4. Standard deviation of response time under SPTF and WB-SPTF with six window sizes (in milliseconds): (a) overall and for time intervals (b) B, (c) D, and (d) F.

The metric that we use to trigger a window size change is the ratio of the number of requests within the window of WB-SPTF to the total number of outstanding requests in the disk queue; we refer to this metric simply as the ratio. While a change in the number of outstanding requests in the system indicates that an update of the window size might be necessary, the ratio determines whether such an action actually takes place and in which direction, i.e., increase or decrease. The basic guidelines for the Dynamic WB-SPTF algorithm are as follows:

- Under light and medium system load, any window size is fine as long as it is large enough to include the entire set of outstanding requests at the disk. Note that, in the long run, the dynamic algorithm tends to calculate a small window size for light and medium system load.

- Under high system load, the optimal window size increases relative to the optimal window size under light or medium load, while the ratio decreases.

- Under overload, the optimal window size increases relative to the window size under high load, while the ratio decreases.

At first, these guidelines might seem counter-intuitive, because they basically state that as load increases, even though the window size increases, the portion of the requests within the window decreases.
This is related to the bursty conditions that we focus on, where the number of outstanding requests is high, the waiting time per request increases, and even large window sizes include only a fraction of the set of outstanding requests.

We define four load levels at the disk: (1) Light, (2) Medium, (3) High, and (4) Overload. Each of these load levels is determined by observing both the queue build-up and the request slowdown in the system. Based on the queue build-up and the respective request slowdown for the systems that we evaluated, we define the disk load as light when there are at most 16 outstanding requests in the system, medium when there are at most 32 outstanding requests, high when there are at most 64 outstanding requests, and, finally, the disk as overloaded when there are more than 64 outstanding requests. If there are more than 512 outstanding requests in the system, the overload is considered severe; beyond this state the system does not adapt anymore to further increases in the load, but continues to operate with the window size that it has already reached.

For each load level, we define an interval of acceptable ratios (recall that the ratio is the number of requests within the window divided by the total number of outstanding requests). The length of the acceptable interval increases, and its boundary values decrease, as the system load increases. If the current ratio is not within the acceptable interval for the current load level, then the algorithm updates the window size. The window size is increased if the ratio is below the lower boundary of the interval, and decreased if the ratio is above the upper boundary. Hence, the ratio intervals are the tools that guide the window size updates: how often they occur and the levels reached by the window size for a given system load.

Figure 5. Parameters used in the Dynamic WB-SPTF algorithm. (a) Acceptable ratios for the four load levels. (b) Window size increase (left bar) and decrease (right bar) for each load level.

The Dynamic WB-SPTF algorithm checks whether the window size needs to be updated upon completion of each request. By default, if the load in the system is light, then SPTF applies over the entire set of outstanding requests. For all other load levels, the window size increases or decreases according to the changes in the number of outstanding requests in the system. The initial window size does not affect Dynamic WB-SPTF performance in the long run. The algorithm does not introduce any computational overhead; it just adds a simple check of whether the window size is correct. The computational cost of applying SPTF on the entire, i.e., larger, set of requests is much higher, because for each request the seek + rotational latency has to be computed given the current position of the head. In times of overload, the large number of outstanding requests makes scheduling quite computationally expensive, and with Dynamic WB-SPTF we considerably reduce this cost by applying SPTF only over a fraction of the outstanding requests.

In Figure 5, we present the values of the Dynamic WB-SPTF parameters. These values are not hardware dependent; they worked well with the various hardware that we tested. A more detailed analysis of these choices is subject of future work.
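The update rule described above can be sketched as follows. The queue-length thresholds (16/32/64, severe beyond 512) follow the text; the acceptable-ratio intervals and the step sizes are invented placeholders, not the values reported in Figure 5.

```python
# (max outstanding, (lo, hi) acceptable ratio, increase ms, decrease ms)
# Ratio intervals and steps are illustrative placeholders only.
LEVELS = [
    (16,  (1.0, 1.0),  1.0,  0.5),   # light: window should cover everything
    (32,  (0.7, 0.9),  2.0,  1.0),   # medium
    (64,  (0.4, 0.8), 10.0,  5.0),   # high
    (512, (0.2, 0.6), 20.0, 10.0),   # overload
]

def update_window(window_ms, in_window, outstanding):
    """Return the new window size for the given queue state."""
    if outstanding == 0:
        return window_ms
    if outstanding > 512:              # severe overload: stop adapting
        return window_ms
    for max_queue, (lo, hi), up, down in LEVELS:
        if outstanding <= max_queue:
            ratio = in_window / outstanding
            if ratio < lo:             # window covers too few requests
                return window_ms + up
            if ratio > hi:             # window is unnecessarily large
                return max(window_ms - down, 1.0)
            return window_ms
    return window_ms
```

The per-level steps encode the behavior the text calls for: small changes under light load, large ones under high load and overload, and decreases that are slower than increases to damp oscillations.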
In Figure 5(a), we present the acceptable ratio intervals for all load levels. Note that for light load, the interval reduces to the single value 1, causing Dynamic WB-SPTF to behave equivalently to SPTF. The performance of Dynamic WB-SPTF is not sensitive to the acceptable ratio intervals as long as they follow the trend described previously. In Figure 5(b), we present the amount of time added to or subtracted from the window size every time a change is necessary according to the Dynamic WB-SPTF algorithm. For each load level, we present two bars: the left bar indicates the amount added, while the right bar represents the amount subtracted from the window size. The window size changes are small in the case of light load and large in high load and overload. This allows for faster adaptation to sharp oscillations in the load. We decrease the window size more slowly than we increase it to avoid unnecessary oscillations in the window size. In low and medium load conditions, a single window size change allows individual requests to be included in or excluded from the window. In high load and overload, a single window change allows tens of requests to be included or excluded.

5.1. Dynamic WB-SPTF Performance Results

We use the synthetic trace described in Figure 1 to analyze the performance of Dynamic WB-SPTF. We measure the average response time for each time interval defined in Figure 1 and present only the results for the overloaded intervals B, D, and F, and overall. In addition to the average response time, we focus on the response time standard deviation, since it describes the amount of variability introduced by the scheduling algorithm. We present our findings in Figures 6 and 7. For comparison, in each graph we include the respective results for SPTF (as the baseline), as well as for WB-SPTF(), WB-SPTF(), and WB-SPTF(), the best performing WB-SPTFs for intervals B, D, and F, respectively.
Figure 6. Average response time under the SPTF, WB-SPTF(), WB-SPTF(), WB-SPTF(), and Dynamic WB-SPTF scheduling algorithms: (a) overall and for time intervals (b) B, (c) D, and (d) F.

Figure 7. Standard deviation of response time under the SPTF, WB-SPTF(), WB-SPTF(), WB-SPTF(), and Dynamic WB-SPTF scheduling algorithms: (a) overall and for time intervals (b) B, (c) D, and (d) F.

The results in Figure 6 indicate that Dynamic WB-SPTF manages to adapt its operation to different load conditions, yielding performance similar to that of the best performing scheduling algorithm. Observe that, by focusing on outstanding requests rather than arrival intensities, Dynamic WB-SPTF manages to adapt its operation to different workload conditions as well, adopting a large window for interval B (Figure 6(b)) and smaller windows for intervals D and F with their more sequential workloads (Figures 6(c) and 6(d)). Observe also that, while for non-random workloads Dynamic WB-SPTF performs better than SPTF (interval F), this is not the case for the fully random workload (interval B), where SPTF performs best. Furthermore, the results of Figure 7 show that the variability, measured as the standard deviation of response time, is among the lowest for the Dynamic WB-SPTF scheduling algorithm; it is consistently less than half the standard deviation of SPTF. We conclude that Dynamic WB-SPTF not only adapts its operation to different load conditions to maintain good overall performance, especially in overload periods, but does so while maintaining low variability in the response time.

The experimental results show that Dynamic WB-SPTF performs similarly to the best performing scheduling algorithm for each overload period, but it is not itself the best. This outcome is expected, since we provide only a simple heuristic for finding the optimal window size for WB-SPTF. Identifying additional and better ways of adjusting the window size is subject of future work.

6. Experimental Results with Realistic Workload

Disk drive behavior depends on both hardware and workload characteristics. Hence, in this section we test Dynamic WB-SPTF with another disk, the Seagate Cheetah 10,000 rpm, and a different, realistic trace. The trace, which captures the disk activity, is measured in a real system that runs an on-line bookstore according to the TPC-W specification [7].

6.1. Experimental Set-up

TPC-W [13] specifies how to implement an on-line bookstore. The users of such a system are the emulated browsers (EBs). They browse through the pages of the web site and, finally, might purchase books from the on-line store. All requests generated by the EBs are received by a Web server, which in our implementation is Apache [12]. The Web server forwards the dynamic requests to the application server, which in our implementation is Tomcat 4.0 [12]. According to the TPC-W specification, the dynamic requests are simple queries on the bookstore database, and the application server sends them down to the database server, which in our implementation is MySQL 4.1 [6]. The database stores the entire information of the on-line bookstore and consists of several tables. The most important one is the ITEMS table,

which stores all available books for purchase. All hardware components of our experimental setup are shown in Table 2.

Table 2. Hardware components of the TPC-W-based on-line bookstore implementation

  Emulated Browsers:                Pentium 4 / 2 GHz; 256 MB; Red Hat Linux 9
  Web Server + Application Server:  Pentium III / 1266 MHz; 2 GB; Red Hat Linux 9
  Database Server:                  Intel Xeon / 1.5 GHz; 1 GB / 768 MB; Red Hat Linux 9
  Database:                         2.1 GB in size; 1,000,000 records / 511 MB in ITEMS table
  Disk:                             Seagate ST373453LC; SCSI; 73 GB; 15,000 rpm

In our measurements, we trace the entire I/O activity of the bookstore database. For this, we run the MySQL database server, using VMware [14], in a virtual machine hosted by the Database Server machine of Table 2. The host of the database server has 1 GB of memory, but the virtual machine uses only 768 MB. The physical SCSI disk used by the database server in the virtual machine appears as a process in the host machine. We use the strace Linux utility to trace all I/O activity on that disk. The database used in our experiments is 2.1 GB in size; its ITEMS table has 1,000,000 records and is 511 MB in size. This determines the highly localized access pattern shown in Figure 8.

TPC-W defines three types of traffic: the browsing mix with 95% browsing and 5% ordering, the shopping mix with 80% browsing and 20% ordering, and the ordering mix with 50% browsing and 50% ordering. Browsing the on-line bookstore generates many database queries that read from the database, mainly from the ITEMS table. Ordering from the on-line bookstore generates update queries that deal with individual records in the tables that handle customer data and item availability. Browsing is the most expensive activity, since it generates queries that search large chunks of data, while ordering deals only with single database records.
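Turning raw strace output into an I/O trace amounts to extracting a timestamp, operation, offset, and size from each line. The parser below is a sketch for timestamped `pread64`/`pwrite64` lines (as produced by `strace -tt -e trace=pread64,pwrite64`); the exact output format varies across strace versions, so the regular expression is an assumption, not a general solution.

```python
import re

# Matches lines of the form:
#   HH:MM:SS.micros pread64(fd, "...", count, offset) = ret
LINE = re.compile(
    r"(?P<h>\d+):(?P<m>\d+):(?P<s>[\d.]+)\s+"
    r"(?P<op>pread64|pwrite64)\(\d+,.*?,\s*(?P<count>\d+),\s*(?P<off>\d+)\)"
    r"\s*=\s*\d+")

def parse_io_trace(lines, block_size=512):
    """Turn strace lines into (seconds, op, start_block, blocks) records."""
    records = []
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue                       # skip non-I/O strace lines
        t = int(m["h"]) * 3600 + int(m["m"]) * 60 + float(m["s"])
        records.append((t, m["op"],
                        int(m["off"]) // block_size,
                        int(m["count"]) // block_size))
    return records
```

From such records, plots like Figure 8 follow directly: bucketing the timestamps per second gives the arrival intensity, and the start-block column gives the access pattern.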
We measure the I/O activity while the system is under the browsing traffic mix generated by 8 concurrent EBs; the resulting trace is shown in Figure 8.

6.2. The TPC-W-based Workload

We run our experiment over a period of 20 minutes and collect a trace with 138,497 disk requests. In Figure 8(a), we present the arrival intensity of the trace by plotting the number of requests received every second. Observe that the plot in Figure 8(a) is highly jagged, a characteristic of real traces that is not present in the synthetic trace of Figure 1. We define three time intervals, A, B, and C, with different arrival intensities; intervals B and C are the overloaded ones. The disk access pattern is captured in Figure 8(b). The most notable difference from the semi-synthetic trace is the locality observed here. The disk has 73 GB capacity, while the database uses only 2.1 GB. Note the range of the y axis in Figure 8(b): it covers only the area of the disk that is accessed by the database. The database is stored in files representing each table, and the access pattern follows the individual table accesses. The most frequently accessed table, as mentioned previously, is the ITEMS table, which occupies a contiguous range of blocks. Other trace characteristics are randomness and sequentiality, most notably during interval C. Each sequential stream corresponds to the queries that search for either the Best Sellers or the New Products within a subject category in the ITEMS table. In Table 3, we highlight the workload characteristics of intervals A, B, and C within the measurement period.

  Interval:   A               B                          C
  Load:       Medium          High                       High
  Workload:   Random, Local   Random, Local, Sequential  Random, Local, Sequential

Table 3. Characteristics of the TPC-W trace

6.3. Results with the TPC-W-based Workload

Using the TPC-W-based trace, we run the same set of simulation experiments as in Section 5.
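The interval characterization above can be reproduced mechanically from a trace of (time, block, size) records: per-second binning gives the arrival intensity, and comparing each access to where its predecessor ended separates sequential, local, and random accesses. A sketch, where the "local" distance threshold is our assumption rather than a value from the paper:

```python
from collections import Counter

def arrival_intensity(timestamps_ms, bin_ms=1000):
    """Bin request arrival times (in ms) into per-second counts,
    as plotted for the arrival process."""
    return Counter(int(t // bin_ms) for t in timestamps_ms)

def classify(prev, lba, near=128):
    """Label one access relative to its predecessor: 'sequential' if it
    starts exactly where the last access ended, 'local' if it lands within
    `near` blocks of that point (assumed threshold), else 'random'."""
    if prev is None:
        return "random"
    end = prev[0] + prev[1]  # first block after the previous access
    if lba == end:
        return "sequential"
    if abs(lba - end) <= near:
        return "local"
    return "random"

def interval_profile(trace):
    """trace: list of (time_ms, lba, size). Returns the fraction of
    sequential/local/random accesses, a rough analogue of Table 3."""
    counts = Counter()
    prev = None
    for _, lba, size in trace:
        counts[classify(prev, lba)] += 1
        prev = (lba, size)
    total = sum(counts.values()) or 1
    return {k: v / total for k, v in counts.items()}
```

Running interval_profile separately over the requests of intervals A, B, and C would yield workload rows in the spirit of Table 3.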
Although the disk we use in this set of experiments is the Seagate Cheetah 15,000 rpm with 73 GB capacity, DiskSim provides parameters only for the Seagate Cheetah 10,000 rpm with 9.1 GB capacity. We use the latter in our simulations; this causes no conflicts, since the data set is only 2.1 GB. We compare the performance of Dynamic WB-SPTF with that of SPTF and of WB-SPTF under its best-performing fixed window sizes. We present the average request response time and the standard deviation of request response time for the various time intervals in Figures 9 and 10, respectively. Observe that, for the TPC-W-based trace, WB-SPTF outperforms pure SPTF in all three intervals A, B, and C. This indicates that batching requests before scheduling them using SPTF is much more profitable when either locality or sequentiality, or both, are characteristics of the workload. The performance of Dynamic WB-SPTF is consistently near the best performance among all algorithms tested, independent of the load and workload characteristics at the disk.

Figure 8. Arrival process for the realistic trace used in our analysis: (a) arrival rate (requests per second over time); (b) disk access pattern (blocks accessed over time).

7. Conclusions

We proposed a new scheduling algorithm for disk drives that operate under dynamic load conditions. The algorithm, Dynamic WB-SPTF, adapts its parameters to the load conditions in the system, maintaining good overall and stable performance during transient overloads. The Dynamic WB-SPTF algorithm adapts the value of only one of its parameters, namely the window size. This parameter is updated on every request service completion without additional computational overhead at the disk. The window size determines how many of the outstanding requests are included in the pool of requests upon which SPTF scheduling is applied. Dynamic WB-SPTF increases or decreases its window size by monitoring the ratio of the number of requests within the current window to the total number of outstanding requests. We evaluated Dynamic WB-SPTF performance via trace-driven simulations. We chose traces that are characterized by burstiness in both arrival intensities and workload characteristics, such as locality, sequentiality, and randomness in the disk access pattern. Although it does not directly consider characteristics of the access pattern, Dynamic WB-SPTF adapts its operation to the current workload type and improves overall disk performance. The gains of using Dynamic WB-SPTF are higher when the workload consists of local and sequential accesses.
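The mechanism just described, SPTF restricted to a window of the oldest outstanding requests, with the window steered by the window-to-queue ratio, can be sketched as follows. The positioning-time proxy and the concrete growth/shrink factors are our assumptions for illustration, not the paper's values:

```python
def positioning_time(head, lba):
    """Crude stand-in for seek plus rotational latency; a real scheduler
    would use the drive's positioning model (this proxy is an assumption)."""
    return abs(head - lba)

def pick_next(queue, head, window):
    """WB-SPTF step: apply SPTF only within the `window` oldest
    outstanding requests (queue is kept in arrival order)."""
    batch = queue[:window]
    best = min(batch, key=lambda lba: positioning_time(head, lba))
    queue.remove(best)
    return best

def adapt_window(window, queue_len, grow=1.25, shrink=0.8, target=0.5):
    """Hypothetical update rule, run at each service completion: keep the
    window a roughly fixed fraction of the outstanding queue, in the spirit
    of the ratio monitoring described above."""
    if queue_len == 0:
        return window
    ratio = min(window, queue_len) / queue_len
    if ratio < target:       # queue grew: enlarge the SPTF pool
        window = int(window * grow) + 1
    elif ratio > target:     # queue drained: shrink toward batched FCFS-like behavior
        window = max(1, int(window * shrink))
    return window
```

With a very large window this degenerates to pure SPTF over the whole queue; with window 1 it is FCFS, which is consistent with the observation that under random traffic the adaptive scheme behaves like SPTF.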
Under completely random disk accesses, Dynamic WB-SPTF behaves similarly to traditional SPTF. This is our first attempt to analyze and handle the ever-changing dynamic environment under which computer systems in general, and the storage subsystem in particular, operate. Currently, we are investigating further how to adapt disk operation to the conditions of the entire system. We are identifying additional information that might be available at the disk level (e.g., workload characteristics, individual request response times) or made available by higher layers of the system (e.g., the file system or application level) that can be used for adaptive operation of the storage subsystem (whether single- or multi-disk units) and better overall utilization of system resources.

Acknowledgments

We would like to thank Qi Zhang, who helped us collect the trace described in Section 6.

References

[1] E. Anderson, M. Hobbs, K. Keeton, S. Spence, M. Uysal, and A. Veitch. Hippodrome: Running circles around storage administration. In Proceedings of the First USENIX Conference on File and Storage Technologies (FAST'02), 2002.
[2] M. Andrews, M. A. Bender, and L. Zhang. New algorithms for the disk scheduling problem. Algorithmica, 32(2), 2002.
[3] R. P. Doyle, J. S. Chase, O. M. Asad, W. Jin, and A. M. Vahdat. Model-based resource provisioning in a web service utility. In Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems (USITS'03), Seattle, WA, 2003.
[4] G. R. Ganger, B. L. Worthington, and Y. N. Patt. The DiskSim simulation environment, Version 2.0, Reference manual. Technical report, Electrical and Computer Engineering Department, Carnegie Mellon University.
[5] D. M. Jacobson and J. Wilkes. Disk scheduling algorithms based on rotational position. Technical Report HPL-CSP-91-7rev1, HP Laboratories.
[6] MySQL AB. MySQL.

Figure 9. Average response time under the SPTF, WB-SPTF(1), WB-SPTF(), WB-SPTF(9), and Dynamic WB-SPTF scheduling algorithms: (a) overall, and for time intervals (b) A, (c) B, and (d) C.

Figure 10. Standard deviation of response time under the SPTF, WB-SPTF(1), WB-SPTF(), WB-SPTF(9), and Dynamic WB-SPTF scheduling algorithms: (a) overall, and for time intervals (b) A, (c) B, and (d) C.

[7] PHARM Project. Java TPC-W Implementation Distribution. Department of Electrical and Computer Engineering and Computer Sciences Department, University of Wisconsin-Madison.
[8] A. Riska and E. Riedel. It's not fair - Evaluating efficient disk scheduling. In Proceedings of the Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS'03), Oct. 2003.
[9] C. Ruemmler and J. Wilkes. Unix disk access patterns. In Proceedings of the Winter 1993 USENIX Technical Conference, 1993.
[10] B. Salmon, E. Thereska, C. A. N. Soules, and G. R. Ganger. A two-tiered software architecture for automated tuning of disk layouts. In Proceedings of the 1st Workshop on Algorithms and Architectures for Self-Managing Systems, San Diego, CA, 2003.
[11] M. Seltzer, P. Chen, and J. Osterhout. Disk scheduling revisited. In Proceedings of the Winter 1990 USENIX Technical Conference, Washington, DC, 1990.
[12] The Apache Software Foundation. Apache Web Server.
[13] Transaction Processing Performance Council. TPC-W.
[14] VMWare Inc. VMWare Workstation.
[15] M. Welsh and D. Culler. Adaptive overload control for busy Internet servers. In Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems (USITS'03), Seattle, WA, 2003.
[16] B. L. Worthington, G. R. Ganger, and Y. N. Patt. Scheduling for modern disk drives and non-random workloads. Technical Report CSE-TR, Computer Science and Engineering Division, University of Michigan.


More information

Flash: an efficient and portable web server

Flash: an efficient and portable web server Flash: an efficient and portable web server High Level Ideas Server performance has several dimensions Lots of different choices on how to express and effect concurrency in a program Paper argues that

More information

Active Disks For Large-Scale Data Mining and Multimedia

Active Disks For Large-Scale Data Mining and Multimedia For Large-Scale Data Mining and Multimedia Erik Riedel University www.pdl.cs.cmu.edu/active NSIC/NASD Workshop June 8th, 1998 Outline Opportunity Applications Performance Model Speedups in Prototype Opportunity

More information

WebSphere Application Server Base Performance

WebSphere Application Server Base Performance WebSphere Application Server Base Performance ii WebSphere Application Server Base Performance Contents WebSphere Application Server Base Performance............. 1 Introduction to the WebSphere Application

More information

STORING DATA: DISK AND FILES

STORING DATA: DISK AND FILES STORING DATA: DISK AND FILES CS 564- Spring 2018 ACKs: Dan Suciu, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? How does a DBMS store data? disk, SSD, main memory The Buffer manager controls how

More information

Computer Architecture Lecture 24: Memory Scheduling

Computer Architecture Lecture 24: Memory Scheduling 18-447 Computer Architecture Lecture 24: Memory Scheduling Prof. Onur Mutlu Presented by Justin Meza Carnegie Mellon University Spring 2014, 3/31/2014 Last Two Lectures Main Memory Organization and DRAM

More information

Program Counter Based Pattern Classification in Pattern Based Buffer Caching

Program Counter Based Pattern Classification in Pattern Based Buffer Caching Purdue University Purdue e-pubs ECE Technical Reports Electrical and Computer Engineering 1-12-2004 Program Counter Based Pattern Classification in Pattern Based Buffer Caching Chris Gniady Y. Charlie

More information

Improving Real-Time Performance on Multicore Platforms Using MemGuard

Improving Real-Time Performance on Multicore Platforms Using MemGuard Improving Real-Time Performance on Multicore Platforms Using MemGuard Heechul Yun University of Kansas 2335 Irving hill Rd, Lawrence, KS heechul@ittc.ku.edu Abstract In this paper, we present a case-study

More information

Today: Secondary Storage! Typical Disk Parameters!

Today: Secondary Storage! Typical Disk Parameters! Today: Secondary Storage! To read or write a disk block: Seek: (latency) position head over a track/cylinder. The seek time depends on how fast the hardware moves the arm. Rotational delay: (latency) time

More information

Design and Implementation of A P2P Cooperative Proxy Cache System

Design and Implementation of A P2P Cooperative Proxy Cache System Design and Implementation of A PP Cooperative Proxy Cache System James Z. Wang Vipul Bhulawala Department of Computer Science Clemson University, Box 40974 Clemson, SC 94-0974, USA +1-84--778 {jzwang,

More information

Disks: Structure and Scheduling

Disks: Structure and Scheduling Disks: Structure and Scheduling COMS W4118 References: Opera;ng Systems Concepts (9e), Linux Kernel Development, previous W4118s Copyright no2ce: care has been taken to use only those web images deemed

More information

Enhancements to Linux I/O Scheduling

Enhancements to Linux I/O Scheduling Enhancements to Linux I/O Scheduling Seetharami R. Seelam, UTEP Rodrigo Romero, UTEP Patricia J. Teller, UTEP William Buros, IBM-Austin 21 July 2005 Linux Symposium 2005 1 Introduction Dynamic Adaptability

More information

Performance Characterization of the Dell Flexible Computing On-Demand Desktop Streaming Solution

Performance Characterization of the Dell Flexible Computing On-Demand Desktop Streaming Solution Performance Characterization of the Dell Flexible Computing On-Demand Desktop Streaming Solution Product Group Dell White Paper February 28 Contents Contents Introduction... 3 Solution Components... 4

More information

Managing Performance Variance of Applications Using Storage I/O Control

Managing Performance Variance of Applications Using Storage I/O Control Performance Study Managing Performance Variance of Applications Using Storage I/O Control VMware vsphere 4.1 Application performance can be impacted when servers contend for I/O resources in a shared storage

More information

1 of 8 14/12/2013 11:51 Tuning long-running processes Contents 1. Reduce the database size 2. Balancing the hardware resources 3. Specifying initial DB2 database settings 4. Specifying initial Oracle database

More information

Is Traditional Power Management + Prefetching == DRPM for Server Disks?

Is Traditional Power Management + Prefetching == DRPM for Server Disks? Is Traditional Power Management + Prefetching == DRPM for Server Disks? Vivek Natarajan Sudhanva Gurumurthi Anand Sivasubramaniam Department of Computer Science and Engineering, The Pennsylvania State

More information

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition Chapter 10: Mass-Storage Systems Silberschatz, Galvin and Gagne 2013 Objectives To describe the physical structure of secondary storage devices and its effects on the uses of the devices To explain the

More information

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction Chapter 6 Objectives Chapter 6 Memory Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured.

More information

WHITE PAPER. Optimizing Virtual Platform Disk Performance

WHITE PAPER. Optimizing Virtual Platform Disk Performance WHITE PAPER Optimizing Virtual Platform Disk Performance Optimizing Virtual Platform Disk Performance 1 The intensified demand for IT network efficiency and lower operating costs has been driving the phenomenal

More information

Chapter 12: Mass-Storage

Chapter 12: Mass-Storage Chapter 12: Mass-Storage Systems Chapter 12: Mass-Storage Systems Revised 2010. Tao Yang Overview of Mass Storage Structure Disk Structure Disk Attachment Disk Scheduling Disk Management Swap-Space Management

More information

Chapter-6. SUBJECT:- Operating System TOPICS:- I/O Management. Created by : - Sanjay Patel

Chapter-6. SUBJECT:- Operating System TOPICS:- I/O Management. Created by : - Sanjay Patel Chapter-6 SUBJECT:- Operating System TOPICS:- I/O Management Created by : - Sanjay Patel Disk Scheduling Algorithm 1) First-In-First-Out (FIFO) 2) Shortest Service Time First (SSTF) 3) SCAN 4) Circular-SCAN

More information

Chapter 4. Routers with Tiny Buffers: Experiments. 4.1 Testbed experiments Setup

Chapter 4. Routers with Tiny Buffers: Experiments. 4.1 Testbed experiments Setup Chapter 4 Routers with Tiny Buffers: Experiments This chapter describes two sets of experiments with tiny buffers in networks: one in a testbed and the other in a real network over the Internet2 1 backbone.

More information

Best Practices for SSD Performance Measurement

Best Practices for SSD Performance Measurement Best Practices for SSD Performance Measurement Overview Fast Facts - SSDs require unique performance measurement techniques - SSD performance can change as the drive is written - Accurate, consistent and

More information

Measurement-based Analysis of TCP/IP Processing Requirements

Measurement-based Analysis of TCP/IP Processing Requirements Measurement-based Analysis of TCP/IP Processing Requirements Srihari Makineni Ravi Iyer Communications Technology Lab Intel Corporation {srihari.makineni, ravishankar.iyer}@intel.com Abstract With the

More information

Deconstructing Storage Arrays

Deconstructing Storage Arrays Deconstructing Storage Arrays Timothy E. Denehy, John Bent, Florentina I. opovici, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau Department of Computer Sciences, University of Wisconsin, Madison

More information

Comparing Performance of Solid State Devices and Mechanical Disks

Comparing Performance of Solid State Devices and Mechanical Disks Comparing Performance of Solid State Devices and Mechanical Disks Jiri Simsa Milo Polte, Garth Gibson PARALLEL DATA LABORATORY Carnegie Mellon University Motivation Performance gap [Pugh71] technology

More information

I/O Management and Disk Scheduling. Chapter 11

I/O Management and Disk Scheduling. Chapter 11 I/O Management and Disk Scheduling Chapter 11 Categories of I/O Devices Human readable used to communicate with the user video display terminals keyboard mouse printer Categories of I/O Devices Machine

More information

VERITAS Storage Foundation 4.0 for Oracle

VERITAS Storage Foundation 4.0 for Oracle J U N E 2 0 0 4 VERITAS Storage Foundation 4.0 for Oracle Performance Brief OLTP Solaris Oracle 9iR2 VERITAS Storage Foundation for Oracle Abstract This document details the high performance characteristics

More information

Whitepaper / Benchmark

Whitepaper / Benchmark Whitepaper / Benchmark Web applications on LAMP run up to 8X faster with Dolphin Express DOLPHIN DELIVERS UNPRECEDENTED PERFORMANCE TO THE LAMP-STACK MARKET Marianne Ronström Open Source Consultant iclaustron

More information

CHAPTER 12: MASS-STORAGE SYSTEMS (A) By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

CHAPTER 12: MASS-STORAGE SYSTEMS (A) By I-Chen Lin Textbook: Operating System Concepts 9th Ed. CHAPTER 12: MASS-STORAGE SYSTEMS (A) By I-Chen Lin Textbook: Operating System Concepts 9th Ed. Chapter 12: Mass-Storage Systems Overview of Mass-Storage Structure Disk Structure Disk Attachment Disk Scheduling

More information

Chapter 14 Performance and Processor Design

Chapter 14 Performance and Processor Design Chapter 14 Performance and Processor Design Outline 14.1 Introduction 14.2 Important Trends Affecting Performance Issues 14.3 Why Performance Monitoring and Evaluation are Needed 14.4 Performance Measures

More information