HAT: An Efficient Buffer Management Method for Flash-based Hybrid Storage Systems


Front. Comput. Sci.  RESEARCH ARTICLE

HAT: An Efficient Buffer Management Method for Flash-based Hybrid Storage Systems

Yanfei LV 1,2, Bin CUI 1, Xuexuan CHEN 1, Jing LI 3

1 Department of Computer Science & Key Lab of High Confidence Software Technologies (Ministry of Education), Peking University, Beijing, China
2 National Computer Network Emergency Response Technical Team / Coordination Center of China
3 University of California, San Diego, USA

© Higher Education Press and Springer-Verlag Berlin Heidelberg 2012

Abstract  Flash solid-state drives (SSDs) provide much faster access to data than traditional hard disk drives (HDDs). The current price and performance of SSDs suggest they can be adopted as a data buffer between main memory and HDD, and buffer management policy in such hybrid systems has recently attracted growing interest from the research community. In this paper, we propose a novel approach to manage the buffer in flash-based hybrid storage systems, named Hotness Aware Hit (HAT). HAT exploits a page reference queue to record the access history as well as the status of accessed pages, i.e., hot, warm, and cold. Additionally, the page reference queue is split into hot and warm regions, which generally correspond to the memory and the flash. The HAT approach updates the page status and handles page migration in the memory hierarchy according to the current page status and the hit position in the page reference queue. Compared with existing hybrid storage approaches, the proposed HAT can manage the memory and flash cache layers more effectively. Our empirical evaluation on benchmark traces demonstrates the superiority of the proposed strategy against state-of-the-art competitors.

Keywords  Flash memory, SSD, hybrid storage, buffer management, hotness aware

Received month dd, yyyy; accepted month dd, yyyy

bin.cui@pku.edu.cn

1 Introduction

With the development of flash memory technology, the NAND flash-based solid state drive (SSD) has been widely used as the storage device for various systems, ranging from personal computers to enterprise-scale data centers. Although the SSD shows better read/write performance than the traditional hard disk drive (HDD), its adoption is still limited by its price and capacity. Table 1 compares the price, capacity, and access latency of mainstream commercial SSDs, HDDs, and DRAM-based main memory. It is easy to see that the price per bit of SSD is still much higher than that of HDD, so it may take a long time for the SSD to completely replace the HDD [1]. Therefore, flash-HDD hybrid storage becomes more and more attractive, because it can leverage the advantages of both technologies. Recently, various flash-HDD hybrid storage devices have been presented. Seagate provides a hybrid hard disk with a 4GB flash chip to improve overall performance [2]. The Windows operating system has supported ReadyBoost since Vista to accelerate booting [3]. Moreover, hybrid storage has been deployed in some data centers [4]. As shown in Table 1, the SSD displays I/O performance and price per GB between those of DRAM and hard disk. Consequently, it is natural to adopt flash memory as a level of memory between the HDD and main memory because of its performance advantages [5, 6].

The key problem is how to design an efficient buffer management policy to improve the I/O performance of such a memory hierarchy.

Several existing approaches attempt to better utilize the memory hierarchy in flash-based hybrid storage systems [7-9]. TAC (Temperature-Aware Caching) [7] uses the concept of temperature to perform hotness detection: it divides the pages on disk into regions, maintains the whole history of page accesses, and identifies the hot pages by monitoring the number of accesses to each region. The pages with higher temperature are held in the flash memory. The main drawback of TAC is that it does not adapt to access pattern changes. Although the temperature-based design records the number of page accesses to reflect page hotness, it reacts slowly to pattern evolvement: it needs a rather long time to replace the old pages cached on the flash, e.g., pages that have high temperature but will seldom be used in the future workload. Even though TAC exploits an aging policy to capture access pattern changes, ascertaining a suitable aging parameter is not a trivial task. The paper [8] analyzes the design of hybrid storage systems and presents three alternative designs implemented in SQL Server: CW (clean-write), DW (dual-write), and LC (lazy-cleaning). As illustrated in that paper, LC is the best design, which keeps the randomly accessed pages on the flash. In the LC policy, a dirty page is written to flash first and flushed to disk afterward. The LC method shows better performance than TAC on write-intensive traces. FaCE (Flash as Cache Extension) [9] adopts the SSD in a FIFO manner, so it can exploit the high sequential write performance of the SSD. Furthermore, FaCE proposes GSC (Group Second Chance) to increase the hit ratio on flash memory: GSC gives a page a second chance before eviction if the page is referenced while staying in the flash cache. FaCE also modifies the recovery component to extend the persistence scope to the flash memory in hybrid storage. The main drawback of these designs is that they do not make full use of the storage hierarchy: all the pages replaced out of main memory are kept on the flash, no matter whether they will be reused again. However, a page that is visited only once and never referenced again wastes flash capacity and brings unnecessary writes to the flash.

To overcome the problems of the existing approaches, in this paper we propose a novel strategy named Hotness Aware Hit (HAT) for efficient buffer management in flash-based hybrid storage systems. The pages in HAT are divided into three hotness categories: hot, warm, and cold. In general, the hot, warm, and cold pages are kept in main memory, flash, and hard disk respectively. Furthermore, we construct a page reference queue, an LRU list that records the access history and the status of accessed pages, and the queue itself is split into a hot and a warm region. Based on these data structures, we design a novel lightweight page replacement mechanism for hybrid storage systems. The current status of the accessed page and the hit position in the page reference queue are taken into consideration both for page replacement in the storage hierarchy and for the page status update. We enumerate six types of page access scenarios and design the relevant operations on the HAT structure accordingly.
The page access scenarios include cold page access on hard disk, cold page hit in warm region, cold page hit in hot region, warm page hit in warm region, warm page hit in hot region, and hot page hit in hot region, which cover all the access cases of a data access workload. The details about page status update and buffer replacement in the storage hierarchy will be given in Section 3. Instead of recording the exact access frequency of a page, our proposed HAT mechanism shows that the integration of page status and page hit position is effective and incurs a lower computational cost. Moreover, our approach is more adaptive to access pattern changes and shows considerable improvements on different workloads. Compared with the existing approaches, HAT has the following advantages.

1. Integral buffer management: We utilize a single page reference queue to record the page access history, so the main memory and the flash are managed as a whole buffer in HAT. Hotness detection starts from main memory, and the hot pages are kept in main memory whereas the warm ones generally stay in flash memory. In addition, only the hot pages evicted from main memory are held on the flash; the cold ones are evicted to disk directly. Consequently, HAT can enlarge the effective buffer size and increase buffer efficiency.

2. Workload evolvement adaption: Since HAT takes both the page status and the hit position in the page reference queue into consideration for buffer management, a new page is detected as hot only after it is hit again in memory when the access pattern changes. Therefore HAT can better capture frequently accessed pages and automatically adapt itself to workload evolvement.

3. Low computational cost: The HAT approach utilizes three page statuses and an LRU-based page reference queue for buffer replacement. The time consumption is O(1) on average for each page access operation.

Compared with the existing TAC method, which takes logarithmic time to update the temperature of a page, our approach is more computationally efficient.

Table 1  Comparison of different storage media

Medium       Price($)  Capacity  Price($)/GB  Read(µs)  Write(µs)
DRAM(DDR3)   —         — GB      —            —         —
SSD          —         — GB      —            —         —
Disk(7.2K)   —         — TB      —            —         —

* The price is obtained from

We evaluate the performance of the proposed HAT approach by comparing it with the state-of-the-art buffer strategies for flash-based hybrid systems. The experiments are conducted on both synthetic and real traces from public benchmarks including TPC-B, TATP, TPC-H, and MLK (Make Linux Kernel), and our experimental study shows that the HAT approach is superior to the existing buffer replacement methods.

This paper extends a preliminary work [10] with an in-depth investigation and performance analysis of the proposed HAT mechanism. Specifically, this paper makes the following additional contributions. First, we provide a comprehensive analysis of related work. Second, we present an in-depth discussion of the problems, issues, and solutions on buffer management for flash-based hybrid storage systems, and deliver the detailed HAT algorithms. Third, we redesign the experimental study and conduct more extensive experiments and performance analysis.

The remainder of the paper is organized as follows. Related work is introduced in Section 2. Section 3 describes our framework and detailed algorithms. Experimental results are shown in Section 4, and we conclude in Section 5.

2 Related work

In this section, we briefly review the related work on flash memory and flash-based hybrid system management. Flash-based systems have been a hot topic for several years, and many efforts have been made to design more effective systems on flash. Detailed studies [11, 12] have been conducted to reveal the internal I/O characteristics of flash disks. [13] discusses the design trade-offs of a flash-based system to improve overall performance. FTL designs [14, 15] have also been investigated in recent years. Nowadays, flash-based hybrid storage has gradually been recognized by more and more researchers as an economical way to build a practical system. Some hard disks leverage a small flash memory to improve I/O performance [16]. With the increase in capacity, SSDs are more and more widely deployed in storage systems. Early SSDs were skilled at reading but uncompetitive at writing; thus migration methods [6, 17] were proposed to dynamically transfer read-intensive pages to flash and write-intensive ones to disk. The authors further proposed to exploit concurrency to improve latency and throughput in a hybrid storage system [18]. Recently, SSDs have thoroughly surpassed disks on both read and write speed, and hence the popular method is to adopt flash as a middle-level cache between disk and main memory. Existing works can be separated into two categories, i.e., static deployment and dynamic loading. An object placement method [19] is developed to give a proper deployment for database objects: by comparing object performance on SSD and disk beforehand, the objects with a higher benefit per size are placed on the SSD. Other methods suggest putting a certain part of the system on flash; FlashLogging [20] illustrates that storing the DBMS log on flash can largely improve overall performance.
Debnath et al. [21, 22] proposed FlashStore and SkimpyStash to discuss the proper way to put key-value pairs on SSD. The static methods need to know specific information about the application and cannot self-adapt to various environments. Dynamic page transferring is more attractive compared with static strategies. Ou et al. [5] tested the performance of different hybrid structures, showing that the global structure outperforms the local one when the flash/main-memory ratio is low, and vice versa. TAC (Temperature-Aware Caching) [7] is the dynamic version of the object placement strategy: it assigns temperatures to extents according to the access pattern and I/O cost, and keeps the data with higher temperature at the higher level of the storage structure. To deal with access pattern changes, the authors of TAC proposed to use the aging policy [23], that is, the temperatures of pages are halved periodically to give higher priority to the recently accessed pages.
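To make the aging mechanism concrete, the following sketch (ours, not TAC's code; all names are illustrative) halves a temperature table once every aging_interval accesses, so that recent accesses dominate older history:

```python
def maybe_age(temperatures, accesses_seen, aging_interval):
    # Sketch of the aging policy described above: periodically halve all
    # temperatures. Picking aging_interval well is the hard part the text
    # points out -- too small forgets hot pages, too large never forgets.
    if accesses_seen % aging_interval == 0:
        for key in temperatures:
            temperatures[key] /= 2
```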

The aging policy, however, forgets the access history of all pages, no matter whether a page is still hot or has just been accessed. In addition, the aging frequency is difficult to determine: our testing on a variety of traces shows that the best aging interval ranges from thousands of accesses to the length of the whole trace (which corresponds to the case without aging). Researchers from Microsoft [8] discussed several possible designs for hybrid storage methods. According to their tests, the LC (Lazy-Cleaning) method is the best design; it shows better performance than TAC on write-intensive traces and similar performance on read-intensive ones. FaCE proposes to use the flash in a FIFO manner to improve throughput and provide faster recovery, which yields better performance than LC [9]. hStorage-DB [24] adopts semantic information to exploit the capability of a hybrid storage system, tackling the hybrid storage problem from another angle. On the other hand, hotness-aware buffering has been studied recently [25, 26]. Though these methods also apply a hotness-aware strategy, they target different problems: AD-LRU [25] and CCF-LRU [26] are designed for flash-only storage systems, where the key issue of buffer design is to reduce the number of flash write operations. Our proposed HAT is specifically designed for flash-disk hybrid storage systems, and its main purpose is to reduce the number of disk accesses.

3 The HAT Approach

3.1 Hybrid storage structure

The typical structure of a hybrid storage system is illustrated in Figure 1. Compared with the traditional storage hierarchy, this architecture contains an additional flash-level storage device that accelerates data access by buffering a certain part of the pages on the flash.

Fig. 1  Illustration of the hybrid storage system (main memory, flash, and disk; pages are read from flash or disk and written back to flash or disk)

All the data is stored on the hard disk and organized as data pages. A page needs to be loaded into main memory before being accessed. Since a flash-based device has better I/O performance than a hard disk, it works as the level between main memory and disk: when a page miss happens in main memory, the flash is checked first, and the disk is only accessed when the page is not found in the flash. The big challenge in hybrid storage system design is that we face many choices on data allocation and migration. These choices include: 1) whether a page newly referenced from the disk should replace a page in the main memory, and which page should be replaced; 2) whether a page evicted from the main memory should be written to the flash, which may benefit future references, or just evicted to the hard disk directly; and 3) how to elevate a hot page in the flash and move it to the main memory. Different answers to these questions lead to different management strategies. The optimal decisions, however, depend on the workload pattern as well as the detailed I/O costs, which makes the problem more complex.

3.2 Overview of HAT

In this section, we introduce the basic idea of our approach for effective buffer management in a flash-based hybrid system, named Hotness Aware Hit (HAT).

Fig. 2  Basic structure of HAT

Page reference management
In HAT, we exploit a page reference queue to record the page reference history. Based on the reference information, the pages are marked with different hotness levels, i.e., hot, warm, and cold.
Pages are allocated to different levels in the storage hierarchy according to the page access sequence and page hotness. For ease of presentation, we define some key notions and describe them in detail as follows.

Page reference queue
In order to perform hotness detection, we record a recent part of the page access history with a page reference queue, as illustrated in Figure 2.

Only the IDs of accessed pages are kept in the page reference queue. The length of the page reference queue corresponds to the sizes of memory and flash, as well as the page access pattern, and less recently visited pages are discarded eventually. The page reference queue is organized in an LRU manner, that is, a newly referenced page is added or moved to the MRU end of the queue. In the rest of the paper, we call the MRU end of the queue the head and the LRU end the tail for simplicity.

Hot Region and Warm Region
In such an LRU-based page reference queue, pages near the head of the queue have higher hotness, while pages at the tail are colder. The page reference queue is divided into two regions, named Hot Region and Warm Region respectively. The sizes of the Hot Region and the Warm Region are determined by the reference history and, in general, the buffer sizes of memory and flash, which will be introduced in detail later. In our design, instead of recording the exact access frequency of a page, ascertaining the hotness level is sufficient, as there are two cache levels in the hybrid storage system. Note that the Hot Region and Warm Region are defined on the reference history and do not correspond to the buffer space holding the real data pages; HAT uses these regions to facilitate page hotness detection and thus makes the page replacement computation more efficient.

Hit
We call a page reference a hit on a region in the page reference queue if the page is in that region when it is referenced. For example, in Figure 2, if page 3 is referenced again, we call the reference a hit of page 3 on the hot region, and page 3 turns hot afterward. Similarly, a reference of page 6 hits on the warm region. We use the hit information as the measurement for hotness detection. A hit can reflect the reference status of a page effectively: a page hit happens only when the page ID is already contained in the corresponding region, which encodes the historical access information of this page. Thereby the hit provides an accurate hotness judgement by considering both the historical information and the current status. At the same time, the hit reacts quickly to page hotness changes: if a page turns hot, this change is detected after only one hit in the hot region, and HAT adjusts to it efficiently. Additionally, the only information needed to detect a hit is the last reference of a page, so hit detection can be performed easily and quickly. Hence hit-based page hotness detection is efficient in both space and time.
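To make the queue mechanics concrete, the following minimal sketch (our illustration, not the authors' implementation; all names are ours) models the page reference queue as an LRU list of page IDs with a fixed boundary separating the hot and warm regions, and classifies each reference as a hot-region hit, a warm-region hit, or a miss:

```python
class PageReferenceQueue:
    """Sketch of the page reference queue: an LRU list of page IDs whose
    first hot_len entries form the hot region, the rest the warm region."""

    def __init__(self, hot_len):
        self.queue = []         # index 0 = head (MRU end), last = tail (LRU end)
        self.hot_len = hot_len  # boundary between hot and warm regions

    def reference(self, page_id):
        """Return 'hot' or 'warm' for a region hit, or 'miss' for a cold
        page newly read from disk, then move the page ID to the head."""
        if page_id in self.queue[:self.hot_len]:
            hit = 'hot'
        elif page_id in self.queue[self.hot_len:]:
            hit = 'warm'
        else:
            hit = 'miss'
        if hit != 'miss':
            self.queue.remove(page_id)
        self.queue.insert(0, page_id)
        return hit

q = PageReferenceQueue(hot_len=2)
print([q.reference(p) for p in (3, 6, 3)])  # ['miss', 'miss', 'hot']
```

The linear membership scans here are a simplification: as Section 3.3 describes, the actual implementation uses two linked lists plus a hash table of frames, so each access costs O(1) on average, and the region boundary is adjusted dynamically rather than fixed.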
Page status and deployment
HAT categorizes the pages into three priority levels based on their hotness, namely hot, warm, and cold, which are referred to as page statuses in this paper. The status of a page is determined according to the hit region on the page reference queue: generally, a hit on the hot region marks the corresponding page hot, and a hit on the warm region marks the page warm. Each status implies different behaviors on page accesses and is used to conduct page deployment. We introduce the three types of pages in detail as follows.

Hot page  Hot pages have the highest priority in the HAT approach, and thus all hot pages are kept in main memory and occupy a fixed percentage of the main memory space. A page is marked as hot if a reference on this page hits on the hot region. For example, in Figure 2, a hit on page 1 turns page 1 into a hot page. If the number of hot pages exceeds the threshold, the hot page with the largest recency (i.e., the least recently referenced one) is degraded to a warm page, and the hot region shrinks.

Warm page  A page may turn warm in two ways: 1) a reference hit on the warm region turns a cold page warm; 2) a hot page is degraded to warm and leaves the hot region, as discussed previously. The number of warm pages on the flash is limited by the capacity of the flash device; thus if the flash is full, the warm page with the largest recency is degraded to cold and evicted from the flash to the disk. All the pages on the flash are warm pages. However, some recently accessed warm pages may be buffered in the main memory, although their number is not large; for example, a newly referenced warm page is kept in main memory for a while. A special case is that an in-memory cold or hot page may turn warm, in which case the page becomes an in-memory warm page. After being evicted from the main memory, a warm page is moved to the flash.

Cold page  A newly referenced page from the hard disk is considered cold, though its page ID is queued in the hot region of the page reference queue. A certain percentage of the main memory space is allocated to store the newly referenced cold and warm pages regardless of their status. After being evicted from the main memory, a cold page is flushed back to the disk directly.

Tagging pages with different statuses based on their hotness is one of the key operations in our buffer management strategy. Here we provide details about how to use the tag information to effectively manage the data in the storage hierarchy. The page deployment in the hierarchy roughly corresponds to the status of the page.

We basically try to put the hot pages in the main memory and the warm pages in the flash, and leave the cold ones on the disk. However, some newly referenced cold/warm pages are likely to be referenced again, so HAT allocates a certain percentage of the main memory to buffer these cold/warm pages, puts their IDs in the hot region of the page reference queue, but leaves their status unchanged. These newcomers are temporarily kept in main memory for further hotness examination; if they are accessed again, they can be obtained directly from main memory and upgraded to hot pages. The memory space used to cache the hot pages is named the hot buffer zone, and the memory space used to cache the non-hot pages is named the non-hot buffer zone; their sum is the size of the main memory buffer. Thus, the page deployment of HAT is as follows.

1. Put all the hot pages in the hot buffer zone in the main memory.
2. Allocate the non-hot buffer zone to the newly arrived cold/warm pages.
3. Keep the warm pages at least on the flash memory. Note that some recently visited warm pages may reside in the non-hot buffer zone.
4. Leave all the other cold pages on the disk.

The principle of data placement also determines the data transfer between the different levels of the storage hierarchy, which will be presented in detail in the following section. For example, if a page turns hot, it is moved to the main memory. Another example is that a cold page evicted from the main memory is flushed back to the disk, while a warm one is flushed to the flash, which is a key difference from FaCE [9] and LC [8].

Page replacement strategy
In the previous sections, we have introduced the key notions and the basic features of the HAT structure. In the following, we present the page replacement strategy of HAT in the hybrid storage hierarchy. Since the capacities of main memory and flash are limited, we set quantity constraints on the number of pages with a certain status.

Constraint 1: The number of hot pages should not exceed the hot buffer zone size.
Constraint 2: The number of flash-resident warm pages should not exceed the capacity of the flash memory.
Constraint 3: The number of in-memory cold and warm pages should not exceed the non-hot buffer zone size.

When a constraint is violated, data page replacement in the storage hierarchy must be performed. The key innovation of HAT is how to conduct the placement of the relevant pages according to the status and hit position of the accessed page, and how to modify the statuses of the relevant pages accordingly. In our approach, we can enumerate six data page access scenarios for buffer replacement in the hybrid storage hierarchy according to the current page status and its access history. These scenarios cover all the cases of a data access workload. We present the operations and structure updates for each case as follows (a code sketch of this dispatch follows the worked example below).

Cold page access on hard disk: The page resides on the hard disk, and no historical access information is maintained in the page reference queue. In this case, we put the page ID at the head of the hot region of the page reference queue, buffer the data page in the non-hot buffer zone, and check Constraint 3. If Constraint 3 is violated, we flush the most out-of-date data page in the non-hot buffer zone.

Cold page hit in hot region: This indicates the accessed cold page is in the hot region and is referenced again.
In this case, we upgrade the page status to hot, move the page ID to the head of the hot region, move the data page to the hot buffer zone, and check Constraint 1. If Constraint 1 is not satisfied, we move the tail of the hot region to the warm region and downgrade the status of the tail page to warm. After that, we check Constraint 3 and replace the most out-of-date data page in the non-hot buffer zone if needed. Then we check Constraint 2 and move the tail out of the warm region if Constraint 2 is violated.

Cold page hit in warm region: We change the page status to warm and move the page ID to the hot region of the page reference queue. If the data page is in the non-hot zone of memory, no further actions are needed; otherwise, we load the page into the non-hot buffer zone and check Constraint 3. If Constraint 3 is violated, we replace the most out-of-date data page.

Warm page hit in warm region: We move the page ID to the hot region of the page reference queue. If the data page is in the non-hot buffer zone, no further actions are needed; otherwise, we load the page into the non-hot buffer zone and check Constraint 3. If Constraint 3 is violated, we replace the most out-of-date data page.

Warm page hit in hot region: We update the page status to hot, move the page ID to the head of the hot region, move the data page to the hot buffer zone, and check Constraint 1. If Constraint 1 is not satisfied, we move the tail of the hot region to the warm region and downgrade the status of the tail page to warm. After that, we check Constraint 2 and update the warm region accordingly.

Hot page hit in hot region: We simply move the page ID to the head of the queue without any other adjustment.

We add several notes on the replacement process. First, in some cases above the constraint violations may cascade; HAT then conducts a sequence of adjustments to ensure the system satisfies all the constraints. Second, we call a page whose status is to be downgraded the victim of the status downgrade. The victim hot and warm pages are determined in LRU manner, i.e., the hot/warm page nearest to the tail of the hot/warm region is selected; we can find the victim by scanning from the tail of the corresponding region. In this victim selection process, the downgraded pages are the most out-of-date page references of the corresponding region and are removed from the region accordingly. The references removed from the warm region are removed from the page reference queue entirely, and thus HAT forgets historical access information naturally. Third, the non-hot buffer zone in main memory is also managed in LRU manner: the cold/warm page with the largest recency in the reference queue is evicted, and the evicted warm or cold page is flushed to the flash or the disk respectively.

Example
For a better understanding of our buffer replacement mechanism, we proceed to give a detailed example showing how the HAT approach works on an access sequence. The first subfigure in Figure 3 shows the initial state of HAT, and the following access sequence is on pages 5, 14, 14, 12, 10. The size of the main memory is 4 pages, the flash size is 5 pages, and the maximum number of hot pages, i.e., the hot buffer zone size, is set to 2 in this example. The first access is on page 5, which is new to the buffer manager and corresponds to the aforementioned "cold page access on hard disk" case. We read the page from the disk and put ID 5 at the head of the hot region. After checking Constraint 3, we find that the main memory contains too many pages, so we flush out page 12, the most out-of-date page in the non-hot buffer zone; this page is cold, so we flush it to the disk. Since Constraint 2 is currently satisfied, no additional actions are taken. The next access is on page 14, which is a warm page hit in the warm region. This page has its data on the flash memory, so we first load it into main memory and move its page ID to the head of the hot region. Constraint 3 is violated again, so we flush the most out-of-date page, which is now page 4, to the flash. Constraint 2 remains satisfied, so we are done. The system continues to access page 14, and this is a warm page hit in the hot region. Since its page ID is already at the head of the hot region, we only change its page status to hot. By checking Constraint 1, we find that we have too many hot pages, and hence we move the tail of the hot region, i.e., page 8, to the warm region and change its status from hot to warm. These actions do not violate Constraint 2 either. The fourth access is on page 12, which is a cold page hit in the hot region. This is the most complicated case in this example.
The page ID is moved to the head of the page reference queue and its status is upgraded to hot. Because the data of this page does not reside in the main memory, we need to load it from the disk. We proceed to check the constraints. Constraint 1 is violated again, so page 9, the tail of the hot region, is moved to the warm region and its status is changed to warm. So are pages 4 and 5, which are not hot, because the hot region should always end with a hot page, which is page 14 in this case. Constraint 3 is also violated because we have too many pages in the non-hot buffer zone, so we flush page 8 to the flash. We then find that Constraint 2 is also violated, and we evict page 6, which is at the very tail of the warm region, from the flash memory. The warm region has to be shrunk until it finds page 1 as its new tail, which must be a warm page in flash. Note that we guarantee that the tail of the hot region is a hot page and the tail of the warm region is a flash-resident warm page; this is an implementation issue that facilitates the victim search process, and more details can be found in Section 3.3. The last access is on page 10, which corresponds to the cold page hit in warm region case. We move the ID to the head of the hot region, upgrade the status to warm, and load its data from disk, just as described above. The main memory usage exceeds Constraint 3, so we flush page 9 to the flash. Afterward, the flash has too many pages by Constraint 2, and we drop page 1 from it; the warm region is shrunk as well. To better illustrate the process, we further present the page status changes of this example in Table 2, which includes the page access sequence and the changes of each region, as well as the hotness of each page.
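Seen end to end, the six scenarios reduce to a small dispatch on the hit region and the page's current status. The sketch below is our illustration under the structures introduced so far; `queue` is a page-reference-queue object like the earlier sketch, and the `zones` helper with the named methods is hypothetical:

```python
def on_access(page, queue, zones):
    # 'hot'/'warm' = hit region in the page reference queue;
    # 'miss' = cold page access on hard disk (no history kept).
    region = queue.reference(page.id)
    if region == 'hot':
        if page.status != 'hot':            # cold/warm page hit in hot region
            page.status = 'hot'
            zones.move_to_hot_buffer(page)  # may violate Constraint 1
        # hot page hit in hot region: only the queue order changes
    elif region == 'warm':                  # cold/warm page hit in warm region
        page.status = 'warm'
        zones.ensure_in_nonhot_buffer(page) # load only if not already buffered
    else:                                   # cold page access on hard disk
        page.status = 'cold'
        zones.ensure_in_nonhot_buffer(page)
    zones.check_constraints()               # Constraints 1-3; may cascade
```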

Fig. 3  An example of an access sequence processed by HAT: (a) the initial state of HAT; (b) after page 5 is accessed; (c) after page 14 is accessed; (d) after page 14 is accessed again; (e) after page 12 is accessed; (f) after page 10 is accessed

Table 2  Page hotness changes in the example (H: hot, W: warm, C: cold)

Page      Initial   Page 5    Page 14   Page 14   Page 12   Page 10
          state     accessed  accessed  accessed  accessed  accessed
Page 1    W         W         W         W         W         C
Page 2    W         W         W         W         W         W
Page 3    W         W         W         W         W         W
Page 4    W         W         W         W         W         W
Page 5    C         C         C         C         C         C
Page 6    W         W         W         W         C         C
Page 7    C         C         C         C         C         C
Page 8    H         H         H         W         W         W
Page 9    H         H         H         H         W         W
Page 10   C         C         C         C         C         W
Page 11   C         C         C         C         C         C
Page 12   C         C         C         C         H         H
Page 13   C         C         C         C         C         C
Page 14   W         W         W         H         H         H

3.3 Implementation of HAT

In this section, we present the implementation details of the HAT approach, including the data structures and algorithms. Figure 4 shows the structure of HAT in our implementation. We utilize two linked LRU lists, i.e., a hot list and a warm list, to represent the hot region and the warm region and record the access history. The two lists work together as one LRU list representing the whole page reference queue: no matter whether a page hits in the hot region or the warm region, it is moved to the head of the hot region, and the page evicted from the tail of the hot list is moved to the head of the warm list.
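A minimal sketch of this two-list movement follows (ours, not the authors' C# code; the fixed hot_bound stands in for the dynamically maintained hot-region boundary):

```python
from collections import deque

def touch(page_id, hot_list, warm_list, hot_bound):
    # On any reference, the page ID moves to the head of the hot list; the
    # entry squeezed out of the hot list's tail moves to the head of the
    # warm list, so the two deques act as one LRU queue.
    if page_id in hot_list:
        hot_list.remove(page_id)
    elif page_id in warm_list:
        warm_list.remove(page_id)
    hot_list.appendleft(page_id)
    if len(hot_list) > hot_bound:
        warm_list.appendleft(hot_list.pop())

hot_list, warm_list = deque([14, 9]), deque([1, 2, 3, 4, 6])  # arbitrary state
touch(6, hot_list, warm_list, hot_bound=2)  # a hit in the warm region
```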

Note that only the page IDs are recorded in the nodes of the hot and warm lists to track the access history, rather than the real data pages; consequently, some pages whose IDs are recorded in the lists may reside in neither main memory nor flash memory. Additionally, we design another auxiliary LRU list to maintain the pages in the non-hot buffer zone, which facilitates fast access to the pages in this area. Three flags are used to identify the page status, named the hot, warm, and cold flags respectively.

Fig. 4  Illustration of HAT

Brush of region tails
The size of the hot buffer zone is set to a fixed percentage of the main memory buffer capacity, and if the number of hot pages exceeds this threshold, a hot page is downgraded. Our implementation of HAT ensures that the tail of the hot region is a hot page and the tail of the warm region is a flash-resident warm page. This design facilitates victim searching and simplifies the implementation: when a hot victim is needed, the tail page of the hot list is selected directly, and the same holds for the warm victim. There are two cases that may leave the tail of the hot region not ending with a hot page. First, when the number of hot pages exceeds the threshold, the tail of the hot region is degraded to warm and moved to the head of the warm region, so the tail of the hot region may no longer be a hot page. Second, this may also happen when the tail page of the hot list is referenced and moved to the head. In these cases, we move pages from the tail of the hot region to the head of the warm region until encountering a hot one; we name this process Brush. The same process can take place for the warm region, when 1) the number of on-flash warm pages exceeds the capacity of the flash memory and the tail of the warm region is evicted, or 2) the tail of the warm list is referenced and moved to the head of the hot region. If either happens, a warm brush is conducted to remove pages from the tail until the tail of the warm region is again a flash-resident warm page. In the process of the warm brush, once a memory-resident warm page is encountered, it is degraded to cold, as it must be less recently visited than the other warm pages. Through brushing, HAT forgets some out-of-date page references and adjusts the sizes of the Hot Region and the Warm Region automatically.
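The two brush operations can be sketched as follows (our illustration; `status` maps page IDs to their hotness flags and `on_flash` is the set of flash-resident page IDs, both hypothetical names):

```python
def hot_brush(hot_list, warm_list, status):
    # Hot Region brush: move entries from the hot region's tail to the warm
    # region's head until the hot region ends with a hot page again.
    while hot_list and status[hot_list[-1]] != 'hot':
        warm_list.insert(0, hot_list.pop())

def warm_brush(warm_list, status, on_flash):
    # Warm Region brush: drop tail entries (forgetting their history) until
    # the tail is a flash-resident warm page; memory-resident warm pages
    # encountered on the way are degraded to cold.
    while warm_list and not (status[warm_list[-1]] == 'warm'
                             and warm_list[-1] in on_flash):
        pid = warm_list.pop()
        if status[pid] == 'warm':
            status[pid] = 'cold'
```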

Auxiliary list for non-hot buffer zone
When the main memory buffer is full and an empty slot is required for a new page access, an in-memory page has to be evicted out of the main memory. In this case, we evict the non-hot page with the largest recency, that is, the in-memory non-hot page nearest to the tail of the page reference queue. This page could be obtained by scanning backward from the tail of the reference queue, but that process could be time consuming, as the queue maintains at least all the pages residing in the memory and the flash. We therefore introduce an auxiliary LRU list to accelerate this search. The pages in the auxiliary list are all non-hot pages buffered in the main memory; as illustrated in Figure 4, the auxiliary list maintains two pages. The auxiliary list may also hold pages that have been evicted from the page reference queue but are still in main memory, which may be caused by the warm region brush operation. The pages in the auxiliary list are organized in LRU manner, like the page reference queue; thus if a page in the auxiliary list is referenced, we update its order in the auxiliary list besides performing the operations on the page reference queue. When a page needs to be replaced out of the main memory, the tail of the auxiliary list (page 8 in the figure) is simply selected.

As discussed above, the ratio of the non-hot buffer zone (non-hot ratio) is a parameter of our approach. A smaller hot buffer zone leaves more main memory space for new non-hot pages; this gives the newly arrived pages a longer hotness examination time before they are identified as hot, as they can stay in memory longer. On the contrary, if we give more memory space to the hot buffer zone, a new page is replaced from the main memory more quickly. We will evaluate the effect of this parameter in the experiments.

We adopt a structure named frame to store all the information of a page, including the hotness flag and the page location in the storage hierarchy. The frames are organized in a hash table to facilitate fast searching.
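As a concrete picture of the frame bookkeeping described above, here is a minimal sketch; the field names are ours, since the paper only states that a frame records the hotness flag and the page's location and that frames are hashed by page ID:

```python
from dataclasses import dataclass

@dataclass
class Frame:
    page_id: int
    status: str = 'cold'     # 'hot' | 'warm' | 'cold' flag
    in_memory: bool = False  # buffered in main memory?
    in_flash: bool = False   # resident on flash?
    dirty: bool = False      # needs write-back on eviction

frames = {}                  # hash table: page_id -> Frame, O(1) lookup

def frame_of(page_id):
    return frames.setdefault(page_id, Frame(page_id))
```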
The algorithms of HAT
The detailed algorithm of HAT is listed as Algorithm 1. The process can be divided into four steps, namely page load (lines 1-7), flag update (lines 8-17), list maintenance (lines 18-22), and constraint check (line 23). To begin with, the data page is loaded from the flash or the hard disk, and then the flag of the page is updated. If the page hit is in the Hot Region, the page is marked as hot and added to the hot buffer zone (lines 8-10). If the page hit is in the Warm Region, the page is marked as warm (lines 11-13). Otherwise the page is a cold page (lines 15-16). Either a cold or a warm page is held in the non-hot buffer zone. In the list maintenance step, we first record the page reference at the head of the Hot Region (line 18), and the old reference to this page is removed from the page reference queue. After that, the Auxiliary List is updated if the accessed page is in the non-hot buffer zone (lines 19-22). At last, the constraints are checked, which is detailed separately in Algorithm 2. First, the number of hot pages is checked (lines 1-6): once Constraint 1 is violated, the tail of the Hot Region is downgraded to warm and inserted into the Auxiliary List, and then a Hot Region brush is conducted to ensure the Hot Region ends with a hot page. Second, the non-hot buffer constraint is checked (lines 7-14): if it is violated, i.e., the non-hot buffer zone is full, the tail of the Auxiliary List is evicted. At last, the flash memory constraint is checked (lines 15-22): if the flash is full, the tail of the Warm Region is degraded to cold and evicted (lines 16-17), and the Warm Region brush is performed (lines 18-22). Note that the order of the constraint checks is important, because the processing of an earlier constraint may cause a violation of a later one.

Algorithm 1: The HAT algorithm
Input: an access on page P
 1  if P is not in main memory then
 2      if P is in flash memory then
 3          load page P from flash memory;
 4      else
 5          load page P from disk;
 6      end
 7  end
    // update the flags
 8  if P hit in Hot Region then
 9      mark P as hot;
10      hold P in the hot buffer zone in main memory;
11  else if P hit in Warm Region then
12      mark P as warm;
13      hold P in the non-hot buffer zone;
14  else
15      mark P as cold;
16      hold P in the non-hot buffer zone;
17  end
    // maintain the lists
18  move the node of P to the head of Hot Region;
19  remove P from the Auxiliary List if present;
20  if P is warm or cold and in main memory then
21      insert P into the Auxiliary List;
22  end
23  invoke CheckConstraints();

Algorithm 2: Subroutine CheckConstraints()
    // check Constraint 1
 1  if hot page count > hot buffer zone size then
 2      find the tail of Hot Region;
 3      mark it as warm;
 4      insert it into the Auxiliary List;
        // Hot Region brush
 5      shrink the hot bound to the next hot page;
 6  end
    // check Constraint 3
 7  if the non-hot buffer zone is full then
 8      remove the tail page P_ta of the Auxiliary List;
 9      if P_ta is warm then
10          flush it to flash;
11      else
12          write it back to disk if P_ta is dirty;
13      end
14  end
    // check Constraint 2
15  if the flash is full then
16      mark the tail page of Warm Region as cold;
17      remove the tail of Warm Region and flush it to disk;
        // Warm Region brush
18      get the tail page P_tw;
19      while not (P_tw is warm and in flash) do
20          change P_tw to cold;
21          remove P_tw from Warm Region;
22      end
23  end

4 Performance evaluation

In this section, a trace-driven simulation is conducted to evaluate the effectiveness of our HAT approach, and the experimental results are compared with state-of-the-art flash-based hybrid buffer replacement algorithms, namely FaCE [9] and TAC [7]. We implement the FaCE approach with GSC (Group Second Chance), since the experiments indicate that FaCE+GSC performs the best among the FaCE variants. The aging frequency of TAC is an important parameter: we tested TAC with aging intervals ranging from 0.1M to 10M accesses and chose the parameter with the best performance for the comparison. The simulation is developed in Visual Studio 2010 using C#.

All experiments are run on a Windows 2008 server with two 2.4 GHz Intel E5530 CPUs and 32 GB of physical memory, equipped with a Samsung SSD (64GB, 470 series) and a Seagate disk (ST380011A).

4.1 Experimental setup

We use both real and synthetic traces for performance evaluation. We exploit four real traces, TPC-B, TPC-H, TATP, and making the Linux kernel (MLK for short), to evaluate the performance on various workloads. The three benchmarks are run on PostgreSQL with default settings, e.g., the page size is 8KB. The MLK trace is a record of the page accesses of making the Linux kernel. For the synthetic traces, we make a series of traces varying from a very stable access pattern to an unstable one: the stable trace is generated conforming to the 80/20 distribution [27], whereas the unstable traces are produced by combining multiple stable traces. We utilize a tool named strace to monitor these processes and obtain the disk access history. The specification of these traces is shown in Table 3.

Table 3  Specification of the traces

Trace      Number of Pages (10^3)  Number of References (10^6)  Write Ratio
TPC-B      —                       —                            — %
TATP       —                       —                            — %
TPC-H      —                       —                            — %
MLK        —                       —                            — %
Synthetic  —                       —                            — %

The total I/O time, including both flash and disk accesses, is used as the primary metric to evaluate performance, while we also report the buffer hit ratio and the number of accesses in our experiments. The parameters used in our experiments are listed in Table 4. The first parameter S_M is the memory buffer size; we also consider various memory and flash sizes to test the performance under different environments, where the parameter Ratio_F/M represents the ratio between the flash and the memory. The costs of flash I/O and disk I/O are obtained by testing the Samsung SSD (64GB, 470 series) and the Seagate disk (ST380011A), where C_Fr, C_Fw, C_Dr, and C_Dw represent the read and write costs of the flash and the disk respectively. We conducted the experiments on different SSDs, including Samsung and Intel; our approach yields similar performance on the different devices, and hence we only present the results on the Samsung SSD due to space constraints. R_nonhot represents the percentage of the non-hot buffer zone out of the overall memory buffer space. Unless stated explicitly, the default parameter values, given in bold, are used.

Table 4  Experimental parameters

Parameter         Value
S_M (10^3 pages)  0.2 (for TPC-B), 2 (for TATP), 1 (for MLK), 2 (for Synthetic), 50, 100, 200, 400 (for TPC-H)
Ratio_F/M         1:1, 2:1, ..., 5:1, ..., 20:1
C_Fr, C_Fw (ms)   0.245, —
C_Dr, C_Dw (ms)   12.7, 13.7
R_nonhot          0.02, 0.04, ..., 0.08, 0.1, 0.2, ...
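For reference, the primary metric can be recomputed from the four operation counts and the latencies of Table 4. The sketch below reflects our reading of the setup (the authors' simulator is written in C#); note that the flash write latency C_Fw did not survive in Table 4, so the caller must supply it:

```python
def total_io_time_ms(n_fr, n_fw, n_dr, n_dw,
                     c_fr=0.245, c_fw=None, c_dr=12.7, c_dw=13.7):
    # Total I/O time = sum over operation types of count * per-op latency.
    # c_fw has no default: Table 4's flash write latency is unknown here,
    # so any value used is the caller's assumption.
    if c_fw is None:
        raise ValueError("supply the flash write latency C_Fw")
    return n_fr * c_fr + n_fw * c_fw + n_dr * c_dr + n_dw * c_dw
```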
4.2 Parameter tuning

The ratio of the non-hot buffer zone (R_nonhot) in the main memory is the only parameter of HAT. As discussed in the previous section, a larger non-hot buffer provides more space for newly arrived pages and thus adapts better to workload changes. On the contrary, a smaller non-hot buffer zone means the system can allocate more space to hot pages, and hence improves the buffer efficiency on frequently accessed pages.

Therefore, the optimal performance is a tradeoff between pattern-change adaptation and hot page buffering. In this experiment, we show how the parameter R_nonhot affects the performance of HAT on both the synthetic trace and the benchmark traces; the results are illustrated in Figure 5. The real benchmark traces show similar results, and thus only the result on TPC-H is given as a representative.

Fig. 5  Parameter tuning

In Figure 5, the performance of our approach on the synthetic workload is rather stable when R_nonhot is low. This is because the access pattern of the synthetic trace conforms to a fixed distribution: although with a small R_nonhot HAT may evict a new page quickly, it can still recognize a hot page after the page is accessed again. The total I/O time slightly increases when the non-hot pages occupy over 70% of the main memory, in which case the main memory is polluted with cold pages. The access pattern of TPC-H is not stable, thus the total I/O time first decreases and then increases, with the optimal performance appearing when the parameter is around 0.1. The experimental results suggest that we should reserve the majority of the memory to buffer the frequently used pages, and leave a small percentage of the buffer space to hold new pages for further hotness examination. Only if a page is really hot will it be moved to the hot zone, and thus the utility of the hot buffer zone is improved. Hence our design not only better utilizes the overall buffer space, but also adapts to access pattern changes. We use 0.1 as the default value of R_nonhot in the following experiments.

4.3 Comparison with other techniques

We proceed to compare HAT with the existing approaches, i.e., TAC and FaCE. We fix the main memory size and vary the flash memory size to evaluate the total I/O time on each trace. The results are illustrated in Figure 6, where the horizontal axis stands for the ratio between the flash and the memory. With the increase of this ratio, the total buffer size of the system also increases, as we consider the flash a second-level buffer; consequently, the total I/O time decreases for all the approaches. However, our approach is better than the other approaches in most of the cases and achieves up to 50% speedup against the competitors. On the TPC-B trace, when the flash/memory ratio is low, TAC yields better performance than the other approaches. TPC-B is a trace with a stable pattern; TAC can precisely detect the hottest pages and store them in the flash on this kind of workload, which shows its superiority especially when the buffer is small. However, the performance of TAC degrades very quickly when the size of the flash increases. The reason for this result is that TPC-B is a write-intensive trace, with a write ratio of around 20% as shown in Table 3; TAC adopts the flash as a write-through cache, and when the flash size is large, the drawback of this policy becomes more obvious, which also verifies the results in [8]. FaCE and HAT outperform TAC as the flash/memory ratio increases, and our HAT strategy steadily outperforms FaCE, as HAT has a better hot page detection mechanism. The results on TATP in Figure 6 (b) show a similar trend to TPC-B. As an OLAP workload, TPC-H is read-intensive and includes 22 complex queries. Figure 6 (c) illustrates the performance comparison on the TPC-H trace.
As the access pattern varies among these 22 queries, the lines in this figure are not as smooth as on the TPC-B trace for all the approaches. Our approach HAT steadily outperforms TAC and FaCE, achieving up to 30% performance improvement. The temperature-based statistics become invalid when the access pattern changes; FaCE has performance similar to TAC, as the change of access pattern also affects the page management in FaCE. The workload size of TPC-H is larger than those of TPC-B and TATP, which validates the efficiency of our approach on various data sizes. The making-Linux-kernel process compiles the source code to object code, which is then linked into executable files; thus this process contains a large number of read operations and few write operations. As shown in Table 3, the write ratio is very low in the MLK trace, and most of the write operations are sequential. As the access pattern of the MLK trace is also not stable, HAT is always better than TAC. Although FaCE uses the second chance mechanism, its hot page detection is still inefficient and cannot learn frequently accessed data well.

Fig. 6  I/O performance comparison on traces with various flash/main memory ratios: (a) TPC-B; (b) TATP; (c) TPC-H; (d) MLK

Hence, a large number of cold pages are flushed from main memory to flash, which wastes the flash capacity. Our approach has an effective hot page detection mechanism, so HAT avoids this drawback and shows better performance.

The total hit ratio of memory and flash is illustrated in Figure 7. Our approach HAT shows a comparable or better buffer hit ratio than the others in all the cases. The locality of TPC-B is very strong, as the buffer hit ratio is larger than 90% for all the approaches even though the total buffer size is very small. When the buffer size is low on this high-locality trace, TAC shows the highest hit ratio, since TAC records the temperatures of pages and leverages the long history to detect the exact hotness of a page, which better manages the precious buffer resource. HAT records less historical access information than TAC and thus loses some accuracy in detecting the very hot pages; but when the buffer size is high, HAT shows its superiority. FaCE is not skilled at hotness detection and thus performs the worst. Compared with the results in Figure 6 (a), TAC has a higher hit ratio than FaCE but also a higher I/O cost; this is caused by the write-through cache design of TAC, as the write ratio of TPC-B is high. The buffer hit ratios on the other workloads are consistent with the total I/O cost results shown in Figure 6 and can be explained accordingly.

To further investigate the behavior of these approaches, we list the numbers of read and write operations on TPC-H in detail in Table 5, enlarging the sizes of memory and flash. Our algorithm yields better performance than the others in almost all the cases, which is mainly owing to the smallest number of disk reads in our approach. When the total buffer size is small, HAT has more flash reads but fewer disk operations, which means our algorithm manages the flash more efficiently than the others. TAC has the fewest flash write operations: TAC adopts temperature to determine the page placement, and the temperature of a page is very stable, hence the page replacement on flash is low. This low replacement on flash makes TAC react slowly to workload changes. FaCE keeps all the pages replaced out of main memory on the flash, and hence has the most flash write operations, especially when the main memory size is small and more memory cache misses happen. When the buffer size is very large, i.e., 3.2G memory and 16G flash in the last case, all the competitors yield similar performance, as almost all the visited pages can be buffered. Note that the overall data size of TPC-H is around 17.5G, as shown in Table 3.

Fig. 7  Total buffer hit ratio on traces with various total buffer sizes: (a) TPC-B; (b) TATP; (c) TPC-H; (d) MLK

We also replay the traces on real SSD and hard disk devices with the different hybrid storage management approaches, as illustrated in Figures 8 and 9. As tests on real devices fluctuate, we conduct each evaluation three times and present the average time. HAT is better than the competitors in all the tests. Although FaCE incurs more I/O operations on TPC-H according to Table 5, it performs better on real devices; this is because FaCE turns the flash I/O accesses into sequential accesses, which better fit the flash device. The performance difference between TAC and HAT is similar to the simulation results, and HAT outperforms TAC in all the cases.

Fig. 8  Performance on SSD devices

We further vary the flash/memory ratio and examine the total run time on the hybrid storage system in Figure 9. HAT has the lowest run time. The run time of TAC does not show an obvious decrease with the enlargement of the buffer size, which is due to its heavy writes to the disk. FaCE performs worse than our HAT approach although it is enhanced with sequential I/O accesses; the reason is that the hot detection capability of FaCE is weak, so it needs more buffer space to hold the hot pages.

We finally evaluate the computational cost of the different approaches; the results are illustrated in Figure 10. Note that the average cost for each access of the workload is proportional to the total computational cost. In this experiment, we use the same parameter setting as in Figure 8. In general, the computational cost grows in proportion to the number of page references given in Table 3.

Table 5  Read and write counts on TPC-H for different algorithms

Main/Flash Size  Algorithm  Flash Read  Flash Write  Disk Read  Disk Write  Total I/O Cost
400 MB/2GB       FaCE       7,562,176   17,533,638   10,372,…   …           …,190,…
                 TAC        1,346,622   1,787,085    9,510,…    …           …,660,434
                 HAT        1,720,603   2,256,064    9,189,…    …           …,152,433
800 MB/4GB       FaCE       5,174,275   11,580,971   8,175,…    …           …,921,…
                 TAC        3,652,386   1,846,571    7,005,…    …,251       98,281,977
                 HAT        4,622,445   2,311,450    6,037,…    …,380       86,735,198
1.6 GB/8GB       FaCE       6,359,417   5,902,331    4,853,…    …,887       74,452,…
                 TAC        5,452,735   2,830,796    5,062,…    …,450       74,574,629
                 HAT        5,650,503   3,559,431    4,421,…    …,533       67,209,372
3.2 GB/16GB      FaCE       8,688,859   3,302,628    1,704,…    …,887       32,949,…
                 TAC        8,146,849   1,791,877    1,791,…    …,158       32,713,115
                 HAT        6,961,194   3,547,915    1,704,…    …,887       32,723,556

Fig. 9  Performance on TPC-B traces

Fig. 10  Computational time

TAC takes the most computational cost on the TPC-B, TATP, and MLK benchmarks; the reason is that TAC has to maintain the temperatures of pages, and this maintenance is expensive. However, HAT performs worst on the TPC-H benchmark: since TPC-H has the largest number of pages, the page reference queue in HAT becomes larger and leads to more maintenance cost. Compared with Figure 8, the computational time is one to two orders of magnitude less than the I/O time. Thus the I/O time is the key cost in hybrid storage management, and the computational time difference has little effect on the overall performance.

5 Conclusion

In this paper, we have presented a novel buffer management strategy, HAT, for flash-based hybrid storage systems. HAT utilizes a page reference queue to maintain the historical access information, and the queue itself is divided into a hot region and a warm region. Furthermore, we proposed to categorize the status of accessed pages into three levels, i.e., hot, warm, and cold. We exploited the "hotness aware hit" to process the buffer replacement, which migrates the relevant pages in the memory hierarchy according to the current page status and the hit position in the page reference queue. Compared with the existing methods, HAT can effectively buffer frequently accessed pages with a low computational cost and better adapt to workload changes. Experiments on different traces show that HAT achieves better performance than existing approaches.

Acknowledgements  This research was supported by NSFC under Grant No. and MIIT grant 2010ZX.

References

1. EE Times. SSDs: still not a solid state business.
2. Seagate. Momentus XT solid state hybrid drives.
3. Microsoft. Windows ReadyBoost.

Yanfei Lv is a staff member of the National Computer Network Emergency Response Technical Team/Coordination Center of China. He obtained his B.Sc. from Northeastern University in 2006 and his Ph.D. from Peking University in 2013. His research interests include flash-based databases, Hadoop and big data.

Dr. Bin Cui is a professor in the School of EECS, Peking University. His research interests include database performance issues, query and index techniques, Web data management and data mining.
He has served on the technical program committees of various international conferences, including SIGMOD, VLDB and ICDE. He currently serves on the editorial boards of the VLDB Journal, TKDE, DAPD, and Information Systems.

Jing Li is currently a Ph.D. student in the Department of Computer Science and Engineering, University of California, San Diego. Prior to that, he obtained his bachelor's degree from Peking University. His research interests include databases, architecture and mobile computing.

Xuexuan Chen is a software engineer at Google Switzerland working on search ads quality. He obtained his B.Sc. and M.Sc. from the Department of Computer Science, Peking University, in 2010 and 2013, respectively. From 2008 to 2013, his research focused on flash-based database systems, especially performance evaluation, buffer management algorithms, and index structures for relational database systems on top of flash-based SSDs.
