CFLRU: A Replacement Algorithm for Flash Memory


CFLRU: A Replacement Algorithm for Flash Memory. CASES'06, October 23-25, 2006, Seoul, Korea. Copyright 2006 ACM 1-59593-543-6/06/0010. Yen-Ting Liu

Outline
- Introduction
- CFLRU Algorithm
- Simulation
- Implementation
- Conclusion

Introduction
- The characteristics of flash memory are significantly different from those of magnetic disks.
- Flash memory has no seek latency.
- Flash read and write operations are asymmetric in both performance and energy consumption.
- Flash memory does not support in-place updates.

Introduction: Motivation
- Most operating systems are customized for disk-based storage systems.
- Conventional replacement policies consider only the number of cache hits.
- This work proposes the CFLRU replacement algorithm.

Introduction
- Two kinds of replacement cost: one is incurred when a requested page is fetched from secondary storage into the page cache in RAM; the other is incurred when a page is evicted from the page cache back to secondary storage.
- Clean pages vs. dirty pages: evicting a clean page requires no write-back, so on flash it is much cheaper than evicting a dirty page.
- The policy must therefore balance replacement cost against the cache hit rate.

CFLRU Algorithm
- Clean-First LRU (CFLRU) divides the LRU list into two regions: a working region and a clean-first region.
- Example eviction order (P7 and P5 clean, P8 and P6 dirty):
  LRU:   P8, P7, P6, P5
  CFLRU: P7, P5, P8, P6 (clean pages in the clean-first region are evicted first)

CFLRU Algorithm
- The size of the clean-first region is called the window size.
- Too large a window increases the cache miss rate (useful clean pages are evicted too early).
- Too small a window increases the number of evicted dirty pages (fewer flash writes are avoided).
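The eviction rule above can be sketched in a few lines of Python (a toy model, not the authors' implementation; the list layout and the names are assumptions):

```python
def select_victim(lru_list, window):
    """Pick a CFLRU eviction victim.

    lru_list: list of (page_id, is_dirty) tuples, most recently used
    first; the last `window` entries form the clean-first region.
    """
    region = lru_list[-window:]                 # clean-first region
    for page_id, is_dirty in reversed(region):  # scan from the LRU end
        if not is_dirty:
            return page_id     # clean victim: eviction needs no flash write
    return lru_list[-1][0]     # region is all dirty: fall back to plain LRU
```

With the slide's example (P7 and P5 clean, P8 and P6 dirty, the window covering all four), the first victim is P7 rather than P8, matching the CFLRU eviction order P7, P5, P8, P6.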

CFLRU Algorithm: Cost Model
- Inputs: the cost of a flash write operation and the cost of a flash read operation.
- The benefit of CFLRU: the number of dirty pages that should have been evicted in LRU order but are kept in the cache (their write-backs are deferred).
- The cost of CFLRU: the number of clean pages evicted instead of dirty pages within the clean-first region, weighted by the probability of a future reference to a clean page i evicted at the k-th position.
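The slide's terms admit one plausible formalization (the symbols below are my labels for the quantities the slide names, not necessarily the paper's exact notation):

```latex
% C_w, C_r : cost of one flash write / one flash read
% N_d      : dirty pages kept in cache past their LRU eviction point
% p_i      : probability that clean page i, evicted early at the
%            k-th position of the clean-first region, is re-referenced
\text{Benefit}(\mathrm{CFLRU}) \approx N_d \cdot C_w,
\qquad
\text{Cost}(\mathrm{CFLRU}) \approx \sum_{i} p_i \cdot C_r .
```

CFLRU pays off when the deferred write-backs outweigh the extra re-reads, i.e. when Benefit > Cost; the window size is the knob that trades one against the other.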

CFLRU Algorithm

CFLRU Algorithm
- In the real world, it is not easy to determine the probability of a future reference.
- Static approach: find a good window size through repeated offline experiments.
- Dynamic approach: adjust the window size using periodically collected statistics on flash read and write operations.
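One way the periodic tuning could work (a hedged sketch; the rule, the cost weights, and all names are my assumptions, not the paper's algorithm): grow the clean-first window while write traffic dominates the sampled cost, shrink it when read misses dominate.

```python
def adjust_window(window, reads, writes, c_read=1.0, c_write=5.0,
                  min_w=1, max_w=64):
    """Adjust the clean-first window from one period's flash I/O counters.

    A larger window protects dirty pages longer (fewer flash writes)
    at the price of more clean-page misses, so widen it while the
    write side of the cost sample dominates.
    """
    if writes * c_write > reads * c_read:
        return min(max_w, window + 1)   # writes dominate: widen the window
    return max(min_w, window - 1)       # reads dominate: shrink it
```

A daemon would call this once per sampling period with the counters gathered since the last call.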

Simulation
- Simulation is performed with two types of real workload traces.
- Swap-system trace: virtual memory reference traces gathered with the Valgrind profiling tool.
- File-system trace: block reference traces gathered directly from the buffer cache under the ext2 file system.

Simulation
- Five different applications were chosen and executed on a Linux/x86 machine.
- Configuration: 32 MB SDRAM and 128 MB flash memory.

Simulation

Simulation (swap-system trace)
- CFLRU-static: replacement cost reduced by 28.4% compared with LRU.
- CFLRU-dynamic: reduced by 23.1% compared with LRU.

Simulation (file-system trace)
- CFLRU-static: replacement cost reduced by 26.2% compared with LRU.
- CFLRU-dynamic: reduced by 23.5% compared with LRU.

Implementation: original Linux page reclamation
- The page cache consists of two pseudo-LRU lists: the active list and the inactive list.
- When the kernel decides to make free space, it starts the reclaiming phase.
- First, it scans the pages in the inactive list.
- Second, if there are too many process-mapped pages, it starts the swap-out phase.

Implementation: CFLRU
- An additional reclamation function is inserted: clean pages in the inactive list are evicted first until enough pages have been freed.
- The priority given to clean pages corresponds to the window size of the clean-first region in CFLRU.
- A kernel daemon periodically measures the replacement cost and compares it with that of the last period.
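The clean-first pass over the inactive list can be illustrated with a small user-space model (a sketch only; the real change patches the kernel's reclaim path, and the data layout here is an assumption):

```python
def reclaim(inactive_list, nr_to_free):
    """Free nr_to_free pages from the inactive list, clean pages first.

    inactive_list: list of dicts with 'id' and 'dirty' keys, ordered
    oldest first. Returns the ids of the freed pages in eviction order.
    """
    freed = []
    # Pass 1: evict clean pages only (no flash write needed).
    for page in list(inactive_list):
        if len(freed) >= nr_to_free:
            return freed
        if not page["dirty"]:
            inactive_list.remove(page)
            freed.append(page["id"])
    # Pass 2: still short of the target, so write back and evict
    # dirty pages in plain LRU order.
    for page in list(inactive_list):
        if len(freed) >= nr_to_free:
            break
        inactive_list.remove(page)   # would trigger a flash write-back
        freed.append(page["id"])
    return freed
```

Bounding how far pass 1 may run ahead of LRU order is what plays the role of the clean-first window here.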

Implementation: optimization
- Swap read-ahead has a bad effect on the cache hit rate, so the sequential read-ahead function is removed.

Implementation: evaluation setup
- CFLRU is evaluated on a Pentium IV system with 32 MB SDRAM running Linux kernel 2.4.28, with 64 MB of flash memory for swap space and 256 MB of flash memory for the file system (ext2).
- Four Linux kernel variants are compared: the plain kernel, the kernel without swap read-ahead, CFLRU-static, and CFLRU-dynamic.
- The window size of CFLRU-static is 1/4 of the inactive list.
- Five applications: gcc, tar, diff, encoding, and a file system benchmark.

Implementation

Implementation: results (relative to the plain Linux kernel)
- Reduction: no read-ahead 2.4%, CFLRU-static 6.2%, CFLRU-dynamic 5.7%.
- Saving: no read-ahead 4.4%, CFLRU-static 11.4%, CFLRU-dynamic 12.1%.

Conclusion
- This paper presents the CFLRU replacement algorithm for flash memory.
- CFLRU tries to reduce the number of costly write operations.
- Both the static and dynamic variants of CFLRU reduce the replacement cost, in simulation and in the Linux implementation.