Chunling Wang, Dandan Wang, Yunpeng Chai, Chuanwen Wang and Diansen Sun, Renmin University of China


Data volume is growing rapidly: 44 ZB are expected by 2020. How should it be stored? Flash arrays and DRAM-based storage suffer from high cost, reliability concerns, or limited capacity. Conventional Magnetic Recording (CMR) faces a recording-density limit (about 1 Tbit/in²). Shingled Magnetic Recording (SMR) is larger and cheaper, but slower. SSD + SMR hybrid storage: larger, cheaper, but can it also be faster?

Is hybrid storage + LRU faster? Not necessarily. Why? LRU does not consider the write amplification of the SMR disk; cache hit rate has to be weighed against SMR write amplification. The goal is still to be larger, cheaper, and faster.

SMR disks use overlapped tracks, so writing a single block requires rewriting the whole band it belongs to. For example, a Seagate 5 TB SMR disk (ST5000AS0011) has a 20 GB non-overlapped write buffer that is cleaned in an aggressive FIFO manner, with a band size of 17~36 MB. Maximum write amplification: 5 TB / 20 GB = 256!
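Spelled out, the worst-case bound quoted above is the ratio of drive capacity to write-buffer size (treating 5 TB as 5 x 1024 GB is my reading of how the slide arrives at 256 rather than 250):

\[
\mathrm{WA}_{\max} = \frac{\text{drive capacity}}{\text{write-buffer size}} = \frac{5 \times 1024\ \mathrm{GB}}{20\ \mathrm{GB}} = 256
\]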

Measured on the SMR disk, the penalty ranges from 41.3x to 171.5x, 113.0x on average.

Restricting writes to a small LBA range brings SMR close to CMR: 93.2% of CMR performance, compared with only 8.6% of CMR over the full range (ratios of 1.37x versus 15.11x in the measured figure).

Write requests first land in the SSD cache; S denotes the volume of data evicted to the SMR disk and Rwa the write amplification it incurs. Comparing (a) SMR-only storage, (b) hybrid storage with LRU and similar policies, and (c) hybrid storage with a better plan that limits the written LBA range and thereby lowers SMR write amplification, we get S1 x Rwa1 > S2 x Rwa2. The challenge is a pair of conflicting objectives: a high cache hit rate versus a low SMR write amplification.
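Reading the inequality as the total write traffic that reaches the shingled region (my restatement of the slide's notation):

\[
W_{\mathrm{SMR}} = S \times R_{wa}, \qquad S_1 \times R_{wa,1} > S_2 \times R_{wa,2}
\]

so plan (c) wins even if its lower hit rate forces it to evict somewhat more data, as long as confining evictions to a narrow LBA range keeps R_{wa,2} small enough.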

Partially Open Region for Eviction (PORE): a new SMR-oriented cache framework. The SSD cache holds both clean and dirty blocks. To reduce write amplification, the SMR LBA range is divided into an Open Region and a Forbidden Region, and dirty blocks are flushed only to the Open Region. To protect cache hit rates, eviction stays at block granularity, driven by hot/cold ordering, and the region division is recomputed periodically.

The basic unit of region division is the zone: each zone is either an Open Zone or a Forbidden Zone, and zones are aligned with SMR bands.

Eviction walkthrough (the SSD cache holds blocks c, b, a ordered from hot to cold): a is chosen for eviction first, but a is a dirty block from a Forbidden Zone, so it is skipped.

Next, b is chosen: b is clean, so it is evicted without being flushed to the SMR disk.

Then c is chosen: c is a dirty block from an Open Zone, so it is flushed to the SMR disk and evicted.

After flushing, c resides on the SMR disk and only a remains in the SSD cache.

After a period elapses, the Open Zones and Forbidden Zones are re-divided. A code sketch of the whole eviction loop follows.
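A minimal sketch of the eviction loop just illustrated, assuming an LRU-ordered cache; the zone size, helper names, and data structures are illustrative assumptions, not taken from the talk:

from dataclasses import dataclass

ZONE_SIZE_BLOCKS = 65536        # assumed zone size in 4 KB blocks (256 MB); not from the talk

@dataclass
class Block:
    lba: int                    # logical block number on the SMR disk
    dirty: bool                 # True if the cached copy is newer than the on-disk copy

def zone_of(block: Block) -> int:
    return block.lba // ZONE_SIZE_BLOCKS

def evict_one(lru, open_zones, smr_writes):
    # lru: list of Blocks ordered cold -> hot
    # open_zones: set of zone ids currently allowed to receive flushes
    # smr_writes: list collecting blocks flushed to the SMR disk
    for i, block in enumerate(lru):
        if not block.dirty:
            return lru.pop(i)               # clean: drop it, no SMR write
        if zone_of(block) in open_zones:
            smr_writes.append(block)        # dirty block in an Open Zone: flush first
            return lru.pop(i)
        # dirty block in a Forbidden Zone: skip it, keep it cached
    return None                             # nothing evictable this round

In the walkthrough above, a is skipped (dirty, Forbidden Zone), b is dropped immediately (clean), and c is flushed and then evicted (dirty, Open Zone).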

Which zones should feed evictions from the SSD cache? Zones in Z4 become Open Zones. Which zones should be protected in the SSD cache? Zones in Z1 become Forbidden Zones. Zones in Z2 and Z3 need further consideration.

Three division policies: Coverage First (CF) minimizes SMR write amplification; Popularity First (PF) maximizes cache hit rates; Balancing Zone Coverage and Popularity (BL) considers both, as sketched below.
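A rough sketch of how the three policies could rank zones each period; the normalized coverage/popularity statistics, the scoring formulas, and the 0.5 weight are illustrative assumptions, not the definitions used in the talk:

def divide_zones(zones, open_budget, policy="BL"):
    # zones: dict mapping zone id -> (coverage, popularity), both normalized to [0, 1]
    #   coverage   ~ share of the zone's blocks currently dirty in the SSD cache
    #   popularity ~ recent access frequency of the zone
    # Returns the set of zone ids to treat as Open Zones; the rest become Forbidden.
    def score(stats):
        coverage, popularity = stats
        if policy == "CF":                   # Coverage First: minimize write amplification
            return coverage
        if policy == "PF":                   # Popularity First: protect cache hit rates
            return -popularity
        return coverage - 0.5 * popularity   # BL: balance both objectives

    ranked = sorted(zones, key=lambda z: score(zones[z]), reverse=True)
    return set(ranked[:open_budget])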

Evaluation on an SSD-SMR prototype storage system (https://github.com/wcl14/smr-ssd-cache) consisting of a trace replay module, an SSD cache module, an SMR disk emulator module, and a statistics module. Testbed:
System: Linux 2.6.32
DRAM: 8 GB
CMR: 500 GB, 7200 RPM
SSD: 240 GB, PCIe
SMR: 5 TB, 5900 RPM

Traces

Managing a read-write cache, total I/O time improvement: 11.8x vs. SMR-only, 3.3x vs. CMR-only, 2.8x vs. LRU, 1.6x vs. LRU-band.

Managing a read-write cache, write amplification: SMR-only 45.60, LRU 53.28, LRU-band 28.37, PORE 12.80. Compared with LRU, the read hit rate differs by 16.15% and the write hit rate by 4.9%.

Comparing the division policies: in total I/O time, BL is always fastest. In write hit rate, PF is highest and CF/BL are slightly lower. In write amplification, CF/BL are much smaller.

Parameter sensitivity: a setting of 10 MB~20 MB is best in two of the three plots; the third shows no effect.

Period length: a ratio of period length / SMR write buffer = 1 is best. Longer is better, but only up to a limit.

Written LBA range over time: for the mix trace it reaches 1.15 TB after 33.4 hours; the plot also reports 87.5%, 85.0%, 13.0 hours and 5.69 hours.

Conclusion: with PORE, SSD+SMR hybrid storage can serve as primary storage in many more situations. PORE introduces a new way to reduce SMR write amplification and improve hybrid storage performance by limiting the written LBA range. Compared with LRU, PORE improves performance by 2.84x on average; compared with CMR-only, by 3.26x on average.

https://github.com/wcl14/smr-ssd-cache ypchai@ruc.edu.cn

Managing a write-only cache, total I/O time improvement: 34.9x vs. SMR-only, 10.2x vs. CMR-only, 4.60x vs. LRU, 2.02x vs. LRU-band. Write amplification: SMR-only 43.85, LRU 52.41, LRU-band 21.99, PORE 7.76. Compared with LRU, the write hit rate differs by 7.89%.