Optimizing Flash-based Key-value Cache Systems

Size: px
Start display at page:

Download "Optimizing Flash-based Key-value Cache Systems"

Transcription

1 Optimizing Flash-based Key-value Cache Systems Zhaoyan Shen, Feng Chen, Yichen Jia, Zili Shao Department of Computing, Hong Kong Polytechnic University Computer Science & Engineering, Louisiana State University The 8th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 16)

2 Key-value Information Key-value access is dominant in web services Many apps simply store and retrieve key-value pairs Key-value cache is the first line of defense Benefits: Improve throughput, reduce latency, reduce server load In-memory KV cache is popular (Memcache) High speed but has cost, power, capacity problem 1

3 Flash based Key-value Cache Key-value cache Flash SSD + SHA-1(Key_1) Key (Slab, Slot) Slab Log-Structured Logical Flash Space Hash Index (Memory) Key-value Slabs (Flash LBA) In-memory hash map to track key-to-value mapping Slabs are used in a log-structured way Updated value item written to a new location and old values recycled later 2

4 Critical Issues Redundant mapping Double garbage collection Over-over-provisioning 3

5 Critical Issue 1: Redundant Mapping Critical Issues Redundant mapping at application- and FTL-level KVC: An in-memory hash table (Key Slab, Offset) FTL: An on-device page mapping table (LBA PBA) Problems Two mapping structures (unnecessarily) co-exist at two levels A significant waste of on-device DRAM space (e.g., 1GB for 1TB) The on-device DRAM buffer is costly, unreliable, and could be used for buffering. SHA-1(Key) K1 (1, 2) Hash table AAA,BBB BBB,CCC DDD,EEE Slab Space KVC software Mapping LBA Mapping Table AAA, BBB BBB,CCC DDD,EEE PBA FTL Mapping in hardware 4

6 Critical Issue 2: Double Garbage Collection Garbage collection (GC) at app- and FTL- levels KVC: Recycle deleted or changed key-value items FTL: Recycle trimmed or changed pages Problems Semantic validity of a key-value entry is not seen at FTL Redundant data copy operation Critical Issues SHA-1(Key) K1 AAA,BBB AAA,BBB BBB,CCC 0 BBB,CCC 1 DDD,EEE 2 DDD,EEE (1, (2, 2) 1) 3 AAA,BBB 4 5 Hash table KVC-level GC Slab Space LBA Mapping Table FTL-level GC BBB,CCC DDD,EEE PBA 5

7 Critical Issue 3: Over-over-provisioning Critical Issues Over-provisioning at FTL-level FTL has a portion (20-30%) of flash as Over-Provisioning Space (OPS) OPS space is invisible and unusable to the host applications Problems SHA-1(Key) OPS is reserved for dealing the worst-case situation, as a storage Over-over provisioning for Key-value caches Key-value caches are dominated by read (GET) traffic, not writes Key-value cache hit ratio is highly sensitive to usable cache size If 20-30% space can be released, the cache hit ratio can be greatly improved K1 (1, 2) Hash table AAA,BBB BBB,CCC DDD,EEE Slab Space LBA Mapping Table AAA,BBB BBB,CCC DDD,EEE FFF,GGG HHH,III JJJ, KKK Unusable PBA 6

8 Semantic Gap Problem Critical Issues Key-value cache Flash SSD Semantic Gap Fine-grained GC Key-to-value mapping Validity of slab entries Physical data layout on flash Direct flash memory control Proper mapping granularity In the current SW/HW architecture, we can do little to address these three issues. 7

9 Optimization Goals Redundant mapping Single mapping Double garbage collection App-driven GC Over-over-provisioning Dynamic OPS 8

10 Design Overview Key-value Cache K/V Mapping Key-value Cache Slab Mgm. Cache Mgm. Slab Manager Key/Slab Mapping KVC-driven GC Cache Manager OPS Management Page Cache Operating Systems I/O Scheduling Device Driver Bad Block Mgm. Library (libssd) Slab/Flash Translation Operation Conversion Garbage Collection Bad Block Mgm. Flash SSD Page Mapping Flash Operations Wear Leveling OPS Mgm. Page Cache Operating Systems I/O Scheduling Open-channel SSD Device Driver Flash Operations An enhanced flash-aware key-value cache A thin intermediate library layer (libssd) A specialized flash memory PCI-E SSD hardware 9

11 An Enhanced Flash-aware Key-value Cache Key-value Cache Slab management Unified direct mapping Garbage collection OPS management 10

12 Key-value Cache Slab Management Our choice Directly use a flash block as a slab (8MB) Slab buffer: An in-memory slab maintained for each class Parallelize and asynchronize the slab write I/Os to the flash memory Round-robin allocation of in-flash slab for load-balancing across channels Key/Value In-Memory Slab Buffer Class i Class i+1 Class i+2 Class n Flash SSD Channel #0 Channel #1 Channel #2 Channel #3 Channel #4 11

13 Unified Direct Mapping Key-value Cache Collapse multiple levels of indirect mapping to only one Prior mapping: Key Slab LBA PBA Current mapping: Key Slab (i.e., Flash Block) Benefits Removes the time overhead for lookup intermediate layers (Speed+) Only one single must-have in-memory hash table is needed (Cost-) On-device RAM space can be completely removed (or for other uses) SHA-1(Key) SHA-1(Key) K1 (1, 2) K1 (Slab, Offset) AAA,BBB BBB,CCC DDD,EEE Hash table Hash table Slab Space AAA,BBB BBB,CCC Slab #1 DDD,EEE0 CDC, CDC1 2 Slab #2 3 Mapping Flash Memory Table AAA,BBB BBB,CCC DDD,EEE Flash Memory 12

14 Key-value Cache App-driven Garbage Collection One single GC is driven directly by key-value cache system All slab writes are in units of blocks (no need for device-level GC) GC is directly triggered and controlled by application-level KVC Two GC policies Greedy GC: the least occupied slab is evicted to move minimum slots Quick clean: the LRU slab is immediately dropped recycling valid slots Adaptively used for different circumstances (busy or idle) AAA,BBB BBB,CCC DDD,EEE copy AAA,BBB BBB,CCC DDD,EEE 333, , , ,999 eee,fff ccc,ddd bbb,ccc aaa,bbb Victim Slab Target Slab LRU Slab MRU Slab Greedy GC Quick Clean 13

15 OPS Over-Provisioning Space Management Key-value Cache Dynamically tuning OPS space online Rationale KVC is typically read-intensive and OPS can be small Goal keep just enough OPS space (adaptive to intensity of writes) OPS management policies Heuristic method: An OPS window (W L and W H ) to estimate size Low watermark hit Trigger quick clean, raise the OPS window High watermark hit OPS is over-allocated, lower the OPS window Max OSP W H Time W L Quick clean initiated 14

16 Experiments Preliminary Experiments Implementation Key-value cache on Twitter s Fatcache to fit hardward Libssd Library (621 lines of code in C) Experimental Setup Intel Xeon E-1225, 32GB Memory, 1TB Disk, Open-Channel SSD Ubuntu LTS, Linux , Ext4 filesystem Hardware: Open-channel SSD A PCI-E based with 12 channel, and 192 LUNs Direct control to the device (via ioctl interface) 15

17 Experiments SET Throughput and Latency Our Scheme Our Scheme Stock Fatcache Stock Fatcache SET Workloads: 40milliom requests of 1KB key/value pairs Both set throughput/latency from our scheme are the best 16

18 Conclusion Conclusion KV stores become critical as they are one of the most important building blocks in modern web infrastructures and high-performance data-intensive applications. We build a highly efficient flash-based cache system which enables three benefits, namely a unified single-level direct mapping, a cache-driven fine-grained garbage collection, and an adaptive over-provisioning scheme We are implementing a prototype on the Open-Channel SSD hardware and our preliminary results show that it is highly promising 17

19 Thank You!

20 Backup Slides 19

21 GET Throughput and Latency Our Scheme Our Scheme Stock Fatcache Stock Fatcache GET performance is largely determined by the raw speed GET latencies are among the lowest in the set of SSDs 20

22 SET Latencies A Zoom-in View Our Scheme Stock Fatchche + KingSpec The direct control successfully removes high cost GC effects Quick clean removes long I/Os under high write pressure 21

23 Block Erase Count Erase Count Quick Clean kicks in SSDSim Open-channel SSD Trace collected with running Fatcache on Samsung SSD Block trace is replayed on MSR s SSD simulator for erase count Our solution reduces erase count by 30% 22

24 Effect of the In-memory Slab Buffer 10x buffer size difference does not affect latency significantly A small (56MB) in-memory slab buffer is sufficient for use 23

DIDACache: An Integration of Device and Application for Flash-based Key-value Caching

DIDACache: An Integration of Device and Application for Flash-based Key-value Caching DIDACache: An Integration of Device and Application for Flash-based Key-value Caching ZHAOYAN SHEN, The Hong Kong Polytechnic University FENG CHEN, Louisiana State University YICHEN JIA, Louisiana State

More information

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching Kefei Wang and Feng Chen Louisiana State University SoCC '18 Carlsbad, CA Key-value Systems in Internet Services Key-value

More information

Optimizing Translation Information Management in NAND Flash Memory Storage Systems

Optimizing Translation Information Management in NAND Flash Memory Storage Systems Optimizing Translation Information Management in NAND Flash Memory Storage Systems Qi Zhang 1, Xuandong Li 1, Linzhang Wang 1, Tian Zhang 1 Yi Wang 2 and Zili Shao 2 1 State Key Laboratory for Novel Software

More information

Part II: Data Center Software Architecture: Topic 2: Key-value Data Management Systems. SkimpyStash: Key Value Store on Flash-based Storage

Part II: Data Center Software Architecture: Topic 2: Key-value Data Management Systems. SkimpyStash: Key Value Store on Flash-based Storage ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective Part II: Data Center Software Architecture: Topic 2: Key-value Data Management Systems SkimpyStash: Key Value

More information

FlashTier: A Lightweight, Consistent and Durable Storage Cache

FlashTier: A Lightweight, Consistent and Durable Storage Cache FlashTier: A Lightweight, Consistent and Durable Storage Cache Mohit Saxena PhD Candidate University of Wisconsin-Madison msaxena@cs.wisc.edu Flash Memory Summit 2012 Santa Clara, CA Flash is a Good Cache

More information

ParaFS: A Log-Structured File System to Exploit the Internal Parallelism of Flash Devices

ParaFS: A Log-Structured File System to Exploit the Internal Parallelism of Flash Devices ParaFS: A Log-Structured File System to Exploit the Internal Parallelism of Devices Jiacheng Zhang, Jiwu Shu, Youyou Lu Tsinghua University 1 Outline Background and Motivation ParaFS Design Evaluation

More information

SlimCache: Exploiting Data Compression Opportunities in Flash-based Key-value Caching

SlimCache: Exploiting Data Compression Opportunities in Flash-based Key-value Caching SlimCache: Exploiting Data Compression Opportunities in Flash-based Key-value Caching Yichen Jia Computer Science and Engineering Louisiana State University yjia@csc.lsu.edu Zili Shao Computer Science

More information

A File-System-Aware FTL Design for Flash Memory Storage Systems

A File-System-Aware FTL Design for Flash Memory Storage Systems 1 A File-System-Aware FTL Design for Flash Memory Storage Systems Po-Liang Wu, Yuan-Hao Chang, Po-Chun Huang, and Tei-Wei Kuo National Taiwan University 2 Outline Introduction File Systems Observations

More information

A Memory Management Scheme for Hybrid Memory Architecture in Mission Critical Computers

A Memory Management Scheme for Hybrid Memory Architecture in Mission Critical Computers A Memory Management Scheme for Hybrid Memory Architecture in Mission Critical Computers Soohyun Yang and Yeonseung Ryu Department of Computer Engineering, Myongji University Yongin, Gyeonggi-do, Korea

More information

Flash Memory Based Storage System

Flash Memory Based Storage System Flash Memory Based Storage System References SmartSaver: Turning Flash Drive into a Disk Energy Saver for Mobile Computers, ISLPED 06 Energy-Aware Flash Memory Management in Virtual Memory System, islped

More information

Toward SLO Complying SSDs Through OPS Isolation

Toward SLO Complying SSDs Through OPS Isolation Toward SLO Complying SSDs Through OPS Isolation October 23, 2015 Hongik University UNIST (Ulsan National Institute of Science & Technology) Sam H. Noh 1 Outline Part 1: FAST 2015 Part 2: Beyond FAST 2

More information

Open-Channel SSDs Offer the Flexibility Required by Hyperscale Infrastructure Matias Bjørling CNEX Labs

Open-Channel SSDs Offer the Flexibility Required by Hyperscale Infrastructure Matias Bjørling CNEX Labs Open-Channel SSDs Offer the Flexibility Required by Hyperscale Infrastructure Matias Bjørling CNEX Labs 1 Public and Private Cloud Providers 2 Workloads and Applications Multi-Tenancy Databases Instance

More information

Presented by: Nafiseh Mahmoudi Spring 2017

Presented by: Nafiseh Mahmoudi Spring 2017 Presented by: Nafiseh Mahmoudi Spring 2017 Authors: Publication: Type: ACM Transactions on Storage (TOS), 2016 Research Paper 2 High speed data processing demands high storage I/O performance. Flash memory

More information

White Paper: Understanding the Relationship Between SSD Endurance and Over-Provisioning. Solid State Drive

White Paper: Understanding the Relationship Between SSD Endurance and Over-Provisioning. Solid State Drive White Paper: Understanding the Relationship Between SSD Endurance and Over-Provisioning Solid State Drive 2 Understanding the Relationship Between Endurance and Over-Provisioning Each of the cells inside

More information

A Caching-Oriented FTL Design for Multi-Chipped Solid-State Disks. Yuan-Hao Chang, Wei-Lun Lu, Po-Chun Huang, Lue-Jane Lee, and Tei-Wei Kuo

A Caching-Oriented FTL Design for Multi-Chipped Solid-State Disks. Yuan-Hao Chang, Wei-Lun Lu, Po-Chun Huang, Lue-Jane Lee, and Tei-Wei Kuo A Caching-Oriented FTL Design for Multi-Chipped Solid-State Disks Yuan-Hao Chang, Wei-Lun Lu, Po-Chun Huang, Lue-Jane Lee, and Tei-Wei Kuo 1 June 4, 2011 2 Outline Introduction System Architecture A Multi-Chipped

More information

QLC Challenges. QLC SSD s Require Deep FTL Tuning Karl Schuh Micron. Flash Memory Summit 2018 Santa Clara, CA 1

QLC Challenges. QLC SSD s Require Deep FTL Tuning Karl Schuh Micron. Flash Memory Summit 2018 Santa Clara, CA 1 QLC Challenges QLC SSD s Require Deep FTL Tuning Karl Schuh Micron Santa Clara, CA 1 The Wonders of QLC TLC QLC Cost Capacity Performance Error Rate depends upon compensation for transaction history Endurance

More information

Linux Kernel Abstractions for Open-Channel SSDs

Linux Kernel Abstractions for Open-Channel SSDs Linux Kernel Abstractions for Open-Channel SSDs Matias Bjørling Javier González, Jesper Madsen, and Philippe Bonnet 2015/03/01 1 Market Specific FTLs SSDs on the market with embedded FTLs targeted at specific

More information

A Buffer Replacement Algorithm Exploiting Multi-Chip Parallelism in Solid State Disks

A Buffer Replacement Algorithm Exploiting Multi-Chip Parallelism in Solid State Disks A Buffer Replacement Algorithm Exploiting Multi-Chip Parallelism in Solid State Disks Jinho Seol, Hyotaek Shim, Jaegeuk Kim, and Seungryoul Maeng Division of Computer Science School of Electrical Engineering

More information

SFS: Random Write Considered Harmful in Solid State Drives

SFS: Random Write Considered Harmful in Solid State Drives SFS: Random Write Considered Harmful in Solid State Drives Changwoo Min 1, 2, Kangnyeon Kim 1, Hyunjin Cho 2, Sang-Won Lee 1, Young Ik Eom 1 1 Sungkyunkwan University, Korea 2 Samsung Electronics, Korea

More information

SHRD: Improving Spatial Locality in Flash Storage Accesses by Sequentializing in Host and Randomizing in Device

SHRD: Improving Spatial Locality in Flash Storage Accesses by Sequentializing in Host and Randomizing in Device SHRD: Improving Spatial Locality in Flash Storage Accesses by Sequentializing in Host and Randomizing in Device Hyukjoong Kim 1, Dongkun Shin 1, Yun Ho Jeong 2 and Kyung Ho Kim 2 1 Samsung Electronics

More information

arxiv: v1 [cs.db] 25 Nov 2018

arxiv: v1 [cs.db] 25 Nov 2018 Enabling Efficient Updates in KV Storage via Hashing: Design and Performance Evaluation Yongkun Li, Helen H. W. Chan, Patrick P. C. Lee, and Yinlong Xu University of Science and Technology of China The

More information

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili

Virtual Memory. Reading. Sections 5.4, 5.5, 5.6, 5.8, 5.10 (2) Lecture notes from MKP and S. Yalamanchili Virtual Memory Lecture notes from MKP and S. Yalamanchili Sections 5.4, 5.5, 5.6, 5.8, 5.10 Reading (2) 1 The Memory Hierarchy ALU registers Cache Memory Memory Memory Managed by the compiler Memory Managed

More information

FFS: The Fast File System -and- The Magical World of SSDs

FFS: The Fast File System -and- The Magical World of SSDs FFS: The Fast File System -and- The Magical World of SSDs The Original, Not-Fast Unix Filesystem Disk Superblock Inodes Data Directory Name i-number Inode Metadata Direct ptr......... Indirect ptr 2-indirect

More information

Using Transparent Compression to Improve SSD-based I/O Caches

Using Transparent Compression to Improve SSD-based I/O Caches Using Transparent Compression to Improve SSD-based I/O Caches Thanos Makatos, Yannis Klonatos, Manolis Marazakis, Michail D. Flouris, and Angelos Bilas {mcatos,klonatos,maraz,flouris,bilas}@ics.forth.gr

More information

ECE 598-MS: Advanced Memory and Storage Systems Lecture 7: Unified Address Translation with FlashMap

ECE 598-MS: Advanced Memory and Storage Systems Lecture 7: Unified Address Translation with FlashMap ECE 598-MS: Advanced Memory and Storage Systems Lecture 7: Unified Address Translation with Map Jian Huang Use As Non-Volatile Memory DRAM (nanoseconds) Application Memory Component SSD (microseconds)

More information

Middleware and Flash Translation Layer Co-Design for the Performance Boost of Solid-State Drives

Middleware and Flash Translation Layer Co-Design for the Performance Boost of Solid-State Drives Middleware and Flash Translation Layer Co-Design for the Performance Boost of Solid-State Drives Chao Sun 1, Asuka Arakawa 1, Ayumi Soga 1, Chihiro Matsui 1 and Ken Takeuchi 1 1 Chuo University Santa Clara,

More information

The Unwritten Contract of Solid State Drives

The Unwritten Contract of Solid State Drives The Unwritten Contract of Solid State Drives Jun He, Sudarsun Kannan, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau Department of Computer Sciences, University of Wisconsin - Madison Enterprise SSD

More information

SOS : Software-based Out-of-Order Scheduling for High-Performance NAND Flash-Based SSDs

SOS : Software-based Out-of-Order Scheduling for High-Performance NAND Flash-Based SSDs SOS : Software-based Out-of-Order Scheduling for High-Performance NAND Flash-Based SSDs Sangwook Shane Hahn, Sungjin Lee and Jihong Kim Computer Architecture & Embedded Systems Laboratory School of Computer

More information

SMORE: A Cold Data Object Store for SMR Drives

SMORE: A Cold Data Object Store for SMR Drives SMORE: A Cold Data Object Store for SMR Drives Peter Macko, Xiongzi Ge, John Haskins Jr.*, James Kelley, David Slik, Keith A. Smith, and Maxim G. Smith Advanced Technology Group NetApp, Inc. * Qualcomm

More information

VSSIM: Virtual Machine based SSD Simulator

VSSIM: Virtual Machine based SSD Simulator 29 th IEEE Conference on Mass Storage Systems and Technologies (MSST) Long Beach, California, USA, May 6~10, 2013 VSSIM: Virtual Machine based SSD Simulator Jinsoo Yoo, Youjip Won, Joongwoo Hwang, Sooyong

More information

PebblesDB: Building Key-Value Stores using Fragmented Log Structured Merge Trees

PebblesDB: Building Key-Value Stores using Fragmented Log Structured Merge Trees PebblesDB: Building Key-Value Stores using Fragmented Log Structured Merge Trees Pandian Raju 1, Rohan Kadekodi 1, Vijay Chidambaram 1,2, Ittai Abraham 2 1 The University of Texas at Austin 2 VMware Research

More information

A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU

A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU PRESENTED BY ROMAN SHOR Overview Technics of data reduction in storage systems:

More information

Azor: Using Two-level Block Selection to Improve SSD-based I/O caches

Azor: Using Two-level Block Selection to Improve SSD-based I/O caches Azor: Using Two-level Block Selection to Improve SSD-based I/O caches Yannis Klonatos, Thanos Makatos, Manolis Marazakis, Michail D. Flouris, Angelos Bilas {klonatos, makatos, maraz, flouris, bilas}@ics.forth.gr

More information

LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data

LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data Xingbo Wu Yuehai Xu Song Jiang Zili Shao The Hong Kong Polytechnic University The Challenge on Today s Key-Value Store Trends on workloads

More information

CFDC A Flash-aware Replacement Policy for Database Buffer Management

CFDC A Flash-aware Replacement Policy for Database Buffer Management CFDC A Flash-aware Replacement Policy for Database Buffer Management Yi Ou University of Kaiserslautern Germany Theo Härder University of Kaiserslautern Germany Peiquan Jin University of Science and Technology

More information

Chunling Wang, Dandan Wang, Yunpeng Chai, Chuanwen Wang and Diansen Sun Renmin University of China

Chunling Wang, Dandan Wang, Yunpeng Chai, Chuanwen Wang and Diansen Sun Renmin University of China Chunling Wang, Dandan Wang, Yunpeng Chai, Chuanwen Wang and Diansen Sun Renmin University of China Data volume is growing 44ZB in 2020! How to store? Flash arrays, DRAM-based storage: high costs, reliability,

More information

Application-Managed Flash

Application-Managed Flash Application-Managed Flash Sungjin Lee*, Ming Liu, Sangwoo Jun, Shuotao Xu, Jihong Kim and Arvind *Inha University Massachusetts Institute of Technology Seoul National University Operating System Support

More information

SUPA: A Single Unified Read-Write Buffer and Pattern-Change-Aware FTL for the High Performance of Multi-Channel SSD

SUPA: A Single Unified Read-Write Buffer and Pattern-Change-Aware FTL for the High Performance of Multi-Channel SSD SUPA: A Single Unified Read-Write Buffer and Pattern-Change-Aware FTL for the High Performance of Multi-Channel SSD DONGJIN KIM, KYU HO PARK, and CHAN-HYUN YOUN, KAIST To design the write buffer and flash

More information

Strata: A Cross Media File System. Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson

Strata: A Cross Media File System. Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson A Cross Media File System Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson 1 Let s build a fast server NoSQL store, Database, File server, Mail server Requirements

More information

OSSD: A Case for Object-based Solid State Drives

OSSD: A Case for Object-based Solid State Drives MSST 2013 2013/5/10 OSSD: A Case for Object-based Solid State Drives Young-Sik Lee Sang-Hoon Kim, Seungryoul Maeng, KAIST Jaesoo Lee, Chanik Park, Samsung Jin-Soo Kim, Sungkyunkwan Univ. SSD Desktop Laptop

More information

pblk the OCSSD FTL Linux FAST Summit 18 Javier González Copyright 2018 CNEX Labs

pblk the OCSSD FTL Linux FAST Summit 18 Javier González Copyright 2018 CNEX Labs pblk the OCSSD FTL Linux FAST Summit 18 Javier González Read Latency Read Latency with 0% Writes Random Read 4K Percentiles 2 Read Latency Read Latency with 20% Writes Random Read 4K + Random Write 4K

More information

File Systems. Kartik Gopalan. Chapter 4 From Tanenbaum s Modern Operating System

File Systems. Kartik Gopalan. Chapter 4 From Tanenbaum s Modern Operating System File Systems Kartik Gopalan Chapter 4 From Tanenbaum s Modern Operating System 1 What is a File System? File system is the OS component that organizes data on the raw storage device. Data, by itself, is

More information

LAST: Locality-Aware Sector Translation for NAND Flash Memory-Based Storage Systems

LAST: Locality-Aware Sector Translation for NAND Flash Memory-Based Storage Systems : Locality-Aware Sector Translation for NAND Flash Memory-Based Storage Systems Sungjin Lee, Dongkun Shin, Young-Jin Kim and Jihong Kim School of Information and Communication Engineering, Sungkyunkwan

More information

LightNVM: The Linux Open-Channel SSD Subsystem Matias Bjørling (ITU, CNEX Labs), Javier González (CNEX Labs), Philippe Bonnet (ITU)

LightNVM: The Linux Open-Channel SSD Subsystem Matias Bjørling (ITU, CNEX Labs), Javier González (CNEX Labs), Philippe Bonnet (ITU) ½ LightNVM: The Linux Open-Channel SSD Subsystem Matias Bjørling (ITU, CNEX Labs), Javier González (CNEX Labs), Philippe Bonnet (ITU) 0% Writes - Read Latency 4K Random Read Latency 4K Random Read Percentile

More information

Speeding Up Cloud/Server Applications Using Flash Memory

Speeding Up Cloud/Server Applications Using Flash Memory Speeding Up Cloud/Server Applications Using Flash Memory Sudipta Sengupta and Jin Li Microsoft Research, Redmond, WA, USA Contains work that is joint with Biplob Debnath (Univ. of Minnesota) Flash Memory

More information

Workload-Aware Elastic Striping With Hot Data Identification for SSD RAID Arrays

Workload-Aware Elastic Striping With Hot Data Identification for SSD RAID Arrays IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 36, NO. 5, MAY 2017 815 Workload-Aware Elastic Striping With Hot Data Identification for SSD RAID Arrays Yongkun Li,

More information

Solid State Drives (SSDs) Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Solid State Drives (SSDs) Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Solid State Drives (SSDs) Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Memory Types FLASH High-density Low-cost High-speed Low-power High reliability

More information

KVFTL - Optimization of Storage Space Utilization for Key-Value-Specific Flash Storage Devices

KVFTL - Optimization of Storage Space Utilization for Key-Value-Specific Flash Storage Devices KVFTL - Optimization of Storage Space Utilization for Key-Value-Specific Flash Storage Devices ASP-DAC 217 Yen-Ting Chen, Ming-Chang Yang, Yuan-Hao Chang, Tseng-Yi Chen, Hsin-Wen Wei, and Wei-Kuan Shih

More information

Don t stack your Log on my Log

Don t stack your Log on my Log Don t stack your Log on my Log Jingpei Yang, Ned Plasson, Greg Gillis, Nisha Talagala, Swaminathan Sundararaman Oct 5, 2014 c 1 Outline Introduction Log-stacking models Problems with stacking logs Solutions

More information

ASEP: An Adaptive Sequential Prefetching Scheme for Second-level Storage System

ASEP: An Adaptive Sequential Prefetching Scheme for Second-level Storage System ASEP: An Adaptive Sequential Prefetching Scheme for Second-level Storage System Xiaodong Shi Email: shixd.hust@gmail.com Dan Feng Email: dfeng@hust.edu.cn Wuhan National Laboratory for Optoelectronics,

More information

A New Key-Value Data Store For Heterogeneous Storage Architecture

A New Key-Value Data Store For Heterogeneous Storage Architecture A New Key-Value Data Store For Heterogeneous Storage Architecture brien.porter@intel.com wanyuan.yang@intel.com yuan.zhou@intel.com jian.zhang@intel.com Intel APAC R&D Ltd. 1 Agenda Introduction Background

More information

Percona Live September 21-23, 2015 Mövenpick Hotel Amsterdam

Percona Live September 21-23, 2015 Mövenpick Hotel Amsterdam Percona Live 2015 September 21-23, 2015 Mövenpick Hotel Amsterdam TokuDB internals Percona team, Vlad Lesin, Sveta Smirnova Slides plan Introduction in Fractal Trees and TokuDB Files Block files Fractal

More information

From server-side to host-side:

From server-side to host-side: From server-side to host-side: Flash memory for enterprise storage Jiri Schindler et al. (see credits) Advanced Technology Group NetApp May 9, 2012 v 1.0 Data Centers with Flash SSDs iscsi/nfs/cifs Shared

More information

Yiying Zhang, Leo Prasath Arulraj, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. University of Wisconsin - Madison

Yiying Zhang, Leo Prasath Arulraj, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. University of Wisconsin - Madison Yiying Zhang, Leo Prasath Arulraj, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau University of Wisconsin - Madison 1 Indirection Reference an object with a different name Flexible, simple, and

More information

Cooperating Write Buffer Cache and Virtual Memory Management for Flash Memory Based Systems

Cooperating Write Buffer Cache and Virtual Memory Management for Flash Memory Based Systems Cooperating Write Buffer Cache and Virtual Memory Management for Flash Memory Based Systems Liang Shi, Chun Jason Xue and Xuehai Zhou Joint Research Lab of Excellence, CityU-USTC Advanced Research Institute,

More information

FlashKV: Accelerating KV Performance with Open-Channel SSDs

FlashKV: Accelerating KV Performance with Open-Channel SSDs FlashKV: Accelerating KV Performance with Open-Channel SSDs JIACHENG ZHANG, YOUYOU LU, JIWU SHU, and XIONGJUN QIN, Department of Computer Science and Technology, Tsinghua University As the cost-per-bit

More information

OS-caused Long JVM Pauses - Deep Dive and Solutions

OS-caused Long JVM Pauses - Deep Dive and Solutions OS-caused Long JVM Pauses - Deep Dive and Solutions Zhenyun Zhuang LinkedIn Corp., Mountain View, California, USA https://www.linkedin.com/in/zhenyun Zhenyun@gmail.com 2016-4-21 Outline q Introduction

More information

FILE SYSTEMS, PART 2. CS124 Operating Systems Fall , Lecture 24

FILE SYSTEMS, PART 2. CS124 Operating Systems Fall , Lecture 24 FILE SYSTEMS, PART 2 CS124 Operating Systems Fall 2017-2018, Lecture 24 2 Last Time: File Systems Introduced the concept of file systems Explored several ways of managing the contents of files Contiguous

More information

1. Creates the illusion of an address space much larger than the physical memory

1. Creates the illusion of an address space much larger than the physical memory Virtual memory Main Memory Disk I P D L1 L2 M Goals Physical address space Virtual address space 1. Creates the illusion of an address space much larger than the physical memory 2. Make provisions for

More information

Computer Architecture and System Software Lecture 09: Memory Hierarchy. Instructor: Rob Bergen Applied Computer Science University of Winnipeg

Computer Architecture and System Software Lecture 09: Memory Hierarchy. Instructor: Rob Bergen Applied Computer Science University of Winnipeg Computer Architecture and System Software Lecture 09: Memory Hierarchy Instructor: Rob Bergen Applied Computer Science University of Winnipeg Announcements Midterm returned + solutions in class today SSD

More information

Open-Channel SSDs Then. Now. And Beyond. Matias Bjørling, March 22, Copyright 2017 CNEX Labs

Open-Channel SSDs Then. Now. And Beyond. Matias Bjørling, March 22, Copyright 2017 CNEX Labs Open-Channel SSDs Then. Now. And Beyond. Matias Bjørling, March 22, 2017 What is an Open-Channel SSD? Then Now - Physical Page Addressing v1.2 - LightNVM Subsystem - Developing for an Open-Channel SSD

More information

Flash-Conscious Cache Population for Enterprise Database Workloads

Flash-Conscious Cache Population for Enterprise Database Workloads IBM Research ADMS 214 1 st September 214 Flash-Conscious Cache Population for Enterprise Database Workloads Hyojun Kim, Ioannis Koltsidas, Nikolas Ioannou, Sangeetha Seshadri, Paul Muench, Clem Dickey,

More information

MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices

MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices MQSim: A Framework for Enabling Realistic Studies of Modern Multi-Queue SSD Devices Arash Tavakkol, Juan Gómez-Luna, Mohammad Sadrosadati, Saugata Ghose, Onur Mutlu February 13, 2018 Executive Summary

More information

Exploiting the benefits of native programming access to NVM devices

Exploiting the benefits of native programming access to NVM devices Exploiting the benefits of native programming access to NVM devices Ashish Batwara Principal Storage Architect Fusion-io Traditional Storage Stack User space Application Kernel space Filesystem LBA Block

More information

Red Hat Enterprise 7 Beta File Systems

Red Hat Enterprise 7 Beta File Systems Red Hat Enterprise 7 Beta File Systems New Scale, Speed & Features Ric Wheeler Director Red Hat Kernel File & Storage Team Red Hat Storage Engineering Agenda Red Hat Enterprise Linux 7 Storage Features

More information

The Oracle Database Appliance I/O and Performance Architecture

The Oracle Database Appliance I/O and Performance Architecture Simple Reliable Affordable The Oracle Database Appliance I/O and Performance Architecture Tammy Bednar, Sr. Principal Product Manager, ODA 1 Copyright 2012, Oracle and/or its affiliates. All rights reserved.

More information

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture)

EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture) EI 338: Computer Systems Engineering (Operating Systems & Computer Architecture) Dept. of Computer Science & Engineering Chentao Wu wuct@cs.sjtu.edu.cn Download lectures ftp://public.sjtu.edu.cn User:

More information

Performance Modeling and Analysis of Flash based Storage Devices

Performance Modeling and Analysis of Flash based Storage Devices Performance Modeling and Analysis of Flash based Storage Devices H. Howie Huang, Shan Li George Washington University Alex Szalay, Andreas Terzis Johns Hopkins University MSST 11 May 26, 2011 NAND Flash

More information

Implementation and Performance Evaluation of RAPID-Cache under Linux

Implementation and Performance Evaluation of RAPID-Cache under Linux Implementation and Performance Evaluation of RAPID-Cache under Linux Ming Zhang, Xubin He, and Qing Yang Department of Electrical and Computer Engineering, University of Rhode Island, Kingston, RI 2881

More information

SWAP: EFFECTIVE FINE-GRAIN MANAGEMENT

SWAP: EFFECTIVE FINE-GRAIN MANAGEMENT : EFFECTIVE FINE-GRAIN MANAGEMENT OF SHARED LAST-LEVEL CACHES WITH MINIMUM HARDWARE SUPPORT Xiaodong Wang, Shuang Chen, Jeff Setter, and José F. Martínez Computer Systems Lab Cornell University Page 1

More information

Parallelizing Inline Data Reduction Operations for Primary Storage Systems

Parallelizing Inline Data Reduction Operations for Primary Storage Systems Parallelizing Inline Data Reduction Operations for Primary Storage Systems Jeonghyeon Ma ( ) and Chanik Park Department of Computer Science and Engineering, POSTECH, Pohang, South Korea {doitnow0415,cipark}@postech.ac.kr

More information

Maximizing Data Center and Enterprise Storage Efficiency

Maximizing Data Center and Enterprise Storage Efficiency Maximizing Data Center and Enterprise Storage Efficiency Enterprise and data center customers can leverage AutoStream to achieve higher application throughput and reduced latency, with negligible organizational

More information

Design Considerations for Using Flash Memory for Caching

Design Considerations for Using Flash Memory for Caching Design Considerations for Using Flash Memory for Caching Edi Shmueli, IBM XIV Storage Systems edi@il.ibm.com Santa Clara, CA August 2010 1 Solid-State Storage In a few decades solid-state storage will

More information

A Self Learning Algorithm for NAND Flash Controllers

A Self Learning Algorithm for NAND Flash Controllers A Self Learning Algorithm for NAND Flash Controllers Hao Zhi, Lee Firmware Manager Core Storage Electronics Corp./Phison Electronics Corp. haozhi_lee@phison.com Santa Clara, CA 1 Outline Basic FW Architecture

More information

BPCLC: An Efficient Write Buffer Management Scheme for Flash-Based Solid State Disks

BPCLC: An Efficient Write Buffer Management Scheme for Flash-Based Solid State Disks BPCLC: An Efficient Write Buffer Management Scheme for Flash-Based Solid State Disks Hui Zhao 1, Peiquan Jin *1, Puyuan Yang 1, Lihua Yue 1 1 School of Computer Science and Technology, University of Science

More information

Linux Kernel Extensions for Open-Channel SSDs

Linux Kernel Extensions for Open-Channel SSDs Linux Kernel Extensions for Open-Channel SSDs Matias Bjørling Member of Technical Staff Santa Clara, CA 1 The Future of device FTLs? Dealing with flash chip constrains is a necessity No way around the

More information

CHAPTER 6 Memory. CMPS375 Class Notes (Chap06) Page 1 / 20 Dr. Kuo-pao Yang

CHAPTER 6 Memory. CMPS375 Class Notes (Chap06) Page 1 / 20 Dr. Kuo-pao Yang CHAPTER 6 Memory 6.1 Memory 341 6.2 Types of Memory 341 6.3 The Memory Hierarchy 343 6.3.1 Locality of Reference 346 6.4 Cache Memory 347 6.4.1 Cache Mapping Schemes 349 6.4.2 Replacement Policies 365

More information

Storage Systems : Disks and SSDs. Manu Awasthi CASS 2018

Storage Systems : Disks and SSDs. Manu Awasthi CASS 2018 Storage Systems : Disks and SSDs Manu Awasthi CASS 2018 Why study storage? Scalable High Performance Main Memory System Using Phase-Change Memory Technology, Qureshi et al, ISCA 2009 Trends Total amount

More information

BROMS: Best Ratio of MLC to SLC

BROMS: Best Ratio of MLC to SLC BROMS: Best Ratio of MLC to SLC Wei Wang 1, Tao Xie 2, Deng Zhou 1 1 Computational Science Research Center, San Diego State University 2 Computer Science Department, San Diego State University Partitioned

More information

Using Synology SSD Technology to Enhance System Performance Synology Inc.

Using Synology SSD Technology to Enhance System Performance Synology Inc. Using Synology SSD Technology to Enhance System Performance Synology Inc. Synology_WP_ 20121112 Table of Contents Chapter 1: Enterprise Challenges and SSD Cache as Solution Enterprise Challenges... 3 SSD

More information

Storage Architecture and Software Support for SLC/MLC Combined Flash Memory

Storage Architecture and Software Support for SLC/MLC Combined Flash Memory Storage Architecture and Software Support for SLC/MLC Combined Flash Memory Soojun Im and Dongkun Shin Sungkyunkwan University Suwon, Korea {lang33, dongkun}@skku.edu ABSTRACT We propose a novel flash

More information

Rechnerarchitektur (RA)

Rechnerarchitektur (RA) 12 Rechnerarchitektur (RA) Sommersemester 2017 Flash Memory 2017/07/12 Jian-Jia Chen (Slides are based on Tei-Wei Kuo and Yuan-Hao Chang) Informatik 12 Jian-jia.chen@tu-.. http://ls12-www.cs.tu.de/daes/

More information

Clustered Page-Level Mapping for Flash Memory-Based Storage Devices

Clustered Page-Level Mapping for Flash Memory-Based Storage Devices H. Kim and D. Shin: ed Page-Level Mapping for Flash Memory-Based Storage Devices 7 ed Page-Level Mapping for Flash Memory-Based Storage Devices Hyukjoong Kim and Dongkun Shin, Member, IEEE Abstract Recent

More information

BCStore: Bandwidth-Efficient In-memory KV-Store with Batch Coding. Shenglong Li, Quanlu Zhang, Zhi Yang and Yafei Dai Peking University

BCStore: Bandwidth-Efficient In-memory KV-Store with Batch Coding. Shenglong Li, Quanlu Zhang, Zhi Yang and Yafei Dai Peking University BCStore: Bandwidth-Efficient In-memory KV-Store with Batch Coding Shenglong Li, Quanlu Zhang, Zhi Yang and Yafei Dai Peking University Outline Introduction and Motivation Our Design System and Implementation

More information

Utilitarian Performance Isolation in Shared SSDs

Utilitarian Performance Isolation in Shared SSDs Utilitarian Performance Isolation in Shared SSDs Bryan S. Kim (Seoul National University, Korea) Flash storage landscape Exploit parallelism! Expose parallelism! Host system NVMe MQ blk IO (SYSTOR13) NVMeDirect

More information

<Insert Picture Here> Btrfs Filesystem

<Insert Picture Here> Btrfs Filesystem Btrfs Filesystem Chris Mason Btrfs Goals General purpose filesystem that scales to very large storage Feature focused, providing features other Linux filesystems cannot Administration

More information

S-FTL: An Efficient Address Translation for Flash Memory by Exploiting Spatial Locality

S-FTL: An Efficient Address Translation for Flash Memory by Exploiting Spatial Locality S-FTL: An Efficient Address Translation for Flash Memory by Exploiting Spatial Locality Song Jiang, Lei Zhang, Xinhao Yuan, Hao Hu, and Yu Chen Department of Electrical and Computer Engineering Wayne State

More information

Gecko: Contention-Oblivious Disk Arrays for Cloud Storage

Gecko: Contention-Oblivious Disk Arrays for Cloud Storage Gecko: Contention-Oblivious Disk Arrays for Cloud Storage Ji-Yong Shin Cornell University In collaboration with Mahesh Balakrishnan (MSR SVC), Tudor Marian (Google), and Hakim Weatherspoon (Cornell) FAST

More information

Warming up Storage-level Caches with Bonfire

Warming up Storage-level Caches with Bonfire Warming up Storage-level Caches with Bonfire Yiying Zhang Gokul Soundararajan Mark W. Storer Lakshmi N. Bairavasundaram Sethuraman Subbiah Andrea C. Arpaci-Dusseau Remzi H. Arpaci-Dusseau 2 Does on-demand

More information

Cache Memory COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals

Cache Memory COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals Cache Memory COE 403 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Presentation Outline The Need for Cache Memory The Basics

More information

[537] Flash. Tyler Harter

[537] Flash. Tyler Harter [537] Flash Tyler Harter Flash vs. Disk Disk Overview I/O requires: seek, rotate, transfer Inherently: - not parallel (only one head) - slow (mechanical) - poor random I/O (locality around disk head) Random

More information

SRM-Buffer: An OS Buffer Management SRM-Buffer: An OS Buffer Management Technique toprevent Last Level Cache from Thrashing in Multicores

SRM-Buffer: An OS Buffer Management SRM-Buffer: An OS Buffer Management Technique toprevent Last Level Cache from Thrashing in Multicores SRM-Buffer: An OS Buffer Management SRM-Buffer: An OS Buffer Management Technique toprevent Last Level Cache from Thrashing in Multicores Xiaoning Ding The Ohio State University dingxn@cse.ohiostate.edu

More information

Integrating Flash Memory into the Storage Hierarchy

Integrating Flash Memory into the Storage Hierarchy Integrating Flash Memory into the Storage Hierarchy A DISSERTATION SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Biplob Kumar Debnath IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

More information

Performance Trade-Offs in Using NVRAM Write Buffer for Flash Memory-Based Storage Devices

Performance Trade-Offs in Using NVRAM Write Buffer for Flash Memory-Based Storage Devices Performance Trade-Offs in Using NVRAM Write Buffer for Flash Memory-Based Storage Devices Sooyong Kang, Sungmin Park, Hoyoung Jung, Hyoki Shim, and Jaehyuk Cha IEEE TRANSACTIONS ON COMPUTERS, VOL. 8, NO.,

More information

Decentralized Distributed Storage System for Big Data

Decentralized Distributed Storage System for Big Data Decentralized Distributed Storage System for Big Presenter: Wei Xie -Intensive Scalable Computing Laboratory(DISCL) Computer Science Department Texas Tech University Outline Trends in Big and Cloud Storage

More information

HP AutoRAID (Lecture 5, cs262a)

HP AutoRAID (Lecture 5, cs262a) HP AutoRAID (Lecture 5, cs262a) Ion Stoica, UC Berkeley September 13, 2016 (based on presentation from John Kubiatowicz, UC Berkeley) Array Reliability Reliability of N disks = Reliability of 1 Disk N

More information

Parser. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text.

Parser. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text. Lifecycle of an SQL Query CSE 190D base System Implementation Arun Kumar Query Query Result

More information

COS 318: Operating Systems. NSF, Snapshot, Dedup and Review

COS 318: Operating Systems. NSF, Snapshot, Dedup and Review COS 318: Operating Systems NSF, Snapshot, Dedup and Review Topics! NFS! Case Study: NetApp File System! Deduplication storage system! Course review 2 Network File System! Sun introduced NFS v2 in early

More information

I, J A[I][J] / /4 8000/ I, J A(J, I) Chapter 5 Solutions S-3.

I, J A[I][J] / /4 8000/ I, J A(J, I) Chapter 5 Solutions S-3. 5 Solutions Chapter 5 Solutions S-3 5.1 5.1.1 4 5.1.2 I, J 5.1.3 A[I][J] 5.1.4 3596 8 800/4 2 8 8/4 8000/4 5.1.5 I, J 5.1.6 A(J, I) 5.2 5.2.1 Word Address Binary Address Tag Index Hit/Miss 5.2.2 3 0000

More information

dedupv1: Improving Deduplication Throughput using Solid State Drives (SSD)

dedupv1: Improving Deduplication Throughput using Solid State Drives (SSD) University Paderborn Paderborn Center for Parallel Computing Technical Report dedupv1: Improving Deduplication Throughput using Solid State Drives (SSD) Dirk Meister Paderborn Center for Parallel Computing

More information