FlashBlox: Achieving Both Performance Isolation and Uniform Lifetime for Virtualized SSDs

Transcription:

FlashBlox: Achieving Both Performance Isolation and Uniform Lifetime for Virtualized SSDs Jian Huang Anirudh Badam Laura Caulfield Suman Nath Sudipta Sengupta Bikash Sharma Moinuddin K. Qureshi

Flash Has Changed Over the Last Decade Performance Improvement 100x lower latency 5,000x higher throughput 2

Flash Has Changed Over the Last Decade Performance Improvement 100x lower latency 5,000x higher throughput Increased Parallelism Dozens of parallel chips 2

Flash Has Changed Over the Last Decade Performance Improvement 100x lower latency 5,000x higher throughput Increased Parallelism Dozens of parallel chips Became Commodity Less than $0.3/GB 2

Flash Has Changed Over the Last Decade Performance Improvement 100x lower latency 5,000x higher throughput Increased Parallelism Dozens of parallel chips Became Commodity Less than $0.3/GB Significant improvements in flash 2

Shared Flash-Based Solid State Disk (SSD) in the Cloud. 3

Shared Flash-Based Solid State Disk (SSD) in the Cloud. SSDs are virtualized and shared in data centers 3

Performance Interference in Shared SSD. Write Read Flash Translation Layer Flash-based SSD: A Black Box 4

Performance Interference in Shared SSD. Write Read Read/write interference causes long (3x) tail latency! Flash Translation Layer Flash-based SSD: A Black Box 4

Performance Interference in Shared SSD. Write Read Flash Translation Layer 4

FlashBlox: Hardware Isolation in Cloud Storage. Flash Translation Layer 5

FlashBlox: Hardware Isolation in Cloud Storage. Leveraging parallel chips for hardware isolation Flash Translation Layer 5

Internal Parallelism Enables Hardware Isolation 6

Internal Parallelism Enables Hardware Isolation Channel-Level Parallelism 6

Internal Parallelism Enables Hardware Isolation Channel-Level Parallelism Die-Level Parallelism 6

Internal Parallelism Enables Hardware Isolation Channel-Level Parallelism Die-Level Parallelism Plane-Level Parallelism (Plane-level parallelism is constrained as each chip contains only one address buffer) 6

Internal Parallelism Enables Hardware Isolation Channel-Level Parallelism Die-Level Parallelism Plane-Level Parallelism Different parallelism levels provide different isolation guarantees 6
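To make the hierarchy concrete, here is a minimal data-model sketch (hypothetical types, not code from the talk or paper) of the parallelism levels named above; the default geometry matches the 16-channel, 4-chip, 4-plane device described in the evaluation later in the talk, and a vSSD pinned at a higher level of this tree gets a stronger isolation guarantee.

```python
# Minimal sketch (hypothetical types) of the channel/die/plane hierarchy.
from dataclasses import dataclass, field

@dataclass
class Plane:
    pid: int

@dataclass
class Die:
    did: int
    planes: list[Plane] = field(default_factory=list)

@dataclass
class Channel:
    cid: int
    dies: list[Die] = field(default_factory=list)

def build_ssd(n_channels: int = 16, dies_per_channel: int = 4,
              planes_per_die: int = 4) -> list[Channel]:
    # Defaults follow the evaluation platform (16 channels, 4 chips, 4 planes).
    return [Channel(c, [Die(d, [Plane(p) for p in range(planes_per_die)])
                        for d in range(dies_per_channel)])
            for c in range(n_channels)]
```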

New Abstractions for Hardware Isolation 7

New Abstractions for Hardware Isolation Virtual SSD (Channel Level) Virtual SSD (Die Level) Virtual SSD (Plane Level) High Medium Low 7

New Abstractions for Hardware Isolation Virtual SSD (Channel Level) Virtual SSD (Die Level) Virtual SSD (Plane Level) Software-based High Medium Low 7

Hardware Isolation Meets the Pay-As-You-Go Model in Cloud vSSD (Channel) vSSD (Die) vSSD (Software) 8

Hardware Isolation Meets the Pay-As-You-Go Model in Cloud Azure DocumentDB vSSD (Channel) Azure SQL Database vSSD (Die) Amazon DynamoDB vSSD (Software) 8

Hardware Isolation Meets the Pay-As-You-Go Model in Cloud (compared on throughput, single partition size, and price) Azure DocumentDB vSSD (Channel) Azure SQL Database vSSD (Die) Amazon DynamoDB vSSD (Software) 8

Hardware Isolation Meets the Pay-As-You-Go Model in Cloud Hundreds of vSSDs can be supported in a single server vSSD (Channel) vSSD (Die) vSSD (Software) 8

Impact of Hardware Isolation on SSD Lifetime. Flash Translation Layer 9

Impact of Hardware Isolation on SSD Lifetime [chart: average #blocks erased/sec per channel, 0 to 4.5, i.e. the average rate at which flash blocks are erased] Flash Translation Layer Flash blocks wear out at different rates under different workloads 9

Impact of Hardware Isolation on SSD Lifetime. Write Intensive Flash Translation Layer 9

App App App FlashBlox Challenges SSD Lifetime Performance Isolation 10

App App App FlashBlox Challenges SSD Lifetime Performance Isolation App App App SSD Lifetime Performance Isolation 10

FlashBlox: Swapping Channels for Wear Balance [chart: used erase cycles on Channels 1 to 4] Adjusting the wear imbalance at a coarser time granularity can achieve near-ideal SSD lifetime 11

FlashBlox: Swapping Channels for Wear Balance [chart: used erase cycles on Channels 1 to 4] The channel that has incurred the maximum wear-out is swapped with the channel that has the minimum rate of wear-out 11

FlashBlox: Swapping Channels for Wear Balance [chart: used erase cycles on Channels 1 to 4] Channel migration takes 15 minutes, once per 19 days; overall performance drops for only 0.04% of the time 11
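As a rough sketch of the selection rule on this slide (hypothetical names, not FlashBlox's actual code), the wear balancer pairs the channel with the most accumulated wear against the channel currently wearing most slowly and migrates their data in the background:

```python
# Sketch of the channel-pairing rule described above; the Channel record and
# its field names are assumptions made for illustration.
from dataclasses import dataclass

@dataclass
class Channel:
    cid: int
    used_erase_cycles: int   # cumulative wear accrued on this channel
    erase_rate: float        # recent blocks erased per second on this channel

def pick_swap_pair(channels: list[Channel]) -> tuple[Channel, Channel]:
    most_worn = max(channels, key=lambda c: c.used_erase_cycles)
    wearing_slowest = min(channels, key=lambda c: c.erase_rate)
    return most_worn, wearing_slowest

# The balancer then migrates the two channels' data in the background; the
# slides report ~15 minutes per migration, needed only once per ~19 days.
```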

How Frequently Should We Swap? Imbalance = MaxWear / AvgWear [chart: used erase cycles on Channels 1 to 4 as an app writing M cycles per round is rotated across the channels; the imbalance evolves through 4, 2, 4/3, 1, 8/5, 8/7, 4/3, ...] How many times should we swap within the SSD lifetime? 12

Quantifying the Swapping Frequency Assume there are N channels and a wear-imbalance target of 1 + x; after K rounds of cycling: Wear Imbalance = (MK + M) / (MK + M/N) = (K + 1) / (K + 1/N) <= (1 + x), where MK + M is the maximum wear-out and MK + M/N is the average wear-out 13

Quantifying the Swapping Frequency Assume there are N channels and a wear-imbalance target of 1 + x; after K rounds of cycling: Wear Imbalance = (MK + M) / (MK + M/N) = (K + 1) / (K + 1/N) <= (1 + x), which gives K >= (N - 1 - x) / (Nx) 13

Quantifying the Swapping Frequency Assume there are N channels and a wear-imbalance target of 1 + x; after K rounds of cycling: Wear Imbalance = (MK + M) / (MK + M/N) = (K + 1) / (K + 1/N) <= (1 + x), which gives K >= (N - 1 - x) / (Nx). If N = 16 and x = 0.1, then K ≈ 9, which means that after swapping NK ≈ 148 times we can guarantee the wear imbalance stays bounded by 1.1 13

Quantifying the Swapping Frequency Assume there are N channels and a wear-imbalance target of 1 + x; after K rounds of cycling: Wear Imbalance = (MK + M) / (MK + M/N) = (K + 1) / (K + 1/N) <= (1 + x), which gives K >= (N - 1 - x) / (Nx). For an SSD with a 5-year lifetime, swapping once per 12 days guarantees the channels stay well balanced even in the worst case. If N = 16 and x = 0.1, then K ≈ 9, which means that after swapping NK ≈ 148 times we can guarantee the wear imbalance stays bounded by 1.1 13
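For illustration, the bound can be evaluated directly; the snippet below plugs in the slide's example values (N = 16 channels, x = 0.1, and the slide's assumed 5-year worst-case lifetime) and reproduces its rounded figures.

```python
# Worked example of the swap-frequency bound K >= (N - 1 - x) / (N x).
N = 16                    # parallel channels
x = 0.1                   # tolerated imbalance above 1.0 (target MaxWear/AvgWear <= 1.1)
lifetime_days = 5 * 365   # assumed device lifetime from the slide's example

K = (N - 1 - x) / (N * x)        # rounds of cycling needed
total_swaps = N * K              # each round rotates data across all N channels
swap_interval_days = lifetime_days / total_swaps

print(f"K >= {K:.2f} rounds, about {total_swaps:.0f} swaps, "
      f"one swap every ~{swap_interval_days:.0f} days")
# -> K >= 9.31 rounds, about 149 swaps, one swap every ~12 days, matching the
#    slide's rounded figures (K ≈ 9, NK ≈ 148, swap once per ~12 days).
```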

Adaptive Wear Leveling in Practice [chart: used erase cycles on Channels 1 to 4 under four apps writing at different rates, leaving roughly M, M/3, M/2, and 0 used erase cycles] Using the erase rate as the trigger condition for swapping 14
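One plausible reading of that trigger condition, sketched below under assumed names and an assumed threshold (the slide does not spell out FlashBlox's exact policy), is to fire a swap whenever the per-channel erase rates drift too far apart:

```python
# Hedged sketch of "using erase rate as the trigger condition for swapping";
# the 1.1 threshold mirrors the earlier wear-imbalance target and is an
# assumption here, not a value taken from the talk's implementation.
def should_swap(erase_rates: list[float], target_imbalance: float = 1.1) -> bool:
    avg = sum(erase_rates) / len(erase_rates)
    if avg == 0:
        return False                      # nothing has been erased yet
    return max(erase_rates) / avg > target_imbalance
```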

Intra-Channel Wear Leveling [chart: used erase cycles on Channels 1 to 4] Dies will be swapped along with the channel migration + intra-chip wear leveling mechanisms 15
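A possible way to realize the die rotation mentioned above, sketched with hypothetical names and a pairing heuristic that is an assumption rather than the talk's stated algorithm: when a channel's contents are migrated, place data from the most-written source dies onto the least-worn destination dies.

```python
# Illustrative sketch only: pair heavily written source dies with lightly worn
# destination dies during a channel migration so dies in a channel age evenly.
def plan_die_placement(src_die_writes: list[int], dst_die_wear: list[int]) -> list[int]:
    src_order = sorted(range(len(src_die_writes)),
                       key=lambda d: src_die_writes[d], reverse=True)
    dst_order = sorted(range(len(dst_die_wear)), key=lambda d: dst_die_wear[d])
    mapping = [0] * len(src_die_writes)
    for s, d in zip(src_order, dst_order):
        mapping[s] = d                    # mapping[source_die] = destination_die
    return mapping
```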

FlashBlox Architecture App Resource Manager Channel-Level Wear Leveling Die-Level Wear Leveling Flash 16

FlashBlox Architecture App App Virtual SSD App Virtual SSD Resource Manager Channel-Level Wear Leveling Die-Level Wear Leveling Flash 16

FlashBlox Architecture App App Virtual SSD App Virtual SSD Resource Manager Isolation, Bandwidth & Capacity Requirement (Virtual SSD to Parallel Channels Mappings) Channel-Level Wear Leveling Die-Level Wear Leveling Flash 16

FlashBlox Architecture App App Virtual SSD App Virtual SSD Resource Manager Isolation, Bandwidth & Capacity Requirement, Pay-As-You-Go Model in Cloud (Virtual SSD to Parallel Channels Mappings) Channel-Level Wear Leveling Die-Level Wear Leveling Flash 16

FlashBlox Architecture App App Virtual SSD App Virtual SSD Resource Manager Isolation, Bandwidth & Capacity Requirement (Virtual SSD to Parallel Channels Mappings) Channel-Level Wear Leveling (Inter-Channel Swapping) Die-Level Wear Leveling Flash 16

FlashBlox Architecture App App Virtual SSD App Virtual SSD Resource Manager Isolation, Bandwidth & Capacity Requirement (Virtual SSD to Parallel Channels Mappings) Channel-Level Wear Leveling (Inter-Channel Swapping) Die-Level Wear Leveling (Intra-Channel Swapping) Other FTL Algorithms (Intra-Chip Swapping) Flash 16
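As a rough sketch of what the resource-manager box does with those requirements (the function name, the per-channel bandwidth figure, and the channel-only tier handled here are all assumptions for illustration, not FlashBlox's API):

```python
# Hedged sketch: turn a vSSD's isolation and bandwidth requirements into a set
# of dedicated channels. Per-channel bandwidth is an assumed placeholder value.
import math

def map_vssd(isolation: str, bandwidth_mb_s: float, free_channels: list[int],
             channel_bw_mb_s: float = 300.0) -> list[int]:
    """Return channel IDs dedicated to a new channel-isolated vSSD."""
    if isolation != "channel":
        raise NotImplementedError("die- and software-isolated tiers omitted from this sketch")
    needed = max(1, math.ceil(bandwidth_mb_s / channel_bw_mb_s))
    if needed > len(free_channels):
        raise RuntimeError("not enough unallocated channels for this vSSD")
    return [free_channels.pop() for _ in range(needed)]
```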

FlashBlox Experimental Setup: 16 channels, 4 chips, 4 planes, 16 KB page size; 14 data center workloads: Yahoo! Cloud Serving Benchmark, Bing Search / Index / PageRank, Transactional Database, Azure Storage 17

Tail Latency Reduction with FlashBlox [chart: 99th percentile latency in microseconds, 0 to 700, for App1 and App2 under software isolation vs. FlashBlox] YCSB workloads: A: session store recording recent actions, B: photo tagging, C: user profile cache, D: user status update, E: threaded conversations, F: user database; workload pairs A+A, A+B, A+C, A+D, A+E, A+F (Yahoo! Cloud Serving Benchmark). Tail latency reduction: 2.6x, average latency reduction: 1.4x 18

Impact of Migration on Application Performance [chart: Bing Search's latency in milliseconds, 0 to 0.6, over time in seconds, with and without migration; a 34% gap is highlighted during migration] Channel migration takes 15 minutes, once per 19 days; overall performance drops for only 0.04% of the time 19

FlashBlox Summary: 2.6x reduction in tail latency; near-ideal SSD lifetime; swap once per 19 days 20

Thanks! Jian Huang jian.huang@gatech.edu Anirudh Badam Laura Caulfield Suman Nath Sudipta Sengupta Bikash Sharma Moinuddin K. Qureshi Q&A 21