FlashBlox: Achieving Both Performance Isolation and Uniform Lifetime for Virtualized SSDs

Transcription:

FlashBlox: Achieving Both Performance Isolation and Uniform Lifetime for Virtualized SSDs Jian Huang Anirudh Badam Laura Caulfield Suman Nath Sudipta Sengupta Bikash Sharma Moinuddin K. Qureshi

Flash Has Changed Over the Last Decade Performance Improvement 100x lower latency 5,000x higher throughput 2

Flash Has Changed Over the Last Decade Performance Improvement 100x lower latency 5,000x higher throughput Increased Parallelism Dozens of parallel chips 2

Flash Has Changed Over the Last Decade Performance Improvement 100x lower latency 5,000x higher throughput Increased Parallelism Dozens of parallel chips Became Commodity Less than $0.3/GB 2

Flash Has Changed Over the Last Decade Performance Improvement 100x lower latency 5,000x higher throughput Increased Parallelism Dozens of parallel chips Became Commodity Less than $0.3/GB Significant improvements in flash 2

Shared Flash-Based Solid State Disk (SSD) in the Cloud. 3

Shared Flash-Based Solid State Disk (SSD) in the Cloud. SSDs are virtualized and shared in data centers 3

Performance Interference in Shared SSD. Write Read Flash Translation Layer Flash-based SSD: A Black Box 4

Performance Interference in Shared SSD. Write Read Read/write interference causes long (3x) tail latency! Flash Translation Layer Flash-based SSD: A Black Box 4

Performance Interference in Shared SSD. Write Read Flash Translation Layer 4

FlashBlox: Hardware Isolation in Cloud Storage. Flash Translation Layer 5

FlashBlox: Hardware Isolation in Cloud Storage. Leveraging parallel chips for hardware isolation Flash Translation Layer 5

Internal Parallelism Enables Hardware Isolation 6

Internal Parallelism Enables Hardware Isolation Channel-Level Parallelism 6

Internal Parallelism Enables Hardware Isolation Channel-Level Parallelism Die-Level Parallelism 6

Internal Parallelism Enables Hardware Isolation Channel-Level Parallelism Die-Level Parallelism Plane-Level Parallelism (Plane-level parallelism is constrained as each chip contains only one address buffer) 6

Internal Parallelism Enables Hardware Isolation Channel-Level Parallelism Die-Level Parallelism Plane-Level Parallelism Different parallelism levels provide different isolation guarantees 6
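To make the hierarchy concrete, here is a minimal data-model sketch (hypothetical types, not code from the talk or paper) of the parallelism levels named above; the default geometry matches the 16-channel, 4-chip, 4-plane device described in the evaluation later in the talk, and a vSSD pinned at a higher level of this tree gets a stronger isolation guarantee.

```python
# Minimal sketch (hypothetical types) of the channel/die/plane hierarchy.
from dataclasses import dataclass, field

@dataclass
class Plane:
    pid: int

@dataclass
class Die:
    did: int
    planes: list[Plane] = field(default_factory=list)

@dataclass
class Channel:
    cid: int
    dies: list[Die] = field(default_factory=list)

def build_ssd(n_channels: int = 16, dies_per_channel: int = 4,
              planes_per_die: int = 4) -> list[Channel]:
    # Defaults follow the evaluation platform (16 channels, 4 chips, 4 planes).
    return [Channel(c, [Die(d, [Plane(p) for p in range(planes_per_die)])
                        for d in range(dies_per_channel)])
            for c in range(n_channels)]
```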

New Abstractions for Hardware Isolation 7

New Abstractions for Hardware Isolation Virtual SSD (Channel Level) Virtual SSD (Die Level) Virtual SSD (Plane Level) High Medium Low 7

New Abstractions for Hardware Isolation Virtual SSD (Channel Level) Virtual SSD (Die Level) Virtual SSD (Plane Level) Software-based High Medium Low 7

Hardware Isolation Meets the Pay-As-You-Go Model in Cloud vSSD (Channel) vSSD (Die) vSSD (Software) 8

Hardware Isolation Meets the Pay-As-You-Go Model in Cloud Azure DocumentDB vSSD (Channel) Azure SQL Database vSSD (Die) Amazon DynamoDB vSSD (Software) 8

Hardware Isolation Meets the Pay-As-You-Go Model in Cloud (compared on throughput, single partition size, and price) Azure DocumentDB vSSD (Channel) Azure SQL Database vSSD (Die) Amazon DynamoDB vSSD (Software) 8

Hardware Isolation Meets the Pay-As-You-Go Model in Cloud Hundreds of vSSDs can be supported in a single server vSSD (Channel) vSSD (Die) vSSD (Software) 8

Impact of Hardware Isolation on SSD Lifetime. Flash Translation Layer 9

Impact of Hardware Isolation on SSD Lifetime [chart: average #blocks erased/sec per channel, 0 to 4.5, i.e. the average rate at which flash blocks are erased] Flash Translation Layer Flash blocks wear out at different rates under different workloads 9

Impact of Hardware Isolation on SSD Lifetime. Write Intensive Flash Translation Layer 9

App App App FlashBlox Challenges SSD Lifetime Performance Isolation 10

App App App FlashBlox Challenges SSD Lifetime Performance Isolation App App App SSD Lifetime Performance Isolation 10

FlashBlox: Swapping Channels for Wear Balance [chart: used erase cycles on Channels 1 to 4] Adjusting the wear imbalance at a coarser time granularity can achieve near-ideal SSD lifetime 11

FlashBlox: Swapping Channels for Wear Balance [chart: used erase cycles on Channels 1 to 4] The channel that has incurred the maximum wear-out is swapped with the channel that has the minimum rate of wear-out 11

FlashBlox: Swapping Channels for Wear Balance [chart: used erase cycles on Channels 1 to 4] Channel migration takes 15 minutes, once per 19 days; overall performance drops for only 0.04% of the time 11
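As a rough sketch of the selection rule on this slide (hypothetical names, not FlashBlox's actual code), the wear balancer pairs the channel with the most accumulated wear against the channel currently wearing most slowly and migrates their data in the background:

```python
# Sketch of the channel-pairing rule described above; the Channel record and
# its field names are assumptions made for illustration.
from dataclasses import dataclass

@dataclass
class Channel:
    cid: int
    used_erase_cycles: int   # cumulative wear accrued on this channel
    erase_rate: float        # recent blocks erased per second on this channel

def pick_swap_pair(channels: list[Channel]) -> tuple[Channel, Channel]:
    most_worn = max(channels, key=lambda c: c.used_erase_cycles)
    wearing_slowest = min(channels, key=lambda c: c.erase_rate)
    return most_worn, wearing_slowest

# The balancer then migrates the two channels' data in the background; the
# slides report ~15 minutes per migration, needed only once per ~19 days.
```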

How Frequently Should We Swap? Imbalance = MaxWear / AvgWear [chart: used erase cycles on Channels 1 to 4 as an app writing M cycles per round is rotated across the channels; the imbalance evolves through 4, 2, 4/3, 1, 8/5, 8/7, 4/3, ...] How many times should we swap within the SSD lifetime? 12

Quantifying the Swapping Frequency Assume there are N channels and a wear-imbalance target of 1 + x; after K rounds of cycling: Wear Imbalance = (MK + M) / (MK + M/N) = (K + 1) / (K + 1/N) <= (1 + x), where MK + M is the maximum wear-out and MK + M/N is the average wear-out 13

Quantifying the Swapping Frequency Assume there are N channels and a wear-imbalance target of 1 + x; after K rounds of cycling: Wear Imbalance = (MK + M) / (MK + M/N) = (K + 1) / (K + 1/N) <= (1 + x), which gives K >= (N - 1 - x) / (Nx) 13

Quantifying the Swapping Frequency Assume there are N channels and a wear-imbalance target of 1 + x; after K rounds of cycling: Wear Imbalance = (MK + M) / (MK + M/N) = (K + 1) / (K + 1/N) <= (1 + x), which gives K >= (N - 1 - x) / (Nx). If N = 16 and x = 0.1, then K ≈ 9, which means that after swapping NK ≈ 148 times we can guarantee the wear imbalance stays bounded by 1.1 13

Quantifying the Swapping Frequency Assume there are N channels and a wear-imbalance target of 1 + x; after K rounds of cycling: Wear Imbalance = (MK + M) / (MK + M/N) = (K + 1) / (K + 1/N) <= (1 + x), which gives K >= (N - 1 - x) / (Nx). For an SSD with a 5-year lifetime, swapping once per 12 days guarantees the channels stay well balanced even in the worst case. If N = 16 and x = 0.1, then K ≈ 9, which means that after swapping NK ≈ 148 times we can guarantee the wear imbalance stays bounded by 1.1 13
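For illustration, the bound can be evaluated directly; the snippet below plugs in the slide's example values (N = 16 channels, x = 0.1, and the slide's assumed 5-year worst-case lifetime) and reproduces its rounded figures.

```python
# Worked example of the swap-frequency bound K >= (N - 1 - x) / (N x).
N = 16                    # parallel channels
x = 0.1                   # tolerated imbalance above 1.0 (target MaxWear/AvgWear <= 1.1)
lifetime_days = 5 * 365   # assumed device lifetime from the slide's example

K = (N - 1 - x) / (N * x)        # rounds of cycling needed
total_swaps = N * K              # each round rotates data across all N channels
swap_interval_days = lifetime_days / total_swaps

print(f"K >= {K:.2f} rounds, about {total_swaps:.0f} swaps, "
      f"one swap every ~{swap_interval_days:.0f} days")
# -> K >= 9.31 rounds, about 149 swaps, one swap every ~12 days, matching the
#    slide's rounded figures (K ≈ 9, NK ≈ 148, swap once per ~12 days).
```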

Adaptive Wear Leveling in Practice [chart: used erase cycles on Channels 1 to 4 under four apps writing at different rates, leaving roughly M, M/3, M/2, and 0 used erase cycles] Using the erase rate as the trigger condition for swapping 14
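One plausible reading of that trigger condition, sketched below under assumed names and an assumed threshold (the slide does not spell out FlashBlox's exact policy), is to fire a swap whenever the per-channel erase rates drift too far apart:

```python
# Hedged sketch of "using erase rate as the trigger condition for swapping";
# the 1.1 threshold mirrors the earlier wear-imbalance target and is an
# assumption here, not a value taken from the talk's implementation.
def should_swap(erase_rates: list[float], target_imbalance: float = 1.1) -> bool:
    avg = sum(erase_rates) / len(erase_rates)
    if avg == 0:
        return False                      # nothing has been erased yet
    return max(erase_rates) / avg > target_imbalance
```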

Intra-Channel Wear Leveling [chart: used erase cycles on Channels 1 to 4] Dies will be swapped along with the channel migration + intra-chip wear leveling mechanisms 15
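A possible way to realize the die rotation mentioned above, sketched with hypothetical names and a pairing heuristic that is an assumption rather than the talk's stated algorithm: when a channel's contents are migrated, place data from the most-written source dies onto the least-worn destination dies.

```python
# Illustrative sketch only: pair heavily written source dies with lightly worn
# destination dies during a channel migration so dies in a channel age evenly.
def plan_die_placement(src_die_writes: list[int], dst_die_wear: list[int]) -> list[int]:
    src_order = sorted(range(len(src_die_writes)),
                       key=lambda d: src_die_writes[d], reverse=True)
    dst_order = sorted(range(len(dst_die_wear)), key=lambda d: dst_die_wear[d])
    mapping = [0] * len(src_die_writes)
    for s, d in zip(src_order, dst_order):
        mapping[s] = d                    # mapping[source_die] = destination_die
    return mapping
```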

FlashBlox Architecture App Resource Manager Channel-Level Wear Leveling Die-Level Wear Leveling Flash 16

FlashBlox Architecture App App Virtual SSD App Virtual SSD Resource Manager Channel-Level Wear Leveling Die-Level Wear Leveling Flash 16

FlashBlox Architecture App App Virtual SSD App Virtual SSD Resource Manager Isolation, Bandwidth & Capacity Requirement (Virtual SSD to Parallel Channels Mappings) Channel-Level Wear Leveling Die-Level Wear Leveling Flash 16

FlashBlox Architecture App App Virtual SSD App Virtual SSD Resource Manager Isolation, Bandwidth & Capacity Requirement, Pay-As-You-Go Model in Cloud (Virtual SSD to Parallel Channels Mappings) Channel-Level Wear Leveling Die-Level Wear Leveling Flash 16

FlashBlox Architecture App App Virtual SSD App Virtual SSD Resource Manager Isolation, Bandwidth & Capacity Requirement (Virtual SSD to Parallel Channels Mappings) Channel-Level Wear Leveling (Inter-Channel Swapping) Die-Level Wear Leveling Flash 16

FlashBlox Architecture App App Virtual SSD App Virtual SSD Resource Manager Isolation, Bandwidth & Capacity Requirement (Virtual SSD to Parallel Channels Mappings) Channel-Level Wear Leveling (Inter-Channel Swapping) Die-Level Wear Leveling (Intra-Channel Swapping) Other FTL Algorithms (Intra-Chip Swapping) Flash 16
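As a rough sketch of what the resource-manager box does with those requirements (the function name, the per-channel bandwidth figure, and the channel-only tier handled here are all assumptions for illustration, not FlashBlox's API):

```python
# Hedged sketch: turn a vSSD's isolation and bandwidth requirements into a set
# of dedicated channels. Per-channel bandwidth is an assumed placeholder value.
import math

def map_vssd(isolation: str, bandwidth_mb_s: float, free_channels: list[int],
             channel_bw_mb_s: float = 300.0) -> list[int]:
    """Return channel IDs dedicated to a new channel-isolated vSSD."""
    if isolation != "channel":
        raise NotImplementedError("die- and software-isolated tiers omitted from this sketch")
    needed = max(1, math.ceil(bandwidth_mb_s / channel_bw_mb_s))
    if needed > len(free_channels):
        raise RuntimeError("not enough unallocated channels for this vSSD")
    return [free_channels.pop() for _ in range(needed)]
```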

FlashBlox Experimental Setup: 16 channels, 4 chips, 4 planes, 16 KB page size; 14 data center workloads: Yahoo! Cloud Serving Benchmark, Bing Search / Index / PageRank, Transactional Database, Azure Storage 17

Tail Latency Reduction with FlashBlox [chart: 99th percentile latency in microseconds, 0 to 700, for App1 and App2 under software isolation vs. FlashBlox] YCSB workloads: A: session store recording recent actions, B: photo tagging, C: user profile cache, D: user status update, E: threaded conversations, F: user database; workload pairs A+A, A+B, A+C, A+D, A+E, A+F (Yahoo! Cloud Serving Benchmark). Tail latency reduction: 2.6x, average latency reduction: 1.4x 18

Impact of Migration on Application Performance [chart: Bing Search's latency in milliseconds, 0 to 0.6, over time in seconds, with and without migration; a 34% gap is highlighted during migration] Channel migration takes 15 minutes, once per 19 days; overall performance drops for only 0.04% of the time 19

FlashBlox Summary: 2.6x reduction in tail latency; near-ideal SSD lifetime; swap once per 19 days 20

Thanks! Jian Huang jian.huang@gatech.edu Anirudh Badam Laura Caulfield Suman Nath Sudipta Sengupta Bikash Sharma Moinuddin K. Qureshi Q&A 21