Robert Gottstein, Ilia Petrov, Guillermo G. Almeida, Todor Ivanov, Alex Buchmann

Similar documents
SIAS-Chains: Snapshot Isolation Append Storage Chains. Dr. Robert Gottstein Prof. Ilia Petrov M.Sc. Sergej Hardock Prof. Alejandro Buchmann

SICV Snapshot Isolation with Co-Located Versions

Performance Modeling and Analysis of Flash based Storage Devices

AN ALTERNATIVE TO ALL- FLASH ARRAYS: PREDICTIVE STORAGE CACHING

Page Size Selection for OLTP Databases on SSD RAID Storage

High Performance SSD & Benefit for Server Application

The Benefits of Solid State in Enterprise Storage Systems. David Dale, NetApp

Moneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories

Presented by: Nafiseh Mahmoudi Spring 2017

Webinar Series: Triangulate your Storage Architecture with SvSAN Caching. Luke Pruen Technical Services Director

Toward Seamless Integration of RAID and Flash SSD

Solid State Drives (SSDs) Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Storage Systems : Disks and SSDs. Manu Awasthi July 6 th 2018 Computer Architecture Summer School 2018

Join Processing for Flash SSDs: Remembering Past Lessons

VSSIM: Virtual Machine based SSD Simulator

CSE 451: Operating Systems Spring Module 12 Secondary Storage

Calypso Blind Survey 2010

Adaptec Series 7 SAS/SATA RAID Adapters

Storage Systems : Disks and SSDs. Manu Awasthi CASS 2018

Flash Trends: Challenges and Future

Interface Trends for the Enterprise I/O Highway

I N V E N T I V E. SSD Firmware Complexities and Benefits from NVMe. Steven Shrader

Comparing Performance of Solid State Devices and Mechanical Disks

NVMFS: A New File System Designed Specifically to Take Advantage of Nonvolatile Memory

Next Generation Architecture for NVM Express SSD

Why Does Solid State Disk Lower CPI?

Key Points. Rotational delay vs seek delay Disks are slow. Techniques for making disks faster. Flash and SSDs

NAND Flash-based Storage. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

BIG DATA AND HADOOP ON THE ZFS STORAGE APPLIANCE

LEVERAGING FLASH MEMORY in ENTERPRISE STORAGE

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568

Deep Learning Performance and Cost Evaluation

Technical Note. Improving Random Read Performance Using Micron's SNAP READ Operation. Introduction. TN-2993: SNAP READ Operation.

Calypso Blind Survey 2010

COS 318: Operating Systems. Storage Devices. Jaswinder Pal Singh Computer Science Department Princeton University

D E N A L I S T O R A G E I N T E R F A C E. Laura Caulfield Senior Software Engineer. Arie van der Hoeven Principal Program Manager

SSD Applications in the Enterprise Area

u Covered: l Management of CPU & concurrency l Management of main memory & virtual memory u Currently --- Management of I/O devices

Introducing and Validating SNIA SSS Performance Test Suite Esther Spanjer SMART Modular

COS 318: Operating Systems. Storage Devices. Vivek Pai Computer Science Department Princeton University

SFS: Random Write Considered Harmful in Solid State Drives

INTEL NEXT GENERATION TECHNOLOGY - POWERING NEW PERFORMANCE LEVELS

Identifying Performance Bottlenecks with Real- World Applications and Flash-Based Storage

Exploiting the benefits of native programming access to NVM devices

A Semi Preemptive Garbage Collector for Solid State Drives. Junghee Lee, Youngjae Kim, Galen M. Shipman, Sarp Oral, Feiyi Wang, and Jongman Kim

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568

Database Architecture 2 & Storage. Instructor: Matei Zaharia cs245.stanford.edu

Accelerate Applications Using EqualLogic Arrays with directcache

Storage Adapter Testing Report

Cold Storage: The Road to Enterprise Ilya Kuznetsov YADRO

The Impact of SSD Selection on SQL Server Performance. Solution Brief. Understanding the differences in NVMe and SATA SSD throughput

ECE Enterprise Storage Architecture. Fall 2018

NAND Flash-based Storage. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Design Considerations for Using Flash Memory for Caching

NAND Flash-based Storage. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Deep Learning Performance and Cost Evaluation

ECE Enterprise Storage Architecture. Fall 2016

CSE 451: Operating Systems Spring Module 12 Secondary Storage. Steve Gribble

Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades

Micron Quad-Level Cell Technology Brings Affordable Solid State Storage to More Applications

Readings. Storage Hierarchy III: I/O System. I/O (Disk) Performance. I/O Device Characteristics. often boring, but still quite important

SUPERTALENT ULTRADRIVE LE/ME PERFORMANCE IN APPLE MAC WHITEPAPER

Understanding the Relation between the Performance and Reliability of NAND Flash/SCM Hybrid Solid- State Drive

Lightspeed. Solid. Impressive. Nytro 3000 SAS SSD

IBM DS8870 Release 7.0 Performance Update

Facing an SSS Decision? SNIA Efforts to Evaluate SSS Performance. Ray Lucchesi Silverton Consulting, Inc.

NAND Flash-based Storage. Computer Systems Laboratory Sungkyunkwan University

IBM and HP 6-Gbps SAS RAID Controller Performance

SSD/Flash for Modern Databases. Peter Zaitsev, CEO, Percona November 1, 2014 Highload Moscow,Russia

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University

Building a High IOPS Flash Array: A Software-Defined Approach

2009. October. Semiconductor Business SAMSUNG Electronics

I/O Performance" CS162 C o Operating Systems and Response" 300" Time (ms)" User" I/O" Systems Programming Thread" lle device" 200" Lecture 13 Queue"

Flash In the Data Center

A Prototype Storage Subsystem based on PCM

LATEST INTEL TECHNOLOGIES POWER NEW PERFORMANCE LEVELS ON VMWARE VSAN

Using Transparent Compression to Improve SSD-based I/O Caches

Dongjun Shin Samsung Electronics

Webscale, All Flash, Distributed File Systems. Avraham Meir Elastifile

Falcon: Scaling IO Performance in Multi-SSD Volumes. The George Washington University

[537] Flash. Tyler Harter

HP VMA-series Memory Arrays

Developing Low Latency NVMe Systems for HyperscaleData Centers. Prepared by Engling Yeo Santa Clara, CA Date: 08/04/2017

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010

Storage. CS 3410 Computer System Organization & Programming

SSD/Flash for Modern Databases. Peter Zaitsev, CEO, Percona October 08, 2014 Percona Technical Webinars

Unblinding the OS to Optimize User-Perceived Flash SSD Latency

Improving throughput for small disk requests with proximal I/O

SSD Architecture Considerations for a Spectrum of Enterprise Applications. Alan Fitzgerald, VP and CTO SMART Modular Technologies

Optimizing Fusion iomemory on Red Hat Enterprise Linux 6 for Database Performance Acceleration. Sanjay Rao, Principal Software Engineer

Replacing the FTL with Cooperative Flash Management

Software Defined Storage at the Speed of Flash. PRESENTATION TITLE GOES HERE Carlos Carrero Rajagopal Vaideeswaran Symantec

QuickSpecs. PCIe Solid State Drives for HP Workstations

Managing Array of SSDs When the Storage Device is No Longer the Performance Bottleneck

Virtual Storage Tier and Beyond

UCS Invicta: A New Generation of Storage Performance. Mazen Abou Najm DC Consulting Systems Engineer

Flash-Conscious Cache Population for Enterprise Database Workloads

UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering. Computer Architecture ECE 568

Solid State Drive (SSD) Cache:

QLC Challenges. QLC SSD s Require Deep FTL Tuning Karl Schuh Micron. Flash Memory Summit 2018 Santa Clara, CA 1

Transcription:

Using Flash SSDs as Pi Primary Database Storage Robert Gottstein, Ilia Petrov, Guillermo G. Almeida, Todor Ivanov, Alex Buchmann {lastname}@dvs.tu-darmstadt.de Fachgebiet DVS Ilia Petrov 1

Flash SSDs, X25-E, iodriveduo i D FTL Fachgebiet DVS Ilia Petrov 2

Specification Specification Intel X25-E 64GB, SLC Specification: Savvio 146GB,15k Seq. Read/Write: 250 / 170 MB/s Seq. Read / Write: 160 MB/s Read/Write IOPS (4K): 35 000 / 3 300 Read/Write IOPS: 350 / 300 Latency Read/Write (4K): 0.075/0.085 ms Latency Read/Write: 3.2 / 3.5 ms Price: 650 Price: 180 10x 20x Fachgebiet DVS Ilia Petrov 3

Flash vs Magnetic Storage 10x 20x IoFusion iodrive Duo Seq. Read/Write: 1.5 / 1.4 GB/s Read/Write IOPS (4K): 130 000 / 80 000 Latency Read/Write (4K): 0.025/0.035 ms Price: approx. 6000 > 1000x Dr.-Ing. Ilia Petrov 4

Amdahl s Law Speedup [1] An OLTP database performs IO approx. 60% of the time [Patterson] 10x faster CPUs or 10x faster IO-Subsystem? f= 0,6 S( f,k ) S=1/( (1-f) + f/k ) S=2.2x 2x k= 10 10x faster storage f = 0,4 k= 10 S( f,k fk) S=1.5x 10x faster CPUs Fachgebiet DVS Ilia Petrov 5 [1] Amdahl, Gene. "Validity of the Single Processor Approach to Achieving Large- Scale Computing Capabilities". In Proc. AFIPS Conference pp.483 485. 1967

Amdahl s Revised Balanced System Law [2]: A system needs 8 MIPS/MB/s IO The instruction rate and IO rate workload dependent OLTP, CPI=2.1 Assume 75% random write, 25% random read, 8KB page size, 3.2 GHz CPU CPU= 3.2 GHz HDD SAS 15K RPM CPU= 3.2 GHz Enterpr. SSD Amdahl s Balanced 80 HDD/CPU System Law 12 000 Amdahl s Balanced System Law 2 000 3 SSD/CPU [2] Jim Gray, Prashant Shenoy, "Rules of Thumb in Data Engineering," In Proc., ICDE 2000 06.11.2010 Fachgebiet DVS Ilia Petrov 6

In summary HDD have reached physical limits it Fighting low access density with thousands of HDDs is unreasonable Outdated storage technology Data-Intensive Systems are IO-Bound Data-Intensive systems built around HDD properties Access Gap / Access Density Larger Buffer Sizes Larger Page Sizes Algorithms optimized for streaming access rather than random access SSDs come at the right moment Fachgebiet DVS Ilia Petrov 7

Flash SSD Characteristics Fachgebiet DVS Ilia Petrov 8

Characteristics ti Throughput h asymmetry Random Throughput Better for small block-sizes Random Writes are an issue Very good sequential throughput Still asymmetric Caching Very low latency Command Queuing and internal parallelism Fachgebiet DVS Ilia Petrov 9

Random Throughputh Random Throughput-Very High Asymmetric: Read vs. Write Up to 10x difference Better for small blocksizes: Major weakness of HDD 4K: 35 500 Read 6 000 Write 8K: 23 000 Read 4 800 Write Fachgebiet DVS Ilia Petrov 10

Random Throughput h SSD and HDD Fachgebiet DVS Ilia Petrov 11

Sequential Throughput h Sequential Bandwidth MB/s Asymmetric >= HDD Caching Command Queuing 189 MB/s Write Caching Write Cache OFF Fachgebiet DVS Ilia Petrov 12

Average Access Time / Latency (AVG) AVG. Latency [ms] WC On WC Off Seq. Read 0.053 -- Seq.Write 0.059 0.455 Rand. Read 0.167 -- Rand. Write 0.113 0.435 Max Latency[ms] WC On WC Off Seq. Read 12.29 -- Seq.Write 94.82 100.26 Rand. Read 12.41 -- Rand. Write 175.27 100.68 Fachgebiet DVS Ilia Petrov 13

FTL, Address-Mapping LBA File System Block device interface Block Device Interface Logical Blocks, LBA SAS/SATA2 Pages, (Erase)Blocks, Log records NAND Flash SSD FTL- Flash Translation Layer NAND Flash FTL Memory Con troller Mapping Table(SRAM) L P Fachgebiet DVS Ilia Petrov 14 Block P Log Block Area<3% Background Processes Wear-leveling Garbage collection Metadata synch Log-block merging SATA2/SAS TRIM RAID

Fragmentation and Background Processes Fachgebiet DVS Ilia Petrov 15

Single Drive Fragmentation ti max. 70% full Fragment: 5h write (rand., seq.) Random reads less affected - 11% Seq. writes 18% slower Reason: (+) write cache/write back for small block sizes,(-) garbage collection Worse for larger block sizes Most affected Seq.Read, Rnd.Write Sequential reads 52% slower! Read ahead not possible Better for larger block sizes Random writes 50% slower! Reason: excessive garbage collection Fachgebiet DVS Ilia Petrov 16

Single Drive Fragmentation ti over 90% full Reads less affected Random reads not affected Sequential reads approx 30% slower Writes affected significantly Random writes 75% slower Sequential writes 79% slower SEQUENTIAL, 64K Read Fragmented Non-Fragment. Fragmented Write Non-Fragment. Bandw. [MB/s] 177 255 38 185 Avg. Latency [ms] 9 8 52 11 RANDOM, 4K Read Fragmented Non-Fragment. Fragmented Write Non-Fragment. IOPS 38900 39810 828 3358 Avg. Latency [ms] 0.8 0.8 39 10 Fachgebiet DVS Ilia Petrov 17

Flash Trends [A. v. Bechtolsheim h HPTS 2009] Density doubling each year 1TB in 4 years Costs falling by 50% per year Access times falling by 50% per year 5μs in 4 years Throughput doubling every year Interface moving from SATA to PCI Express Very large-scale I/O looks feasible Fachgebiet DVS Ilia Petrov 18

SSD RAID Storage How do we build large SSD storage? Fachgebiet DVS Ilia Petrov 19

Single SSD vs. SSD RAID Device Seq. Read [MB/s] Seq. Write [MB/s] Rnd. Read [ms] Rnd. Write [ms] Rnd. Read IOPS Rnd. Write IOPS Price [ /GB] Price Read IOPS/ Price Write IOPS/ E.SSD 250 170 0.075 0.085 35 000 3 300 10 56 5.3 RAID0 2xSSD 422 631 0.375 0.458 24 371 2 035 19 13 1.1 What did go wrong? RAID benefits come at a high cost in SSD configurations Random throughput (IOPS) approx. 30% lower Sequential read throughput (MB/s) better than that of a single SSD Sequential write throughput good Entirely due to write caching Fachgebiet DVS Ilia Petrov 20

Scalability Tests Random Load RAID 0 Controller saturated with: SMALL Block size 2 SSDs!!! (even 1) Larger block sizes more SSDs less than 4 Fachgebiet DVS Ilia Petrov 21

Scalability Tests Sequential Tests RAID 0 Write saturated from start Controller Cache Seq.! Scales with writing threads Read Contrl. Cache - ineffective File System = Raw Dev. Fachgebiet DVS Ilia Petrov 22

Hardware/Software RAID Configurations Fachgebiet DVS Ilia Petrov 23

Hardware/Software RAID 4 SSD Total 1 Controller (4 SSDs) 2 Controllers (2 SSD per Controller) RAID0 HW RAID0 SW SimpleVolumes RAID0 SW 2 SSD per Controller RAID0 HW RAID0 SW 2 SSD per Controller RAID0 SimpleVolumes Quantity blocksize Read Write Read Write Read Write Read Write Sequential 256KB 672 397 671 462 1033 762 1031 684 Throughput 512KB 670 398 674 468 1039 760 1030 687 Sequential 256KB 0.743 0.743 0.688 0.711 0.772 0.512 0.531 0.461 Latency 512KB 1.168168 1.382 1.152152 1.254 1.303 0.913 0.877 0.791 Random 4KB 24787 10193 27675 11704 44537 19529 49054 22512 Throughput 8KB 20987 6289 25417 10575 41091 13657 44129 13765 Random 4KB 0.353 0.204 0.277 0.120 0.282 0.114 0.277 0.109 Latency 8KB 0.429 0.220 0.365 0.196 0.334 0.161 0.332 0.138 Fachgebiet DVS Ilia Petrov 24 Two Controllers double the Performance! Simple Volumes better random throughput! HW RAID0 better sequential throughput! Host Based Storage!

Thank You! http://www.dvs.tu-darmstadt.de/research/flashydb/ Fachgebiet DVS Ilia Petrov 25