Getting Real: Lessons in Transitioning Research Simulations into Hardware Systems
Mohit Saxena, Yiying Zhang, Michael Swift, Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau
Flash Storage Stack Research
  SSD Design: flash management
  OS & Applications: file systems, block cache, K-V stores
  Device Interface: read, write, trim
How to evaluate new SSD designs?
  1. Modify SSD: replace firmware/FTL. Are you a device vendor?
  2. FPGA Prototype: flexible and fast, but hard and time-consuming
  3. Simulator or Emulator: replay block traces, implement device models
Simulator/Emulator Limitations
  Real hardware performance: SSDs are complex (banks, packages, planes)
  Interaction with software stack: SSDs are sensitive to OS & app behavior
  Trace replay: unable to model request timing dependencies & new commands

  FAST year   Unmodified SSDs   Simulator   Hardware
  2011        2                 5           0
  2012        2                 7           0
  2013        2                 3           2
Approach
  4. SSD Prototyping Kit: OpenSSD Hardware Platform
In this Talk
  New SSD and Interface Designs: from Research Simulation to Hardware Prototype
    FlashTier's Solid-State Cache (SSC) [EuroSys '12]
    Nameless Write SSD [FAST '12]
  Challenges, Experiences and Lessons
    General insights applicable to other SSD designs and interfaces
Talk Outline
  Introduction
  Background
    OpenSSD Hardware Platform
    SSC and Nameless Write SSD Interfaces
  Prototyping Experiences
  Evaluation
  Conclusions
Flash SSD 101
  Remap in-place writes
  Address translation
  Log-structuring
  Garbage collection
  [Figure labels: OS, write X, FTL, Pages in Flash, Erase Block]
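The four FTL duties listed on this slide can be sketched in a few lines. This is a minimal toy model, not the OpenSSD reference FTL: the class name `ToyFTL`, the tiny geometry, and the greedy victim policy are all assumptions chosen for illustration.

```python
class ToyFTL:
    """Page-mapped, log-structured FTL over a tiny simulated flash (illustrative)."""

    def __init__(self, num_blocks=4, pages_per_block=4):
        self.num_blocks = num_blocks
        self.pages_per_block = pages_per_block
        # flash[block][page] is (lba, data), or None while the page is erased
        self.flash = [[None] * pages_per_block for _ in range(num_blocks)]
        self.l2p = {}                       # address translation: lba -> (block, page)
        self.valid = set()                  # physical pages holding live data
        self.free_blocks = list(range(1, num_blocks))
        self.active, self.next_page = 0, 0  # head of the log
        self.erases = 0

    def _append(self, lba, data):
        """Log-structuring: always program the next erased page."""
        if self.next_page == self.pages_per_block:
            self.active, self.next_page = self.free_blocks.pop(0), 0
        loc = (self.active, self.next_page)
        self.flash[self.active][self.next_page] = (lba, data)
        self.next_page += 1
        self.valid.discard(self.l2p.get(lba))  # the old copy becomes garbage
        self.l2p[lba] = loc                    # remap the in-place write
        self.valid.add(loc)

    def _live_pages(self, block):
        return [self.flash[block][p] for p in range(self.pages_per_block)
                if (block, p) in self.valid]

    def _garbage_collect(self):
        """Copy live pages out of the emptiest full block, then erase it."""
        full = [b for b in range(self.num_blocks) if b not in self.free_blocks]
        victim = min(full, key=lambda b: len(self._live_pages(b)))
        live = self._live_pages(victim)
        if len(live) == self.pages_per_block:
            raise RuntimeError("no reclaimable space")
        for lba, data in live:
            self._append(lba, data)
        self.flash[victim] = [None] * self.pages_per_block  # block erase
        self.free_blocks.append(victim)
        self.erases += 1

    def write(self, lba, data):
        log_full = self.next_page == self.pages_per_block
        if log_full and len(self.free_blocks) <= 1:
            self._garbage_collect()   # keep one erased block in reserve
        self._append(lba, data)

    def read(self, lba):
        block, page = self.l2p[lba]
        return self.flash[block][page][1]
```

Overwriting the same logical address never touches the old physical page: the write lands at the head of the log, and the stale page is reclaimed later when its erase block is collected. The sketch assumes the host uses fewer logical pages than the flash holds, so a victim with reclaimable space always exists.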
OpenSSD Hardware Platform
  Hardware: commodity Indilinx Barefoot ARM SSD controller, 64 MB DRAM, 128 GB NAND flash
  Software: reference open-source FTL
  Interface: standard SATA read & write, UART serial debugging
Solid-State Disk (SSD)
  SSD Block Interface: read, write, trim
  Emulate disk: persistent block store
  [Figure labels: Application, File System, Block Layer, SSD]
Solid-State Cache (SSC)
  SSC Caching Interface: read, clean, exists, write-clean/dirty, not-present errors
    Distinguish clean from dirty data
    Return not-present read errors for evicted blocks
  Fast and reliable SSC
  [Figure: SSD stack (Application, File System, Block Layer; read, write, trim) beside SSC stack (Application, File System, Block Layer, Cache Manager)]
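A minimal sketch of the caching interface named on this slide, assuming a toy in-memory device. The class `ToySSC`, the exception `NotPresentError`, the oldest-clean-first eviction policy, and the reading of `exists` as a simple presence check are illustrative assumptions; the slides do not define them.

```python
class NotPresentError(Exception):
    """Raised by read() when the cache does not hold the requested block."""


class ToySSC:
    """Toy solid-state cache that tracks clean vs. dirty blocks (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = {}   # lba -> [data, dirty]; dict order doubles as age

    def _make_room(self, lba):
        if lba in self.blocks or len(self.blocks) < self.capacity:
            return
        # Only clean blocks may be dropped: a copy exists on the backing disk.
        victim = next((l for l, (_, dirty) in self.blocks.items() if not dirty), None)
        if victim is None:
            raise RuntimeError("cache holds only dirty blocks; clean some first")
        del self.blocks[victim]   # silent eviction: the host is not notified

    def write_clean(self, lba, data):
        self._make_room(lba)
        self.blocks[lba] = [data, False]

    def write_dirty(self, lba, data):
        self._make_room(lba)
        self.blocks[lba] = [data, True]

    def clean(self, lba):
        """Mark a dirty block clean once the host has written it back."""
        if lba in self.blocks:
            self.blocks[lba][1] = False

    def exists(self, lba):
        return lba in self.blocks

    def read(self, lba):
        if lba not in self.blocks:
            raise NotPresentError(lba)   # evicted or never cached
        return self.blocks[lba][0]
```

Because the device knows which blocks are clean, it can drop them on its own and report the loss later as a not-present read error, which the cache manager treats as an ordinary miss.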
Nameless-Write (NW-SSD)
  Nameless-Write SSD interface: virtual-write/read, nameless-write/read, callbacks, phy-address
    nameless-write/read: physical address
    migration callbacks: relocated blocks
  Cheap and fast SSD
  [Figure: three stacks side by side: SSD (read, write, trim), SSC (read, clean, exists, write-clean/dirty, not-present errors) and NW-SSD]
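The nameless-write idea can be sketched as follows: the device picks the physical location and returns it, and a migration callback tells the file system when a block has moved. All names here (`ToyNamelessSSD`, `ToyFileSystem`, `migrate`) are hypothetical, and migration is triggered by an explicit call, whereas a real device would relocate blocks on its own schedule.

```python
class ToyNamelessSSD:
    """Toy device that chooses where data lives and reports the address."""

    def __init__(self, num_pages, on_migrate):
        self.pages = [None] * num_pages
        self.on_migrate = on_migrate   # callback(old_phys, new_phys)

    def _free_page(self):
        return self.pages.index(None)  # raises ValueError when the device is full

    def nameless_write(self, data):
        phys = self._free_page()
        self.pages[phys] = data
        return phys                    # the physical address is the return value

    def physical_read(self, phys):
        return self.pages[phys]

    def migrate(self, old_phys):
        """Relocate one page and notify the host through the callback."""
        new_phys = self._free_page()
        self.pages[new_phys] = self.pages[old_phys]
        self.pages[old_phys] = None
        self.on_migrate(old_phys, new_phys)
        return new_phys


class ToyFileSystem:
    """Toy host that stores physical addresses instead of logical ones."""

    def __init__(self, num_pages):
        self.location = {}             # file name -> physical address
        self.ssd = ToyNamelessSSD(num_pages, self._relocated)

    def _relocated(self, old_phys, new_phys):
        for name, phys in self.location.items():
            if phys == old_phys:
                self.location[name] = new_phys

    def store(self, name, data):
        self.location[name] = self.ssd.nameless_write(data)

    def load(self, name):
        return self.ssd.physical_read(self.location[name])
```

The device keeps no logical-to-physical table for these blocks in this sketch; the mapping lives in the file system, which is why the callback is needed whenever the device moves data.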
Summary of Interface Changes
  New interface extensions
  SSC
    Forward commands: write-dirty, write-clean, exists, clean
    Return values / device responses: not-present read errors
  Nameless-Write SSD
    Forward commands: nameless-write, physical-read, virtual-write, virtual-read
    Return values / device responses: physical addresses, migration-callbacks
Talk Outline
  Introduction
  Background
  Prototyping Experiences
    Challenges
    Solutions
    Lessons
  Evaluation
  Conclusions
Prototyping Challenges for SSC and NW-SSD
  1. New Forward Commands
  2. New Device Responses
  3. Real Hardware Constraints
1. New Forward Commands
  Assumption: file system & cache manager directly interface with device
  Simulated interface (function calls)
    SSC commands: write-dirty/clean, evict, clean, exists
    NW-SSD commands: nw-write, physical-read, virtual-write, virtual-read
  [Figure labels: File System or Application, Cache Manager, Device Interface, Simulated Device, SSD]
1. New Forward Commands
  Assumption: file system & cache manager directly interface with device
  Reality: several intermediate OS, firmware & hardware layers
    File System or Application, Cache Manager
    Device Mapper
    I/O Scheduler
    Low-level drivers: SCSI Layer, ATA Layer, AHCI Driver
    Device Interface
    Hardware Command Queue & Firmware
    SSD
  SSC commands: write-dirty/clean, evict, clean, exists
  NW-SSD commands: nw-write, physical-read, virtual-write, virtual-read
Implementing Forward Commands
  Solutions
    Disallow merging & re-ordering
    Add new command sub-types: <lba, length+sub-type>
    Mask command sub-type in firmware for hardware queues
  Command structures in the low-level drivers: scsi_cmd (SCSI Layer), ata_qc (ATA Layer), ahci_fis (AHCI Driver)
  SSC commands: write-dirty/clean, evict, clean, exists
  NW-SSD commands: nw-write, physical-read, virtual-write, virtual-read
  [Figure labels: File System or Application, Cache Manager, Device Mapper, I/O Scheduler, Device Interface, Hardware Command Queue & Firmware, SSD]
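One way to picture the `<lba, length+sub-type>` encoding is to borrow the upper bits of the length field for the sub-type and mask them off again in firmware. The bit widths, the sub-type numbering, and the function names below are hypothetical; the slides do not give the actual layout used in the prototype.

```python
LENGTH_BITS = 12                      # hypothetical split of the length field
LENGTH_MASK = (1 << LENGTH_BITS) - 1

# Hypothetical numbering; index 0 is an ordinary read or write.
SUBTYPES = ["plain", "write-dirty", "write-clean", "evict", "clean", "exists"]


def encode(lba, length, subtype):
    """Host side: fold the sub-type into the upper bits of the length field."""
    if not 0 < length <= LENGTH_MASK:
        raise ValueError("length does not fit in the low bits")
    return lba, (SUBTYPES.index(subtype) << LENGTH_BITS) | length


def decode(lba, field):
    """Firmware side: mask the sub-type off to recover the real length."""
    return lba, field & LENGTH_MASK, SUBTYPES[field >> LENGTH_BITS]
```

For example, `encode(100, 8, "write-dirty")` yields `(100, 4104)`, and `decode(100, 4104)` returns `(100, 8, "write-dirty")`. A request carrying such a field still looks like a standard read or write to the layers in between, which only see a larger length value.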
Lesson 1: Designing New Forward Commands
  Research lesson: consider all layers of OS
    I/O scheduler: merging and re-ordering
    Storage device drivers: adding sub-types
  Research lesson: consider complete SSD ecosystem
    Firmware: encoding sub-types
    Hardware: accelerating new command queues
2. New Device Responses
  Assumption: device can directly send new responses to cache manager and file system
  Simulated responses (function upcalls)
    SSC: not-present read errors
    NW-SSD: phy-addresses, migration callbacks
  [Figure labels: File System or Application, Cache Manager, Device Responses, Simulated Device, SSD]
2. New Device Responses
  Reality
    New errors do not propagate up beyond device drivers
    Write responses cannot encode physical addresses
    Migration callbacks do not fit standard protocols
  ATA responses and protocol: 8 bits + reserved
  Responses affected: not-present read errors (SSC); phy-addresses, migration callbacks (NW-SSD)
  [Figure labels: File System or Application, Cache Manager, Device Mapper, I/O Scheduler, low-level drivers (SCSI Layer, ATA Layer, AHCI Driver), SSD]
Reverse Path Errors
  Solution: not-present errors overloaded on read response data from device to OS
  The SSD returns a magic number in the read response data
  NW-SSD responses (phy-addresses, migration callbacks) are handled separately
  [Figure labels: File System or Application, Cache Manager, Device Mapper, I/O Scheduler, low-level drivers (SCSI Layer, ATA Layer, AHCI Driver), SSD]
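A minimal sketch of overloading a not-present error on the read data path, assuming a fixed sentinel pattern. The `MAGIC` value, the block size, and both function names are hypothetical; the slide only indicates that a magic number travels in the read response data.

```python
from typing import Dict, Optional

BLOCK_SIZE = 4096
# Hypothetical sentinel: a whole block filled with a fixed pattern.
MAGIC = b"\xde\xad\xbe\xef" * (BLOCK_SIZE // 4)


def device_read(store: Dict[int, bytes], lba: int) -> bytes:
    """Device side: a miss is reported as a successful read of the magic block."""
    return store.get(lba, MAGIC)


def cache_manager_read(store: Dict[int, bytes], lba: int) -> Optional[bytes]:
    """Host side: translate the magic block back into a cache miss (None)."""
    buf = device_read(store, lba)
    return None if buf == MAGIC else buf
```

Since the read completes without an error status, nothing in the driver stack needs to understand the new condition. The trade-off in this sketch is that a cached block whose real contents equal the sentinel would be misread as a miss.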
Split-FTL Design
  Solution: Kernel and Device FTLs
    Forward commands transformed to raw flash commands from Kernel-FTL
    Physical addresses & migration callbacks returned from Kernel-FTL to file system
    Garbage collection: copyback from/to host
  Interfaces
    File system and Kernel-FTL: nameless-read/write, virtual-read/write; phy-addresses, migration callbacks
    Kernel-FTL and Device FTL: flash page read/write, block erase, copyback
  [Figure labels: File System or Application, Block Layer, I/O Scheduler, Kernel-FTL, SCSI, ATA, AHCI, Device Interface, Device FTL, Raw SSD]
Lesson 2: Designing New Device Responses
  Research lesson: consider all OS layers
    Storage device drivers: handling of frequent benign errors
    Device & file system: race conditions for callbacks
  Prototyping lesson: simplicity and correctness
    Kernel-FTL: simpler block layer OS interface
    Device-FTL: enforce correct erase-before-overwrite
Talk Outline
  Introduction
  Background
  Prototyping Experiences
  Evaluation
    Validate Performance Claims
  Conclusions
Methodology
  Systems for comparison
    SSD: Facebook FlashCache using SSD with GC
    SSC: modified Facebook FlashCache using SSC with silent eviction
    Nameless-Write SSD interface performance
  Workload: filebench with read/write/fsync

  Workload     Ratio of operations
  fileserver   1:2 reads/writes
  webserver    10:1 reads/writes
  varmail      1:1:1 reads/writes/fsync
SSC Prototype Performance
  SSC Claim: 168% better perf. than common hybrid FTL SSD
  SSC Prototype: 52% better perf. than faster page-map FTL SSD
  [Figure: bar chart of percent performance (relative to no cache) for No Cache, SSD and SSC on fileserver (write intensive), webserver (read intensive) and varmail (fsync intensive); plot annotations: +52%, +27%]
Nameless-Write SSD Performance
  NW-SSD Claim: perf. close to page-map FTL
  NW-SSD Prototype: perf. almost close to page-map FTL
  Device memory: page-map FTL 53 MB, NW-SSD 803 KB
  [Figure: bar chart of I/O operations/s for SSD and NW-SSD on Seq. Write, Rand. Write, Seq. Read and Rand. Read; fio benchmark, 4 KB request size; plot annotations: merging optimization, -60%]
Conclusions
  OpenSSD is a valuable tool for evaluating new SSD designs
  Prototyping and research lessons applicable to other SSD designs
  First high-performance open-source FTL: http://www.cs.wisc.edu/~msaxena/new/ftl.html
  OpenSSD prototype on display at poster session today
Thanks!