Non-Volatile Memory Through Customized Key-Value Stores

Similar documents
An Analysis of Persistent Memory Use with WHISPER

An Analysis of Persistent Memory Use with WHISPER

Accessing NVM Locally and over RDMA Challenges and Opportunities

Soft Updates Made Simple and Fast on Non-volatile Memory

BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory

Aerie: Flexible File-System Interfaces to Storage-Class Memory [Eurosys 2014] Operating System Design Yongju Song

Windows Support for PM. Tom Talpey, Microsoft

Ext3/4 file systems. Don Porter CSE 506

Architectural Support for Atomic Durability in Non-Volatile Memory

Windows Support for PM. Tom Talpey, Microsoft

Using NVDIMM under KVM. Applications of persistent memory in virtualization

Closing the Performance Gap Between Volatile and Persistent K-V Stores

Blurred Persistence in Transactional Persistent Memory

Mnemosyne Lightweight Persistent Memory

Rethink the Sync 황인중, 강윤지, 곽현호. Embedded Software Lab. Embedded Software Lab.

CSE506: Operating Systems CSE 506: Operating Systems

Fine-grained Metadata Journaling on NVM

January 28-29, 2014 San Jose

Loose-Ordering Consistency for Persistent Memory

Dalí: A Periodically Persistent Hash Map

Strata: A Cross Media File System. Youngjin Kwon, Henrique Fingler, Tyler Hunt, Simon Peter, Emmett Witchel, Thomas Anderson

Topics. File Buffer Cache for Performance. What to Cache? COS 318: Operating Systems. File Performance and Reliability

ò Very reliable, best-of-breed traditional file system design ò Much like the JOS file system you are building now

Recoverability. Kathleen Durant PhD CS3200

Arrakis: The Operating System is the Control Plane

WORT: Write Optimal Radix Tree for Persistent Memory Storage Systems

SAY-Go: Towards Transparent and Seamless Storage-As-You-Go with Persistent Memory

REMOTE PERSISTENT MEMORY ACCESS WORKLOAD SCENARIOS AND RDMA SEMANTICS

PM Support in Linux and Windows. Dr. Stephen Bates, CTO, Eideticom Neal Christiansen, Principal Development Lead, Microsoft

High Performance Transactions in Deuteronomy

FILE SYSTEMS, PART 2. CS124 Operating Systems Fall , Lecture 24

Caching and reliability

SoftWrAP: A Lightweight Framework for Transactional Support of Storage Class Memory

Update on Windows Persistent Memory Support Neal Christiansen Microsoft

ext3 Journaling File System

File System Implementation. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Efficient Memory Mapped File I/O for In-Memory File Systems. Jungsik Choi, Jiwon Kim, Hwansoo Han

DHTM: Durable Hardware Transactional Memory

NVMFS: A New File System Designed Specifically to Take Advantage of Nonvolatile Memory

RDMA Requirements for High Availability in the NVM Programming Model

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet

File Systems: Consistency Issues

The SNIA NVM Programming Model. #OFADevWorkshop

PASTE: A Network Programming Interface for Non-Volatile Main Memory

Oracle Database 12c: JMS Sharded Queues

Windows Persistent Memory Support

What You can Do with NVDIMMs. Rob Peglar President, Advanced Computation and Storage LLC

Accelerating Microsoft SQL Server Performance With NVDIMM-N on Dell EMC PowerEdge R740

Falcon: Scaling IO Performance in Multi-SSD Volumes. The George Washington University

Hardware Undo+Redo Logging. Matheus Ogleari Ethan Miller Jishen Zhao CRSS Retreat 2018 May 16, 2018

System Software for Persistent Memory

File System Implementation

CS3600 SYSTEMS AND NETWORKS

New Abstractions for Fast Non-Volatile Storage

Ben Walker Data Center Group Intel Corporation

PebblesDB: Building Key-Value Stores using Fragmented Log Structured Merge Trees

SNIA NVM Programming Model Workgroup Update. #OFADevWorkshop

Journaling. CS 161: Lecture 14 4/4/17

File system internals Tanenbaum, Chapter 4. COMP3231 Operating Systems

Advanced file systems: LFS and Soft Updates. Ken Birman (based on slides by Ben Atkin)

NV-Tree Reducing Consistency Cost for NVM-based Single Level Systems

NVMe SSDs with Persistent Memory Regions

TxFS: Leveraging File-System Crash Consistency to Provide ACID Transactions

Block Device Scheduling. Don Porter CSE 506

Block Device Scheduling

Lecture 21: Logging Schemes /645 Database Systems (Fall 2017) Carnegie Mellon University Prof. Andy Pavlo

JOURNALING techniques have been widely used in modern

The Tux3 File System

NVthreads: Practical Persistence for Multi-threaded Applications

COS 318: Operating Systems. NSF, Snapshot, Dedup and Review

APIs for Persistent Memory Programming

TRANSACTIONAL FLASH CARSTEN WEINHOLD. Vijayan Prabhakaran, Thomas L. Rodeheffer, Lidong Zhou

Dan Noé University of New Hampshire / VeloBit

NPTEL Course Jan K. Gopinath Indian Institute of Science

! Design constraints. " Component failures are the norm. " Files are huge by traditional standards. ! POSIX-like

A New Key-value Data Store For Heterogeneous Storage Architecture Intel APAC R&D Ltd.

File System Implementation

Deukyeon Hwang UNIST. Wook-Hee Kim UNIST. Beomseok Nam UNIST. Hanyang Univ.

Shared snapshots. 1 Abstract. 2 Introduction. Mikulas Patocka Red Hat Czech, s.r.o. Purkynova , Brno Czech Republic

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

Distributed caching for cloud computing

15: Filesystem Examples: Ext3, NTFS, The Future. Mark Handley. Linux Ext3 Filesystem

COS 318: Operating Systems. Journaling, NFS and WAFL

Fine-grained Metadata Journaling on NVM

CS 318 Principles of Operating Systems

Bill Bridge. Oracle Software Architect NVM support for C Applications

Linux Kernel Abstractions for Open-Channel SSDs

Log-Free Concurrent Data Structures

Performance Benefits of Running RocksDB on Samsung NVMe SSDs

Exploiting the benefits of native programming access to NVM devices

PASTE: Fast End System Networking with netmap

Chapter 11: Implementing File Systems

Flash Memory Summit Persistent Memory - NVDIMMs

CSE 120: Principles of Operating Systems. Lecture 10. File Systems. February 22, Prof. Joe Pasquale

CSE 124: Networked Services Lecture-17

Remote Persistent Memory With Nothing But Net Tom Talpey Microsoft

Flavors of Memory supported by Linux, their use and benefit. Christoph Lameter, Ph.D,

CSE 120: Principles of Operating Systems. Lecture 10. File Systems. November 6, Prof. Joe Pasquale

Topics. " Start using a write-ahead log on disk " Log all updates Commit

Transcription:

Non-Volatile Memory Through Customized Key-Value Stores Leonardo Mármol 1 Jorge Guerra 2 Marcos K. Aguilera 2 1 Florida International University 2 VMware L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 1 / 17

Characteristics of NVM Non-volatile Memory survives power cycles No need to restore from slow disks or flash High density Low latency Fine granularity updates Operates on individual words Access through load and store instructions L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 2 / 17

NVM Challenges Non-persistent caching Out-of-order flushes write-back caches Torn writes Updates bigger than 8 bytes are not atomic Complex interfaces flushing cache lines, using memory fences, etc. L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 3 / 17

Approaches to use NVM NVM Low-Latency High-Density Byte- Addressable Persistent Storage Memory Block Dev. Namespace Filesystem Transactions Sharing Pointers L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 4 / 17

Application Specific Solution We argue for consuming NVM through a transactional key-value store. Flexible Simple Performant L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 5 / 17

Case Study: VMware R Virtual San L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 6 / 17

metradb: Specialized KV Store for VSAN Organizes objects in Containers Provides a flat namespace for Containers Provides transactional update containers Only one active transaction per container Transactions do not expand to multiple containers Provides KV-Store like interface L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 7 / 17

metradb API Operation open(name, flags) remove(name) close(h) put(h, k, buf, len) get(h, k, buf, len) delete(h, k) commit(h) abort(h) Description open/create container, get handle remove container close a handle put key-value pair get key-value pair delete key-value pair commit transaction abort transaction L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 8 / 17

Transactions: How to do them? Undo Logging Update in-place Adds latency to critical path No easy way to batch and flush (poor cache locality) Data can be read from its original location Easy to implement Redo Logging Updates are buffered and applied at commit Batch flushes and sync (better cache locality) No latency added to the critical path Data may need to be read from the log Implementation is more complicated L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 9 / 17

Shadow Bitmaps: Handling Allocations L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 10 / 17

Shadow Bitmaps: Handling Allocations L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 10 / 17

Shadow Bitmaps: Handling Allocations L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 10 / 17

Shadow Bitmaps: Handling Allocations L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 10 / 17

Shadow Bitmaps: Handling Allocations L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 10 / 17

Shadow Bitmaps: Handling Allocations L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 10 / 17

Shadow Bitmaps: Handling Allocations L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 10 / 17

Implementing Transactions Redo logging Out-of-place updates Shadow data structures Idempotent commits Volatile metadata can be reconstructed from the logs Implicit start transaction Move the state of the KV Store from one consistent state to the next L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 11 / 17

Indexing: Which data structure to use? B+ Tree Higher latency for average operations Higher write amplification Predictable performance More difficult to implement Maintain key order Hash Table Low latency for average operation Lower write amplification Less predictable performance Easy to implement Does not maintain key order L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 12 / 17

Experimental Setup metradb is a user space library for GNU/Linux Linux Kernel v4.4 24 GB of RAM Intel XeonE5-2440 v2 1.90GHz CPU 8 cores each with 2 hyper-threads NVM was simulated with memory mapped files EXT4 with DAX support L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 13 / 17

Comparison with NVML Avg Latency (µs) 0.5 0.4 0.3 0.2 0.1 0.0 20 15 10 5 0 25 20 15 10 5 0 Get Put Delete 1.45 1.50 1.44 metradb ctree btree 33.6 59.8 58.4 rbtree htbl_atomic htbl_tx 2.2 10x Lower is better 6.6-15x 12-50x L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 14 / 17

Comparison with NVML Avg Latency (µs) 0.5 0.4 0.3 0.2 0.1 0.0 20 15 10 5 0 25 20 15 10 5 0 Get Put Delete 1.45 1.50 1.44 metradb ctree btree 33.6 59.8 58.4 rbtree htbl_atomic htbl_tx 2.2 10x Lower is better 6.6-15x 12-50x L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 14 / 17

Comparison with NVML Avg Latency (µs) 0.5 0.4 0.3 0.2 0.1 0.0 20 15 10 5 0 25 20 15 10 5 0 Get Put Delete 1.45 1.50 1.44 metradb ctree btree 33.6 59.8 58.4 rbtree htbl_atomic htbl_tx 2.2 10x Lower is better 6.6-15x 12-50x L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 14 / 17

Comparison with NVML Avg Latency (µs) 0.5 0.4 0.3 0.2 0.1 0.0 20 15 10 5 0 25 20 15 10 5 0 Get Put Delete 1.45 1.50 1.44 metradb ctree btree 33.6 59.8 58.4 rbtree htbl_atomic htbl_tx 2.2 10x Lower is better 6.6-15x 12-50x L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 14 / 17

Comparison with NVML Avg Latency (µs) 0.5 0.4 0.3 0.2 0.1 0.0 20 15 10 5 0 25 20 15 10 5 0 Get Put Delete 1.45 1.50 1.44 metradb ctree btree 33.6 59.8 58.4 rbtree htbl_atomic htbl_tx 2.2 10x Lower is better 6.6-15x 12-50x L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 14 / 17

Comparison with NVML Avg Latency (µs) 0.5 0.4 0.3 0.2 0.1 0.0 20 15 10 5 0 25 20 15 10 5 0 Get Put Delete 1.45 1.50 1.44 metradb ctree btree 33.6 59.8 58.4 rbtree htbl_atomic htbl_tx 2.2 10x Lower is better 6.6-15x 12-50x L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 14 / 17

Throughput Scalability of metradb Relative Througput (Ops) 16 14 12 10 8 6 4 2 Ideal Get Put Delete Higher is better 0 0 2 4 6 8 10 12 14 16 Number of Containers / Threads Number of cores L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 15 / 17

Throughput Scalability of metradb Relative Througput (Ops) 16 14 12 10 8 6 4 2 Ideal Get Put Delete Higher is better 0 0 2 4 6 8 10 12 14 16 Number of Containers / Threads Number of cores L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 15 / 17

Summary We propose application to consume NVM through a middle layer For our application a key-value interface was sufficient This approach allows simplicity, easy adoptions of different NVM technologies, and fast development About 2.3K LOC Because our solution was tailored to our application, we achieved higher performance than more general solutions L. Mármol, J. Guerra, M. K. Aguilera (FIU and VMware) NVM Through Customized KV-Stores 16 / 17

Thank you! Leonardo Mármol <marmol@cs.fiu.edu>