FAWN: A Fast Array of Wimpy Nodes

FAWN: A Fast Array of Wimpy Nodes
David G. Andersen, Jason Franklin, Michael Kaminsky*, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan
Carnegie Mellon University, *Intel Labs
SOSP '09

Outline
- Introduction
- Problems
- Designs: FAWN-KV, FAWN-DS
- Evaluation
- Related Work
- Conclusions
- Acknowledgments

Introduction
Large-scale data-intensive applications are growing in both size and importance. Common characteristics:
- I/O intensive, requiring random access over large datasets;
- massively parallel, with thousands of concurrent, mostly independent operations;
- high load, requiring large clusters to support it;
- typically small stored objects.

Problems
- Small-object, random-access workloads are ill-served by conventional disk-based clusters.
- DRAM-based clusters are expensive and consume a surprising amount of power.
[Slide diagram: FAWN pairs Flash with wimpy nodes to target both Performance and Energy.]

What is FAWN?
- Hardware: wimpy nodes built from embedded CPUs with limited DRAM, using flash as the storage medium.
- Software: FAWN-KV, a key-value system that manages thousands of FAWN nodes efficiently.

Why FAWN?
- The CPU-I/O gap keeps increasing: wimpy processors are selected to reduce I/O-induced idle cycles.
- CPU power consumption grows super-linearly with speed.
- Dynamic power scaling on traditional systems is surprisingly inefficient.

FAWN-KV Architecture-I
- Back-ends: each is responsible for serving a particular key range.
- Front-ends: maintain the membership list and forward requests to the owning back-end node.
- One front-end manages n back-ends (front-end : back-end = 1 : n), with back-ends organized on a consistent-hashing ring.
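To make the ring concrete, here is a minimal consistent-hashing lookup. This is our sketch, not code from the paper: the VNode type, ring_lookup, and the 64-bit IDs are illustrative stand-ins (FAWN-KV places virtual nodes on a 160-bit ring).

```c
/* Minimal consistent-hashing lookup over a sorted array of virtual-node
 * IDs. All names and the 64-bit ID width are illustrative. */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t vid;      /* position of the virtual node on the ring      */
    int      backend;  /* physical back-end that owns this virtual node */
} VNode;

/* Return the back-end owning `keyhash`: the first virtual node at or
 * clockwise after the key's position, wrapping around the ring. */
static int ring_lookup(const VNode *ring, size_t n, uint64_t keyhash) {
    for (size_t i = 0; i < n; i++)
        if (keyhash <= ring[i].vid)
            return ring[i].backend;
    return ring[0].backend;   /* wrap around */
}

int main(void) {
    VNode ring[] = { {100, 0}, {200, 1}, {300, 2} };   /* sorted by vid */
    printf("key 150 -> back-end %d\n", ring_lookup(ring, 3, 150));  /* 1 */
    printf("key 350 -> back-end %d\n", ring_lookup(ring, 3, 350));  /* 0 */
    return 0;
}
```

At scale, each physical node owns several virtual IDs (the vnodes of the later slides), and the linear scan would be replaced by binary search over the sorted ring.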

FAWN-KV Architecture-II
[Slide diagram: clients reach a front-end through a switch; the front-end manages back-ends and routes requests; each back-end runs FAWN-DS.]
Question: what happens if the front-end a client contacts does not manage the back-end that owns the requested key?

FAWN-KV Architecture-III
[Slide diagram: a map table and a second front-end; the request is forwarded to the correct front-end.]
Two options: (1) make clients aware of the key-to-front-end mapping; (2) let front-ends cache values.

FAWN-KV Architecture-IV
Replication and consistency: chain replication provides strong consistency.
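A toy model of the chain-replication flow, assuming a fixed chain and a tiny integer key space (all names hypothetical): writes are applied at the head and propagated in order down to the tail, and reads are served only by the tail, so a client can never observe a value that has not reached every replica.

```c
/* Toy chain replication: head applies a write, propagation runs in chain
 * order, and the tail alone answers reads. */
#include <stdio.h>

#define CHAIN_LEN 3
#define NVALS     16

static int store[CHAIN_LEN][NVALS];   /* one tiny KV store per replica */

/* Write at the head (r == 0), then propagate down the chain in order;
 * in the real system the ack flows back once the tail applies it. */
static void chain_put(int key, int val) {
    for (int r = 0; r < CHAIN_LEN; r++)
        store[r][key] = val;
}

/* Reads go to the tail only. */
static int chain_get(int key) {
    return store[CHAIN_LEN - 1][key];
}

int main(void) {
    chain_put(4, 42);
    printf("get(4) from tail = %d\n", chain_get(4));
    return 0;
}
```

In FAWN-KV the chain for a key range is the sequence of its replicas along the ring; serving gets from the tail is exactly what makes the scheme strongly consistent.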

FAWN-KV Architecture-V
Joins and leaves. On a join:
- the key range is split;
- data is transmitted so the new vnode gets a copy of its key range;
- the front-end is updated so the new vnode becomes valid for requests;
- the space of the vnode that dropped off the end of the chain is freed.

FAWN-KV Architecture-VI
- Phase 1: datastore pre-copy. E1 sends C1 a copy of the datastore log file.
- Phase 2: chain insertion, log flush and play-forward. Each node's neighbor state is updated to add C1 to the chain, and any in-flight updates sent after phase 1 completed are flushed to C1.

FAWN-DS-I
FAWN-DS is a log-structured key-value store. An in-DRAM hash table maps keys to offsets in an append-only Data Log on flash.
[Slide diagram: the hash table has 2^i buckets, indexed by i bits of the key hash; each entry holds a 15-bit key fragment, a valid bit, and an offset into the Data Log. Each log entry stores the full 160-bit key, the data length, and the data; inserted values are appended to the log.]
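As a minimal sketch of that layout (field names are ours; the packed attribute assumes GCC/Clang), each 6-byte index entry packs the valid bit and the 15-bit key fragment into 16 bits, followed by a 4-byte offset into the log:

```c
#include <stdint.h>

/* One FAWN-DS hash-index entry: 6 bytes of DRAM per stored key. */
typedef struct {
    uint16_t valid_and_keyfrag; /* bit 15: valid; bits 14..0: key fragment */
    uint32_t log_offset;        /* byte offset of the record in the Data Log */
} __attribute__((packed)) HashEntry;

#define ENTRY_VALID(e)   (((e).valid_and_keyfrag >> 15) & 1u)
#define ENTRY_KEYFRAG(e) ((e).valid_and_keyfrag & 0x7FFFu)
```

Because only a 15-bit fragment is kept in DRAM, the full 160-bit key stored in the log entry must be compared after the flash read to rule out false matches; that is the price of keeping the index small enough to fit in wimpy-node DRAM.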

FAWN-DS-II
Back-end interface:
- Get(key, key_len, &data);
- Delete(key, key_len);
- Insert(key, key_len, data, length).
The key step in all three is finding the correct bucket for the key in the hash index: how do we map a 160-bit key space (2^160 keys) onto 2^i buckets?

FAWN-DS-III
Collision handling: each bucket has a conflict chain of depth 8, and three different hash functions h1(key), h2(key), h3(key) are tried in turn. A sketch of the resulting lookup path follows.
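Here is a hedged C sketch of that lookup path. The mixer function, the choice of INDEX_BITS, and the log_key_matches stub are our stand-ins, and the depth-8 chain walk inside a bucket is elided: the point is that i low-order bits of the key hash pick a bucket, the in-DRAM 15-bit fragment gates the flash read, and a miss falls through to the next hash function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {                        /* index entry from the earlier sketch */
    uint16_t valid_and_keyfrag;         /* bit 15: valid; bits 14..0: keyfrag  */
    uint32_t log_offset;
} __attribute__((packed)) HashEntry;

#define INDEX_BITS 16                   /* i: the table has 2^i buckets (illustrative) */
#define NBUCKETS   (1u << INDEX_BITS)

static HashEntry hashtable[NBUCKETS];   /* the in-DRAM index */

/* Stand-in mixer; the real store derives the index bits and the key
 * fragment from a hash of the full 160-bit key. */
static uint64_t mix(const uint8_t *key, size_t len, uint64_t seed) {
    uint64_t x = seed ^ 0xcbf29ce484222325ULL;
    for (size_t j = 0; j < len; j++) x = (x ^ key[j]) * 0x100000001b3ULL;
    return x;
}

/* Stand-in for the flash read: fetch the log record at `off` and compare
 * its full stored key against `key`, ruling out keyfrag false matches. */
static bool log_key_matches(uint32_t off, const uint8_t *key, size_t len) {
    (void)off; (void)key; (void)len;
    return true;                        /* stub for illustration */
}

/* Return the Data Log offset holding `key`, or -1 if it is absent. */
static int64_t fawnds_lookup(const uint8_t *key, size_t len) {
    static const uint64_t seeds[3] = {1, 2, 3};   /* h1, h2, h3 */
    for (int f = 0; f < 3; f++) {
        uint64_t  hv   = mix(key, len, seeds[f]);
        HashEntry e    = hashtable[hv & (NBUCKETS - 1)];
        uint16_t  frag = (uint16_t)((hv >> INDEX_BITS) & 0x7FFF);
        /* Touch flash only when the in-DRAM 15-bit fragment matches. */
        if ((e.valid_and_keyfrag >> 15) &&
            (e.valid_and_keyfrag & 0x7FFF) == frag &&
            log_key_matches(e.log_offset, key, len))
            return e.log_offset;
    }
    return -1;   /* depth-8 conflict-chain probing within a bucket elided */
}
```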

FAWN-DS-IV
Maintenance operations: Split, Merge, and Compact. A Split is triggered by a node addition.
[Slide diagram: ring with vnodes A through H.]

Nodes Stream Data Range-I
- Create the new datastore A (dsA);
- scan datastore B (dsB) and transfer the data in range A to dsA.
[Slide diagram: the datastore list points at dsB while it is scanned and split into dsA; concurrent inserts continue during the scan.]

Nodes Stream Data Range-II
[Slide diagram: when the scan completes, dsA is unlocked and installed in the datastore list; concurrent inserts for range A now go to dsA.]
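A compilable sketch of this split walk, with hypothetical helpers (ds_scan_next, ds_append, key_in_range_a, datastore_list_swap) standing in for the real FAWN-DS operations:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Datastore Datastore;   /* opaque append-only log store */

/* Hypothetical helpers standing in for FAWN-DS internals. */
extern bool ds_scan_next(Datastore *ds, uint8_t key[20],
                         const void **data, size_t *len);
extern void ds_append(Datastore *ds, const uint8_t key[20],
                      const void *data, size_t len);
extern bool key_in_range_a(const uint8_t key[20]);
extern void datastore_list_swap(Datastore *old_ds, Datastore *new_ds); /* atomic */

/* Split dsB: copy every record in range A into the freshly created dsA. */
void split_into(Datastore *dsB, Datastore *dsA) {
    uint8_t key[20]; const void *data; size_t len;

    /* 1. Sequential scan of dsB's log; in-range records go to dsA.      */
    while (ds_scan_next(dsB, key, &data, &len))
        if (key_in_range_a(key))
            ds_append(dsA, key, data, len);

    /* 2. While the scan runs, concurrent range-A inserts are appended to
     *    both dsB and dsA, so no update landing mid-scan is lost.        */

    /* 3. Scan done: atomically install dsA in the datastore list
     *    ("unlock"); new range-A operations now go to dsA only.          */
    datastore_list_swap(dsB, dsA);
}
```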

Evaluation
Evaluation items:
- K/V lookup efficiency comparison;
- impact of ring membership changes;
- TCO analysis for random reads.
Evaluation hardware:
- AMD Geode LX processor, 500 MHz;
- 256 MB DDR SDRAM, 400 MHz;
- 100 Mbit/s Ethernet;
- 4 GB SanDisk Extreme IV CompactFlash.

K/V Lookup Efficiency Comparison-I
The FAWN-based system is over 6x more efficient than the traditional systems it is compared against.

K/V Lookup Efficiency Comparison-II

Impact of Ring Membership Changes-I

Impact of Ring Membership Changes-II

TCO Analysis for Random Read-I
TCO = capital cost + power cost (power priced at $0.1/kWh).
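As a worked example with illustrative figures (ours, not the paper's): a node costing $300 that draws 20 W continuously for three years consumes 20 W × 24 h × 365 × 3 / 1000 ≈ 526 kWh, i.e. about $53 of power at $0.1/kWh, giving a TCO of roughly $353. Power is thus a substantial fraction of TCO, which is why lowering per-node wattage matters.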

TCO Analysis for Random Read-II
How many nodes are required for a cluster?

TCO Analysis for Random Read-III

Related Work
- Hardware architecture: pairing arrays of flash chips and DRAM with low-power CPUs for low-power data-intensive computing.
- File systems for flash: several file systems, such as JFFS2, are specialized for use on flash.
- High-throughput storage and analysis: systems such as Hadoop provide bulk throughput for massive datasets with low selectivity.

Conclusions
- The FAWN architecture reduces the energy consumption of cluster computing.
- FAWN-KV addresses the challenges of building a key-value store on wimpy nodes: a log-structured, memory-efficient datastore and efficient replication, meeting the energy-efficiency and performance goals.

Acknowledgments
Article understanding: Prof. Xiong, Fengfeng Pan, Zigang Zhang.
Slide production: Fengfeng Pan, Biao Ma.

Thank You!