YCSB++ Benchmarking Tool: Performance Debugging Advanced Features of Scalable Table Stores


YCSB++ Benchmarking Tool: Performance Debugging Advanced Features of Scalable Table Stores
Swapnil Patil, Milo Polte, Wittawat Tantisiriroj, Kai Ren, Lin Xiao, Julio Lopez, Garth Gibson (Carnegie Mellon University), Adam Fuchs and Billie Rinaldi (National Security Agency)
Open Cirrus Summit, October 2011, Atlanta GA

Scalable table stores are critical systems: for data processing and analysis (e.g., Pregel, Hive) and for systems services (e.g., Google Colossus metadata).

Evolution of scalable table stores. The set of HBASE features has grown steadily from 2008 through 2011 and beyond: range-row filters, batch updates, bulk-load tools, regex filtering, scan optimizations, co-processors, and access control. Stores have evolved from simple and lightweight to complex and feature-rich. They now support a broader range of applications and services, but performance problems are hard to debug and understand because of the complex behavior and interaction of the various components.

We need richer tools for understanding advanced features in table stores. YCSB++ functionality: ZooKeeper-based distributed and coordinated testing; API extensions and support for the new Apache ACCUMULO DB; and fine-grained, correlated monitoring using OTUS. Features tested using YCSB++: batch writing, table pre-splitting, bulk loading, weak consistency, server-side filtering, and fine-grained security. Tool released at http://www.pdl.cmu.edu/ycsb++

Outline: Problem; YCSB++ design; Illustrative examples; Ongoing work and summary.

Yahoo! Cloud Serving Benchmark (YCSB) [Cooper2010]. A workload executor, configured by command-line parameters and a workload parameter file, drives threads of DB clients (HBASE and other DBs) against the storage servers and records statistics. Designed for CRUD (create-read-update-delete) benchmarking; a single-node system with an extensible API.
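YCSB's real client API and workload executor are written in Java; the control flow this slide describes can be sketched in Python as follows. All class and method names are hypothetical, and a dict-backed in-memory store stands in for a real table store:

```python
import random
import string
from abc import ABC, abstractmethod

class DBClient(ABC):
    """Hypothetical stand-in for YCSB's extensible DB-client API."""
    @abstractmethod
    def insert(self, table, key, values): ...
    @abstractmethod
    def read(self, table, key): ...
    @abstractmethod
    def update(self, table, key, values): ...
    @abstractmethod
    def delete(self, table, key): ...

class InMemoryDB(DBClient):
    """Toy backend so the executor is runnable without a real table store."""
    def __init__(self):
        self.tables = {}
    def insert(self, table, key, values):
        self.tables.setdefault(table, {})[key] = dict(values)
    def read(self, table, key):
        return self.tables.get(table, {}).get(key)
    def update(self, table, key, values):
        self.tables.setdefault(table, {}).setdefault(key, {}).update(values)
    def delete(self, table, key):
        self.tables.get(table, {}).pop(key, None)

def run_workload(db, table, n_ops, read_fraction, seed=0):
    """Minimal CRUD workload executor: load a keyspace, then mix reads/updates."""
    rng = random.Random(seed)
    keys = [f"user{i}" for i in range(100)]
    for k in keys:  # load phase
        db.insert(table, k, {"field0": "".join(rng.choices(string.ascii_letters, k=8))})
    reads = updates = 0
    for _ in range(n_ops):  # transaction phase
        k = rng.choice(keys)
        if rng.random() < read_fraction:
            db.read(table, k); reads += 1
        else:
            db.update(table, k, {"field0": "x"}); updates += 1
    return reads, updates
```

A new backend plugs in by subclassing `DBClient`, which mirrors how YCSB's extensible API lets one executor drive many different stores.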

YCSB++: new extensions. YCSB++ adds support for the new Apache ACCUMULO DB, along with new parameters and workload executors.

YCSB++: distributed and parallel tests. Multiple YCSB clients coordinate multi-phase tests using ZooKeeper, which enables testing at large scales and testing asymmetric features.

YCSB++: collective monitoring. The OTUS monitor, built on Ganglia [Ren2011], collects information from YCSB, the table stores, HDFS, and the OS.

Example of YCSB++ debugging: monitoring resource usage and table-store metrics. [Chart: HDFS DataNode CPU usage, Accumulo TabletServer CPU usage, and average number of StoreFiles per tablet over time.] OTUS collects fine-grained information; here, both the HDFS process and the TabletServer process run on the same node.

Outline: Problem; YCSB++ design; Illustrative examples, with YCSB++ on HBASE and ACCUMULO (Bigtable-like stores); Ongoing work and summary.

Recap of Bigtable-like table stores. The write path uses in-memory buffering and asynchronous file-system writes: (1) mutations are logged to a write-ahead log and buffered in memory tables (in unsorted order); (2) minor compactions flush memtables to sorted, indexed files in HDFS; (3) major compactions merge the on-disk files in the background, LSM-tree style, into fewer sorted, indexed files. The read path looks up both the memtables and the on-disk files.
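The three-step write path above can be sketched as a minimal single-tablet Python model (the write-ahead log is omitted and "files" are held as in-memory sorted lists; all names are illustrative):

```python
class TabletServer:
    """Toy model of the Bigtable-style write and read paths described above."""
    def __init__(self, memtable_limit=4):
        self.memtable = {}             # 1) buffered mutations
        self.memtable_limit = memtable_limit
        self.store_files = []          # each entry: a sorted list of (key, value)

    def put(self, key, value):
        # mutations go to the WAL (omitted here) and the memtable
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def minor_compaction(self):
        # 2) flush the memtable as a sorted, indexed file
        self.store_files.append(sorted(self.memtable.items()))
        self.memtable = {}

    def major_compaction(self):
        # 3) merge all files into one; newer files overwrite duplicate keys
        merged = {}
        for f in self.store_files:     # oldest first, so later updates win
            merged.update(dict(f))
        self.store_files = [sorted(merged.items())]

    def get(self, key):
        # read path: check the memtable, then files from newest to oldest
        if key in self.memtable:
            return self.memtable[key]
        for f in reversed(self.store_files):
            d = dict(f)
            if key in d:
                return d[key]
        return None
```

The background major compaction is why the ingest experiments later in the talk see read-latency spikes: reads must consult every un-merged file until compaction shrinks the list.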

Apache ACCUMULO. Started at NSA; now an Apache Incubator project. Designed for high-speed ingest and scan workloads. http://incubator.apache.org/projects/accumulo.html New features in ACCUMULO: an iterator framework for user-specified programs placed between different stages of the DB pipeline (e.g., to support joins and stream processing using iterators), and fine-grained, cell-level access control.

ILLUSTRATIVE EXAMPLE #1: Analyzing the fast-inserts vs. weak-consistency tradeoff using YCSB++

Client-side batch writing. YCSB++ feature: clients batch inserts, delaying the writes to the server. Batching improves insert throughput and latency, but newly inserted data may not be immediately visible to other clients.
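The batching behavior this slide describes can be sketched as follows. This is a hypothetical Python client, with a plain dict standing in for the table store; real stores batch by bytes buffered, which is what `batch_bytes` assumes:

```python
class BatchingClient:
    """Sketch of client-side batch writing: inserts accumulate in a client
    buffer and reach the server only when the buffer fills or is flushed,
    so other clients may not see recent writes immediately."""
    def __init__(self, server, batch_bytes):
        self.server = server           # dict standing in for the table store
        self.batch_bytes = batch_bytes
        self.buffer = {}
        self.buffered_size = 0
        self.flushes = 0

    def insert(self, key, value):
        self.buffer[key] = value
        self.buffered_size += len(key) + len(value)
        if self.buffered_size >= self.batch_bytes:
            self.flush()

    def flush(self):
        if self.buffer:
            self.server.update(self.buffer)  # one bulk send, not many small RPCs
            self.buffer = {}
            self.buffered_size = 0
            self.flushes += 1

def rpc_count(n_records, record_bytes, batch_bytes):
    """How many server round trips a workload needs at a given batch size."""
    per_batch = max(1, batch_bytes // record_bytes)
    return -(-n_records // per_batch)  # ceiling division
```

The throughput benefit comes from amortization: at a 1 MB batch of 1 KB records, `rpc_count` shows each round trip carries 1,000 records instead of one, at the cost of the visibility lag measured next.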

Batch writing improves throughput. [Chart: inserts per second (in thousands) for HBase and Accumulo at batch sizes of 10 KB, 100 KB, 1 MB, and 10 MB.] Setup: 6 clients creating 9 million 1-KB records on 6 servers. Small batches cause high client CPU utilization, which limits throughput; large batches saturate the servers, so batching beyond that point gives limited benefit.

Batch writing causes weak consistency. Test setup: ZooKeeper-based client coordination, with a producer-consumer queue shared between readers and writers. (1) Client 1 inserts {K:V} (10^6 records) through its batch buffer; (2) it enqueues K for a 1% sample of the records; (3) Client 2 polls and dequeues K; (4) Client 2 reads {K}. The R-W lag is the delay before Client 2 can read Client 1's most recent write.
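The lag-measurement protocol above can be sketched in Python. A `queue.Queue` between two threads stands in for the ZooKeeper-backed producer-consumer queue, and `store`/`writer_insert` are hypothetical stand-ins for the table store and its (possibly batching) write path:

```python
import queue
import threading
import time

def measure_rw_lag(store, writer_insert, n_records=200, sample_every=100):
    """Producer inserts records and enqueues a sample of keys; the consumer
    re-reads each sampled key until it becomes visible, recording the lag."""
    q = queue.Queue()
    lags = []

    def producer():
        for i in range(n_records):
            key = f"key{i}"
            writer_insert(key, "value")
            if i % sample_every == 0:      # ~1% sample in the real test
                q.put((key, time.monotonic()))
        q.put(None)                        # sentinel: producer is done

    def consumer():
        while True:
            item = q.get()
            if item is None:
                return
            key, t_enqueued = item
            while key not in store:        # re-read until the write is visible
                time.sleep(0.001)
            lags.append(time.monotonic() - t_enqueued)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads: t.start()
    for t in threads: t.join()
    return lags
```

With an unbatched `writer_insert`, the measured lags stay near zero; a batching writer makes the consumer spin until the buffer flushes, which is exactly the R-W lag distribution the next slide plots.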

Batch writing causes weak consistency. [Charts: CDFs of the read-after-write time lag (ms, 1 to 100,000) for (a) HBase and (b) Accumulo at buffer sizes of 10 KB, 100 KB, 1 MB, and 10 MB. The fraction of requests that needed multiple read()s: HBase <1%, 7.4%, 17%, and 23%; Accumulo <1%, 1.2%, 14%, and 33%.] Deferred writing wins on throughput, but the lag can be ~100 seconds, and the implementation of batching affects the median latency.

ILLUSTRATIVE EXAMPLE #2: Benchmarking high-speed ingest features using YCSB++

Features for high-speed insertions. Most table stores have high-speed ingest features, used to periodically insert large amounts of data or to migrate old data in bulk; these are classic relational-DB techniques applied to the new stores. Two such features are bulk loading and table pre-splitting: they reduce data migration during inserts and engage more tablet servers immediately, but they need careful tuning and configuration [Sasha2002].

8-phase test setup: table bulk loading. Phases: (1) pre-load (re-formatting), (2) pre-load (importing), (3) read/update workload, (4) load (re-formatting), (5) load (importing), (6) read/update workload, (7) sleep, (8) read/update workload. Bulk loading involves two steps: Hadoop-based data formatting, then importing the store files into the table store. The pre-load phases (1 and 2) bulk load 6M rows into an empty table, with the goal of gaining parallelism by engaging all servers. The load phases (4 and 5) load 48M new rows, with the goal of studying rebalancing during ingest. The R/U measurement phases (3, 6, and 8) correlate latency with rebalancing work.
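The multi-client coordination this test relies on can be sketched with a barrier: no client starts phase k+1 until every client has finished phase k, so one phase's measurements are not polluted by stragglers from the previous one. Here a `threading.Barrier` stands in for the ZooKeeper barrier, and `run_phase` is a hypothetical per-phase callback:

```python
import threading

PHASES = [
    "pre-load (re-formatting)", "pre-load (importing)", "read/update",
    "load (re-formatting)", "load (importing)", "read/update",
    "sleep", "read/update",
]

def run_multi_client_test(n_clients, run_phase):
    """Run the 8-phase schedule on n_clients threads with a barrier between phases."""
    barrier = threading.Barrier(n_clients)
    log, lock = [], threading.Lock()

    def client(cid):
        for k, phase in enumerate(PHASES):
            run_phase(cid, k, phase)   # do this client's share of the phase
            with lock:
                log.append((k, cid))
            barrier.wait()             # ZooKeeper barrier in the real tool

    threads = [threading.Thread(target=client, args=(c,)) for c in range(n_clients)]
    for t in threads: t.start()
    for t in threads: t.join()
    return log
```

The returned log is always sorted by phase number: every client's entry for phase k lands before any entry for phase k+1, which is the synchronization property the test schedule needs.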

Read latency is affected by rebalancing work. [Chart: Accumulo read latency (ms, log scale) over each measurement phase's running time, for R/U 1 (phase 3), R/U 2 (phase 6), and R/U 3 (phase 8).] Latency is high after heavy-insertion periods that cause the servers to rebalance (compactions), and it drops once the store reaches a steady state.

Rebalancing on ACCUMULO servers. [Chart: StoreFiles, tablets, and compactions (log scale) over the experiment's running time.] The OTUS monitor shows the server-side compactions during the post-ingest measurement phases.

HBASE is slower: different compaction policies. [Charts: read latency (ms, log scale) over the measurement phases for Accumulo and HBase, alongside StoreFiles, tablets, and compactions over each experiment's running time.]

Extending to table pre-splitting. Pre-split a key range into N partitions to avoid splitting during insertion. The table pre-splitting test mirrors the bulk-loading test (pre-load re-formatting and importing, read/update workload, load re-formatting and importing, read/update workload, sleep, read/update workload), but replaces the bulk-load phases with a pre-split of the table into N ranges followed by the load, read/update workload, sleep, and a final read/update workload.
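Pre-splitting itself reduces to choosing split points up front. A minimal sketch, assuming a numeric key space for simplicity (real stores split on byte-string row keys):

```python
def presplit_points(n_partitions, key_space=2**32):
    """Compute N-1 evenly spaced split points over [0, key_space), so every
    tablet server takes inserts from the start instead of the store splitting
    tablets on the fly during the load."""
    step = key_space // n_partitions
    return [step * i for i in range(1, n_partitions)]

def tablet_for(key, splits):
    """Route a key to the index of its pre-split tablet."""
    for i, s in enumerate(splits):
        if key < s:
            return i
    return len(splits)
```

This only helps if the insert keys are roughly uniform over the key space; skewed keys still pile onto a few tablets, which is one reason the slide's earlier caveat about careful tuning applies.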

Outline: Problem; YCSB++ design; Illustrative examples; Ongoing work and summary.

Things not covered in this talk. More features: function shipping to the servers, i.e., data filtering at the servers and fine-grained, cell-level access control; there are more details in the ACM SOCC 2011 paper. Ongoing work: analyzing more table stores (Cassandra, CouchDB, MongoDB) and continuing this research through the new Intel Science and Technology Center for Cloud Computing at CMU (with GaTech).

Summary: the YCSB++ tool. YCSB++ is a tool for performance debugging and for benchmarking advanced features through new extensions to YCSB: weak consistency semantics, fast insertions (pre-splits and bulk loads), server-side filtering, fine-grained access control, distributed clients using ZooKeeper, multi-phase testing (with Hadoop), and new workload generators and database-client API extensions. Two case studies: Apache HBASE and ACCUMULO. The tool is available at http://www.pdl.cmu.edu/ycsb++