Squall: Fine-Grained Live Reconfiguration for Partitioned Main Memory Databases

Squall: Fine-Grained Live Reconfiguration for Partitioned Main Memory Databases AARON J. ELMORE, VAIBHAV ARORA, REBECCA TAFT, ANDY PAVLO, DIVY AGRAWAL, AMR EL ABBADI

Higher OLTP Throughput. Demand for high-throughput transactional (OLTP) systems is growing, especially due to web-based services. The cost per GB of RAM is dropping, and network memory is faster than local disk. Let's use main memory.

Scaling Out via Partitioning. As data grows in scale, data partitioning makes that scale manageable by scaling out.

Approaches for Main-Memory DBMSs*. Either highly concurrent, latch-free data structures (Hekaton, Silo), or partitioned data with single-threaded executors (H-Store, VoltDB). *Excuse the generalization.

[Diagram: a client application invokes a stored procedure by name with input parameters. Slide credits: Andy Pavlo]

The Problem: Workload Skew. High skew increases latency by 10X and decreases throughput by 4X. Partitioned shared-nothing systems are especially susceptible.

The Problem: Workload Skew. Possible solution: provision resources for peak load (very expensive and brittle!). [Chart: capacity vs. demand over time, showing large unused resources.]

The Problem: Workload Skew. Possible solution: limit the load on the system (poor performance!). [Chart: resources over time.]

Need Elasticity

The Promise of Elasticity. [Chart: capacity tracks demand over time, minimizing unused resources. Slide credits: Berkeley RAD Lab]

What We Need. Enable the system to elastically scale in or out, dynamically adapting to changes in load: change the partition plan (reconfiguration), add nodes, or remove nodes.

Problem Statement. Tuples must migrate between partitions to reflect the updated partition plan. Example, partitioned by warehouse: the plan Partition 1 [0,2), Partition 2 [2,4), Partition 3 [4,6) becomes Partition 1 [0,1), Partition 2 [2,3), Partition 3 [1,2) and [3,6). We would like to do this without taking the system offline: live reconfiguration.
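
The range moves in this example can be computed mechanically. A minimal Python sketch (an illustrative data model, not Squall's actual representation): diffing the old and new plans yields the ranges that must migrate, with their source and destination partitions.

```python
# Sketch only: a partition plan is a list of ((lo, hi), partition) entries
# with half-open key ranges. Diffing two plans yields the migrations.

def diff_plans(old_plan, new_plan):
    """Return a list of ((lo, hi), source, destination) migrations."""
    def owner_at(plan, key):
        for (lo, hi), part in plan:
            if lo <= key < hi:
                return part
        return None

    # Collect every range boundary, then compare ownership per sub-range.
    bounds = sorted({b for (lo, hi), _ in old_plan + new_plan for b in (lo, hi)})
    migrations = []
    for lo, hi in zip(bounds, bounds[1:]):
        src, dst = owner_at(old_plan, lo), owner_at(new_plan, lo)
        if src is not None and dst is not None and src != dst:
            migrations.append(((lo, hi), src, dst))
    return migrations

old = [((0, 2), 1), ((2, 4), 2), ((4, 6), 3)]
new = [((0, 1), 1), ((2, 3), 2), ((1, 2), 3), ((3, 6), 3)]
print(diff_plans(old, new))  # -> [((1, 2), 1, 3), ((3, 4), 2, 3)]
```

For the slide's example plans, the diff says exactly two ranges move: warehouse 1 from partition 1 to partition 3, and warehouse 3 from partition 2 to partition 3.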

E-Store. During normal operation, only high-level monitoring runs. When a load imbalance is detected, tuple-level monitoring (E-Monitor) collects hot tuples and partition-level access counts; tuple placement planning (E-Planner) turns these into a new partition plan; and online reconfiguration (Squall) carries the plan out. Once reconfiguration completes, the system returns to normal operation.

Existing Live Migration Solutions Are Not Suitable. They are predicated on disk-based systems with traditional concurrency control and recovery. Zephyr relies on concurrency (2PL) and disk pages. ProRea relies on concurrency (SI and OCC) and disk pages. Albatross relies on replication and shared disk storage, and also puts strain on the source.

Not Your Parents' Migration. With a single-threaded execution model, a partition is either doing work or migrating. There is more than a single source and destination (and the destination is not cold). We want lightweight coordination, in the presence of distributed transactions and replication.

Squall. Given a plan from E-Planner, Squall physically moves the data while the system stays live. It uses a pull-based mechanism: the destination pulls from the source. This conforms to H-Store's single-threaded execution model: while data is moving, transactions are blocked, but only on the partitions moving that data. To avoid performance degradation, Squall moves small chunks of data at a time, interleaved with regular transaction execution.

Squall Steps. 1. Initialization: identify the migrating data. 2. Live reactive pulls for required data. 3. Periodic lazy/async pulls for large chunks. [Diagram: four partitions holding warehouses partitioned by warehouse ID; the reconfiguration is announced as (new plan, leader ID); destinations issue pulls such as "Pull W_ID=2" and "Pull W_ID>5" to their sources, each partition tracking its incoming and outgoing ranges, while clients keep submitting transactions.]
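
Step 2, the live reactive pull, can be sketched as follows (hypothetical names, not H-Store's actual API): when a transaction arrives for a key in a range that is still marked incoming, the destination blocks it, pulls just that range from the source, installs the data, and then runs the transaction locally.

```python
# Sketch only: a destination partition with a list of not-yet-pulled
# incoming ranges and a callable that pulls one range from the source.

class DestinationPartition:
    def __init__(self, store, incoming_ranges, pull_from_source):
        self.store = store                     # local key -> tuple map
        self.incoming = list(incoming_ranges)  # [(lo, hi), ...] not yet pulled
        self.pull = pull_from_source           # callable: (lo, hi) -> {key: tuple}

    def execute(self, txn_key, txn):
        # Reactive pull: only the range this transaction touches is fetched,
        # so other work on the partition is blocked only briefly.
        for rng in list(self.incoming):
            lo, hi = rng
            if lo <= txn_key < hi:
                self.store.update(self.pull(rng))  # blocking pull of one range
                self.incoming.remove(rng)
        return txn(self.store)

source_data = {2: "warehouse-2"}
dest = DestinationPartition(
    {}, [(2, 3)],
    lambda rng: {k: v for k, v in source_data.items() if rng[0] <= k < rng[1]})
print(dest.execute(2, lambda s: s[2]))  # -> warehouse-2
```

A transaction on a key the destination already owns skips the pull entirely, which is why reactive pulls alone handle the hot data while async pulls drain the cold remainder.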

Chunk Data for Asynchronous Pulls

Why Chunk? The amount of data is unknown when the table is not partitioned by a clustered index (e.g., customers by W_ID in TPC-C). And time spent extracting data is time not spent on transactions.

Async Pulls. Periodically pull chunks of cold data. These pulls are answered lazily: they start at a lower priority than transactions, and their priority increases with time. Execution is interwoven with extracting and sending data (transactions may dirty the range!).
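
The lazy, aging priority described here might look like the following (an assumed policy sketch, not Squall's exact scheduler): pending pulls start below transaction priority and age upward each round, so cold-data extraction eventually runs even under a steady stream of transactions.

```python
# Sketch only: a source-side scheduler that interleaves transactions
# with async pull requests whose priority ages over time.
import heapq
import itertools

TXN_PRIORITY = 0         # lower number = runs first
BASE_PULL_PRIORITY = 10  # async pulls start well below transactions
AGING_STEP = 1           # priority boost per scheduling round

class LazyPullScheduler:
    def __init__(self):
        self.pending = []            # min-heap of [priority, seq, request]
        self.seq = itertools.count() # tie-breaker so requests never compare

    def submit(self, request):
        heapq.heappush(self.pending, [BASE_PULL_PRIORITY, next(self.seq), request])

    def next_task(self, txn_queue):
        # Serve a pull only once it has aged past transaction priority.
        if self.pending and self.pending[0][0] <= TXN_PRIORITY:
            return ("pull", heapq.heappop(self.pending)[2])
        for entry in self.pending:   # age every waiting pull
            entry[0] -= AGING_STEP
        heapq.heapify(self.pending)
        if txn_queue:
            return ("txn", txn_queue.pop(0))
        if self.pending:             # idle partition: answer pulls immediately
            return ("pull", heapq.heappop(self.pending)[2])
        return None
```

With these constants, a pull submitted under continuous transaction load surfaces after ten rounds; an idle partition answers it right away.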

Chunking Async Pulls. [Diagram: the destination sends one async pull request to the source, which answers with a series of data chunks.]

Keys to Performance. Properly size reconfiguration granules and space them apart. Split large reconfigurations to limit demands on any single partition. Redirect or pull only when needed. Tune what gets pulled; sometimes pull a little extra.

Optimization: Splitting Reconfigurations. 1. Split by pairs of source and destination, which avoids contention on a single partition. Example: if partition 1 is migrating W_ID 2 and 3 to partitions 3 and 7, execute this as two reconfigurations. 2. Split large objects and migrate them one piece at a time.
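
Both splitting rules can be sketched together (a hypothetical plan representation, not Squall's internals): group the migrating ranges by (source, destination) pair so no single partition serves every pull at once, and chop each range into bounded pieces.

```python
# Sketch only: split one reconfiguration into per-pair sub-plans
# whose ranges are no wider than max_piece.

def split_plan(migrations, max_piece):
    """migrations: list of ((lo, hi), src, dst).
    Returns one sub-plan per (src, dst) pair, each with bounded ranges."""
    by_pair = {}
    for (lo, hi), src, dst in migrations:
        pieces = []
        while lo < hi:  # decompose large objects into bounded pieces
            cut = min(lo + max_piece, hi)
            pieces.append((lo, cut))
            lo = cut
        by_pair.setdefault((src, dst), []).extend(pieces)
    return [{"source": s, "destination": d, "ranges": r}
            for (s, d), r in sorted(by_pair.items())]

# The slide's example: partition 1 migrating W_ID 2 to partition 3 and
# W_ID 3 to partition 7 becomes two sub-plans, executed one at a time.
plan = split_plan([((2, 3), 1, 3), ((3, 4), 1, 7)], max_piece=1)
print(len(plan))  # -> 2
```

Executing the sub-plans serially is what keeps any one source partition from answering pulls for its entire outgoing set at once.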

Evaluation. Workloads: YCSB and TPC-C. Baselines: Stop & Copy; Purely Reactive (demand-based pulling only); and Zephyr+ (purely reactive plus asynchronous chunking with pull prefetching, semantically equivalent to Zephyr).

YCSB Latency. [Plots: YCSB cluster consolidation from 4 to 3 nodes; YCSB data shuffle, 10% pairwise.]

Results Highlight. [Plot: TPC-C load balancing of hotspot warehouses.]

It's All About Trade-offs. Squall trades time to complete the migration against performance degradation. Future work will consider automating this trade-off based on service-level objectives.

I Fell Asleep, What Happened? A partitioned, single-threaded, main-memory environment is susceptible to hotspots. Elastic data management is a solution, and Squall provides a mechanism for executing fine-grained live reconfiguration. Questions?

Tuning Optimizations

Sizing Chunks. Chunk sizes are set by static analysis; future work will set sizing and scheduling dynamically. [Plot: impact of chunk size on a 10% reconfiguration during a YCSB workload.]

Spacing Async Pulls. A delay at the destination between new async pull requests. [Plot: impact of the delay on a 10% reconfiguration during a YCSB workload with an 8 MB chunk size.]

Effect of Splitting into Sub-Plans. A cap is set on the number of sub-plan splits; plans are split on source-destination pairs, with the ability to decompose migrating objects.