HAWQ: A Massively Parallel Processing SQL Engine in Hadoop

HAWQ: A Massively Parallel Processing SQL Engine in Hadoop
Lei Chang, Zhanwei Wang, Tao Ma, Lirong Jian, Lili Ma, Alon Goldshuv, Luke Lonergan, Jeffrey Cohen, Caleb Welton, Gavin Sherry, Milind Bhandarkar
Pivotal Inc. {lchang, zwang, tma, ljian, lma, agoldshuv, llonergan, jcohen, cwelton, gsherry, mbhandarkar}@gopivotal.com
Presented by Guoyao Feng, CS 848, University of Waterloo

Agenda: Introduction, HAWQ Architecture, Query Processing, Interconnect, Transaction Management, Extension Framework, Experiments, Conclusions

Background: Shifting Paradigm
Old: data sources -> heavy ETL process -> data warehouse
New: data sources -> central repository (HDFS)
Requirements for Hadoop: interactive queries, scalability, consistency, extensibility, standard compliance, productivity

Motivation
Hadoop's advantages: scalability, fault tolerance, ...
Hadoop's limitations: poor performance for interactive analysis, a low-level programming model, lack of transaction support
MPP features: excellent for structured data processing, fast query processing, automatic query optimization
HAWQ combines the two.

Introducing HAWQ
HAWQ is an MPP SQL query engine on top of HDFS. It combines the merits of the Greenplum Database with Hadoop's distributed storage. It is SQL compliant, supports large-scale analytics on big data (via MADlib), and can interactively query various data sources in different Hadoop formats.
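
Because HAWQ speaks SQL, in-database analytics libraries such as MADlib run directly on HAWQ tables. A minimal sketch using MADlib's documented linear-regression API (the table and column names here are hypothetical):

    -- Train a linear model in-database; results land in houses_model
    SELECT madlib.linregr_train(
        'houses',                    -- source table
        'houses_model',              -- output table for the model
        'price',                     -- dependent variable
        'ARRAY[1, size, bedrooms]'   -- independent variables
    );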

Agenda: Introduction, HAWQ Architecture, Query Processing, Interconnect, Transaction Management, Extension Framework, Experiments, Conclusions

HAWQ Architecture
An MPP shared-nothing compute layer on top of a distributed storage layer (HDFS)

HAWQ Architecture: Interface
Challenge: simplify the interaction between HAWQ and external systems
Design choice: enhance standard interfaces with open data format support
External systems can bypass HAWQ and access table files on Hadoop directly (HDFS API); open MapReduce InputFormats and OutputFormats are provided

HAWQ Architecture: Catalog Service
The catalog describes the system and all objects in it:
- Database objects: descriptions of filespaces, tablespaces, schemas, databases, tables, etc.
- System information: the segments and their status
- Security: users, roles and privileges
- Languages and encodings: supported languages and encodings
CaQL: a simplified catalog query language for internal access; easy to implement and scalable
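
Internal access goes through CaQL, but since the catalog is inherited from PostgreSQL it can also be inspected with ordinary SQL. A minimal sketch, assuming the standard PostgreSQL-style catalog tables HAWQ carries over:

    -- List user tables and their schemas from the catalog
    SELECT c.relname AS table_name, n.nspname AS schema_name
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind = 'r'
      AND n.nspname NOT IN ('pg_catalog', 'information_schema');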

HAWQ Architecture: Data Distribution
[Figure: two tables, A (id, A_payload) and B (id, B_payload), hash-distributed on id across segments]
Hash distribution: the most frequently used strategy; aligning tables on the same distribution key improves some important query patterns
Random distribution: distributes rows in a round-robin fashion, giving even data distribution; works well for tables with a small number of distinct values
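
A minimal sketch of the two policies in the Greenplum-derived DDL HAWQ uses (table and column names are illustrative):

    -- Hash distribution: rows placed by hashing the distribution key;
    -- a and b are aligned on id, so joins on id are co-located
    CREATE TABLE a (id int, a_payload text) DISTRIBUTED BY (id);
    CREATE TABLE b (id int, b_payload text) DISTRIBUTED BY (id);

    -- Random distribution: round-robin placement for even distribution
    CREATE TABLE c (id int, c_payload text) DISTRIBUTED RANDOMLY;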

HAWQ Architecture: Data Distribution (cont.)
Table partitioning: range partitioning and list partitioning
Creates an inheritance relationship between the top-level parent table and its child tables
Improves query performance when only a subset of the partitions is accessed
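
A sketch of range partitioning in the Greenplum-style syntax HAWQ uses (names and dates are illustrative); the planner can skip any partition ruled out by a query's predicate:

    CREATE TABLE sales (id int, sale_date date, amount float8)
    DISTRIBUTED BY (id)
    PARTITION BY RANGE (sale_date)
    (
        START (date '2013-01-01') INCLUSIVE
        END   (date '2014-01-01') EXCLUSIVE
        EVERY (INTERVAL '1 month')
    );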

HAWQ Architecture: Query Execution Workflow
QD -> Query Dispatcher
QE -> Query Executor

HAWQ Architecture: Storage
HAWQ supports a variety of storage models on HDFS; transformation among the different storage models is done at the user layer.
Row-oriented / read-optimized (AO): optimized for read-mostly full table scans and bulk append loads
Column-oriented / read-optimized (CO): columns vertically partitioned into separate files, stored in large, packed blocks; efficient compression
Column-oriented / Parquet: columns stored in a row group instead of separate files; supports nested data
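
A sketch of choosing a storage model at table-creation time, using the append-only options HAWQ inherits from Greenplum (the parquet orientation is HAWQ-specific; the compression setting is illustrative):

    -- Row-oriented append-only (AO) table
    CREATE TABLE fact_row (id int, payload text)
    WITH (appendonly=true, orientation=row)
    DISTRIBUTED BY (id);

    -- Column-oriented (CO) table with compression
    CREATE TABLE fact_col (id int, payload text)
    WITH (appendonly=true, orientation=column, compresstype=zlib)
    DISTRIBUTED BY (id);

    -- Parquet table
    CREATE TABLE fact_parquet (id int, payload text)
    WITH (appendonly=true, orientation=parquet)
    DISTRIBUTED BY (id);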

HAWQ Architecture: Fault Tolerance
Master: a warm standby is kept synchronized via a transaction-log replication process
Segments: stateless, which gives simple recovery and better availability
Disks: failures affecting user data are handled by HDFS; failures affecting intermediate query output are marked by HAWQ

Agenda: Introduction, HAWQ Architecture, Query Processing, Interconnect, Transaction Management, Extension Framework, Experiments, Conclusions

Query Processing: Query Types
Master-only queries: only access catalog tables; evaluated without dispatching
Symmetrically dispatched queries: physical query plans are dispatched to all segments; the most common case
Directly dispatched queries: a slice accesses a single segment directory, saving network bandwidth and improving concurrency for small queries (see the sketch below)
Query plans contain relational operators (scans, joins, etc.) and parallel motion operators; execution is pipelined
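
For instance, a lookup whose predicate pins down the distribution key can be directly dispatched to the one segment holding the row (schema from the earlier sketch):

    -- Only the segment owning hash(id = 42) receives this plan slice
    SELECT a_payload FROM a WHERE id = 42;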

Query Processing: Parallel Motion Operators
Broadcast Motion (N:N): every segment sends its input tuples to all other segments
Redistribute Motion (N:N): every segment rehashes its tuples on a column and redistributes each to the appropriate segment
Gather Motion (N:1): every segment sends its input tuples to a single node (e.g., the master)
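
Motion operators appear directly in query plans. A sketch of the kind of plan shape EXPLAIN prints for a join between the tables above (the exact slice numbering and costs vary; since a and b share the distribution key, the join is co-located and needs no redistribute motion):

    EXPLAIN SELECT a.id, b.b_payload
    FROM a JOIN b ON a.id = b.id;
    -- Illustrative plan shape:
    --   Gather Motion N:1 (slice1)
    --     -> Hash Join
    --          Hash Cond: a.id = b.id
    --          -> Seq Scan on a
    --          -> Hash
    --               -> Seq Scan on b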

Query Processing: Metadata Dispatch
Problem: a large number of QEs connect to the master and query metadata
Solution: dispatch the metadata along with the execution plan (a self-described plan), and store read-only metadata on the segments

Query Processing: Example
[Figure: an example plan sliced at data motion boundaries into a 1-gang (the QD) and N-gangs (the QEs)]

Agenda: Introduction, HAWQ Architecture, Query Processing, Interconnect, Transaction Management, Extension Framework, Experiments, Conclusions

Interconnect
Issues with a TCP interconnect: the port-number limitation (about 60k) and expensive connection setup
A UDP interconnect comes to the rescue, but must itself provide: reliability, ordering, flow control, performance and scalability, portability

UDP Interconnect: Protocol
[Figure: the interconnect packet protocol]

UDP Interconnect: Implementation Details
A fairly typical reliable-UDP design: the sender maintains a send queue and an expiration queue
When losses are detected, the flow-control window is cut down to a minimal predefined value, followed by slow start
Out-of-order and duplicate messages are handled
A status query message is sent to eliminate deadlock

Agenda: Introduction, HAWQ Architecture, Query Processing, Interconnect, Transaction Management, Extension Framework, Experiments, Conclusions

Transaction Management
Catalog: write-ahead logging (WAL) and multi-version concurrency control (MVCC), supporting snapshot isolation
User data: append-only HDFS files avoid MVCC complexity; the system catalog records the logical file length to control visibility
No distributed commit protocol: transactions are only noticeable on the master node, and self-described plans convey visibility information to the segments

Transaction Management: Isolation Levels
Read uncommitted and read committed: a statement can only see rows committed before it begins
Repeatable read and serializable: all statements of the current transaction can only see rows committed before the first query
Locking controls the conflicts between concurrent DDL and DML statements
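
A sketch of selecting a level per transaction, using the standard SQL syntax HAWQ inherits from PostgreSQL:

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    -- every statement here sees the snapshot taken at the first query
    SELECT count(*) FROM a;
    COMMIT;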

Transaction Management: Truncate
Pivotal HDFS adds a truncate operation to support transactions: truncate is necessary for undoing changes made by aborted transactions (Hadoop HDFS currently does not support truncate)
Atomicity is guaranteed: only one update operation is allowed at a time, and truncate is applied to closed files only
Concurrent updates: a lightweight swimming-lane approach isolates concurrent inserts; a catalog table is used to manage user table files

Agenda: Introduction, HAWQ Architecture, Query Processing, Interconnect, Transaction Management, Extension Framework, Experiments, Conclusions

Extension Framework
PXF: a fast and extensible framework connecting HAWQ to any data store of choice
Users implement a parallel connector API to create connectors for new data stores
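
A sketch of exposing a PXF-backed data set as an external table; the host, port, path and profile name are illustrative stand-ins in the pxf:// URI scheme:

    CREATE EXTERNAL TABLE ext_sales (id int, amount float8)
    LOCATION ('pxf://namenode:51200/data/sales?PROFILE=HdfsTextSimple')
    FORMAT 'TEXT' (DELIMITER ',');

    -- External PXF tables can then be joined with internal HAWQ tables:
    SELECT s.id, s.amount, a.a_payload
    FROM ext_sales s JOIN a ON s.id = a.id;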

Extension Framework: Benefits
A real connection among various data stores, letting them share data
Complex joins between internal HAWQ tables and external PXF tables (as sketched above)
External jobs can run faster with HAWQ/PXF
Advanced functionality: exploits data locality to minimize network traffic, exposes a filter push-down API, and collects planner statistics

HAWQ in Distributed Data Processing
Data processing frameworks: MapReduce, Dryad, Spark
Framework extensions: Pig, Hive, Shark
Native SQL engines on HDFS: HAWQ, Impala, Presto, Drill
Database and Hadoop mixes: Greenplum, DB2, Vertica

Agenda: Introduction, HAWQ Architecture, Query Processing, Interconnect, Transaction Management, Extension Framework, Experiments, Conclusions

Experiments: Competing Systems
Stinger: a community-based effort to optimize Apache Hive, reported to be 35x-45x faster than the original Hive; 36GB RAM on each node in YARN, 4GB minimum memory for each container
HAWQ: 6 HAWQ segments configured on each node, with 6GB of memory allocated to each segment

Experiments: TPC-H Results

Experiments

Conclusion: HAWQ
A massively parallel processing SQL engine that inherits its merits from both MPP databases and HDFS
Stateless segment design, supported by metadata dispatch and self-described execution plans
UDP-based interconnect to overcome TCP limitations
Transaction management supported by a swimming-lane model and a truncate operation in HDFS
Significant performance advantage over Stinger

Discussion
The master node alone handles transactions, the catalog, user interaction, query processing, the final aggregation step of gather motion, etc.
1. Will it become the bottleneck of the system?
2. What are the alternative designs?
The paper claims that PXF allows HAWQ to interact with any data store of choice and run any type of SQL directly on external data sets.
1. What if the format is not compatible with Hadoop data formats or SQL query languages?
2. Are performance and efficiency valid concerns here?