Stinger Initiative. Making Hive 100X Faster. Page 1. Hortonworks Inc. 2013



2 HDP: Enterprise Hadoop Distribution
Hortonworks Data Platform (HDP), Enterprise Hadoop: the ONLY 100% open source and complete distribution.
- Operational services (manage and operate at scale): AMBARI, OOZIE
- Data services (store, process and access data): FLUME, SQOOP, PIG, HIVE, HCATALOG, HBASE, WEBHDFS
- Hadoop core (distributed storage and processing): HDFS, MAP REDUCE, YARN (in 2.0)
- Platform services (enterprise readiness): HA, DR, Snapshots, Security
- Runs on: OS, Cloud, VM, Appliance
- Enterprise grade, proven and tested at true scale (45,000+ nodes)
- Ecosystem endorsed to ensure interoperability

3 Hive Maturity and Stability
- Hive was originally developed at Facebook.
- More data than existing RDBMS could handle.
- 60,000+ Hive queries per day.
- More than 1,000 users per day.
- 100+ PB of data.
- 15+ TB of data loaded daily.
- Hive is a proven solution at extreme scale.


4 Hive: Vibrant & Existing Ecosystem
- Vendors: Teradata, Microsoft, Microstrategy, Tableau, Karmasphere, Datameer, Information Builders, SAP, Oracle, Actuate, QlikView, SAS, Arcplan, Pentaho, Jaspersoft, Tibco, Talend, Informatica and more
- End users, open source: Facebook, Teradata, SAP, Intel, Twitter, Microsoft, Huawei, Yahoo, Qubole, Citus Data, NexR, InMobi and more
- Hive is the de facto SQL-for-Hadoop solution.
- Hive is proven, robust, scalable.
- Hive supports THE most BI use cases, but Hive is currently optimized for batch processing.

5 So Innovate & Invest in Hive
- Parameterized Reports, Enterprise Reports, Dashboard / Scorecard, Visualization, Data Mining
- Users want: More SQL, Better Performance
- Interactive, Batch

6 Stinger: Faster and Improved Insight on Hive
- Performance Optimizations: 100X+ faster time to insight
- Deeper Analytical Capabilities
- Base Optimizations: generate simplified DAGs; in-memory hash joins
- YARN: next-gen Hadoop data processing framework
- Tez: express tasks more simply; eliminate disk writes; pre-warmed containers
- ORCFile: column store; high compression; predicate / filter pushdowns
- Vector Query Engine: optimized for modern processor architectures
- Query Planner: intelligent cost-based optimizer


8 Improved Analytics: ROLLUP, CUBE
ROLLUP:
  select state, year, sum(amt_paid)
  from sales
  group by state, year with rollup
CUBE:
  select state, year, sum(amt_paid)
  from sales
  group by state, year with cube
[Result tables: columns State, Year, Sum, with * marking a subtotal column. Rows shown for CA and NY; the year and sum values were lost in extraction. The CUBE result has additional rows with * in the State column.]
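A minimal sketch of what the two grouping modes compute, in plain Python rather than Hive. The sample rows and the function names (`aggregate`, `rollup`, `cube`) are hypothetical and are not taken from the slides. The `*` marker follows the slide's notation for a subtotal column; Hive itself reports such columns as NULL.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample rows: (state, year, amt_paid). Not the slide's data.
SALES = [
    ("CA", 2012, 100.0),
    ("CA", 2013, 200.0),
    ("NY", 2012, 50.0),
]


def aggregate(rows, keep):
    """Sum amt_paid, keeping only the grouping columns whose index is in
    `keep`; every other grouping column is replaced by '*'."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] if i in keep else "*" for i in range(2))
        totals[key] += row[2]
    return dict(totals)


def rollup(rows):
    """GROUP BY state, year WITH ROLLUP: grouping sets
    (state, year), (state), and the grand total ()."""
    out = {}
    for n in (2, 1, 0):
        out.update(aggregate(rows, set(range(n))))
    return out


def cube(rows):
    """GROUP BY state, year WITH CUBE: every subset of {state, year}."""
    out = {}
    for n in (2, 1, 0):
        for keep in combinations(range(2), n):
            out.update(aggregate(rows, set(keep)))
    return out


if __name__ == "__main__":
    for key, total in rollup(SALES).items():
        print("rollup", key, total)
    for key, total in cube(SALES).items():
        print("cube  ", key, total)
```

The only difference is the per-year subtotal rows (`*` in the State column), which CUBE produces and ROLLUP does not.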

9 Definitely Room for Improvement.
Simple analytics can turn into unintuitive and inefficient queries:
  select count(*) as rk, s2.state as state, s2.product as product,
         avg(s2.amt_paid), sum(s1.amt_paid)
  from sales s1 join sales s2
    on (s1.product = s2.product and s1.state = s2.state)
  where s1.year <= s2.year
  group by s2.state, s2.product, s2.year
  order by state, product, rk;

10 Definitely Room for Improvement.
This is all we were trying to do: a running total of sales figures!
[Table: columns Number, State, Product, Amount, Total, with rows for state CA, products A and B. The numeric values were lost in extraction.]

11 Improved Analytics: OVER
MUCH better!
  select rank() over state_and_product, state, product, amt_paid,
         sum(amt_paid) over state_and_product
  from sales
  window state_and_product as (partition by state, product order by year);
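To check that the windowed query really is the running total the self-join was computing, both can be run against the same rows. This sketch uses SQLite (3.25 or later, for window functions) through Python's `sqlite3` module as a stand-in for Hive. The sample rows are hypothetical, and an ORDER BY was added to the windowed query only to make the comparison deterministic. The equivalence assumes one row per (state, product, year).

```python
import sqlite3

# Hypothetical sample data: (state, product, year, amt_paid).
ROWS = [
    ("CA", "A", 2010, 100.0),
    ("CA", "A", 2011, 200.0),
    ("CA", "A", 2012, 50.0),
    ("CA", "B", 2011, 10.0),
    ("CA", "B", 2012, 20.0),
    ("NY", "A", 2012, 500.0),
]

# The self-join formulation (ORDER BY columns qualified for SQLite).
SELF_JOIN = """
select count(*) as rk, s2.state as state, s2.product as product,
       avg(s2.amt_paid), sum(s1.amt_paid)
from sales s1 join sales s2
  on (s1.product = s2.product and s1.state = s2.state)
where s1.year <= s2.year
group by s2.state, s2.product, s2.year
order by s2.state, s2.product, rk
"""

# The OVER formulation, using a named window.
WINDOWED = """
select rank() over state_and_product, state, product, amt_paid,
       sum(amt_paid) over state_and_product
from sales
window state_and_product as (partition by state, product order by year)
order by state, product, year
"""


def run(query):
    """Load ROWS into an in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table sales "
        "(state text, product text, year integer, amt_paid real)"
    )
    conn.executemany("insert into sales values (?, ?, ?, ?)", ROWS)
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run(WINDOWED):
        print(row)
    print("same result:", run(SELF_JOIN) == run(WINDOWED))
```

The self-join compares every row with every earlier row of its group, so its work grows quadratically with group size, while the window version needs a single ordered pass per partition.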

12 Improved Analytics: OVER
  partition by  order by
  AL            2012      1000.00
  CA            2010      2000.00
  CA            2011      2000.00
  CA            2012      4000.00
  CA            2013      1000.00
  NY            2012      500.
- OVER clause: PARTITION BY, ORDER BY, ROWS BETWEEN/FOLLOWING/PRECEDING
- Works with current aggregate functions
- New aggregates/window functions: RANK, ROW_NUMBER, LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTILE, DENSE_RANK, CUME_DIST, PERCENT_RANK, PERCENTILE_CONT, PERCENTILE_DISC
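A small sketch of how PARTITION BY, ORDER BY and a ROWS frame interact, again using SQLite (3.25+) via `sqlite3` as a stand-in for Hive. The rows are the ones on the slide, with the truncated NY amount assumed to be 500.00. The table and column names (`t`, `state`, `year`, `amt`) are hypothetical.

```python
import sqlite3

# Rows from the slide; the NY amount is truncated there and assumed here.
ROWS = [
    ("AL", 2012, 1000.0),
    ("CA", 2010, 2000.0),
    ("CA", 2011, 2000.0),
    ("CA", 2012, 4000.0),
    ("CA", 2013, 1000.0),
    ("NY", 2012, 500.0),
]

# lag() looks one row back inside the partition; the ROWS frame limits the
# sum to the previous row plus the current row.
QUERY = """
select state, year, amt,
       lag(amt) over w as prev_amt,
       sum(amt) over (partition by state order by year
                      rows between 1 preceding and current row) as moving_sum
from t
window w as (partition by state order by year)
order by state, year
"""


def run_frame_demo():
    """Return (state, year, amt, prev_amt, moving_sum) for every row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (state text, year integer, amt real)")
    conn.executemany("insert into t values (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_frame_demo():
        print(row)
```

The first row of each partition has no predecessor, so `prev_amt` is NULL (`None` in Python) and the two-row sum covers only the row itself.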

13 Additional Improved Analytics
- Sub-queries in WHERE
  - Non-correlated only (no values from outer query)
  - [NOT] IN supported
  - Fit in memory as hash table when feasible
- Additional standard SQL data types
  - datetime
  - char() and varchar()
  - add precision and scale to decimal and float
  - aliases for standard SQL types (BLOB = binary, CLOB = string, integer = int, real/number = decimal)
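Because a non-correlated sub-query does not depend on the outer row, it can be evaluated once and its result held in a hash table that each outer row probes. The sketch below illustrates that idea only; the function name `filter_in` and the sample data are hypothetical, and SQL's NULL semantics for NOT IN are deliberately ignored.

```python
def filter_in(outer_rows, column, subquery_values, negate=False):
    """Emulate `WHERE column [NOT] IN (subquery)` for a non-correlated
    sub-query whose result fits in memory.

    The sub-query result is materialised once as a set (a hash table),
    so each outer row costs one constant-time probe.
    """
    lookup = set(subquery_values)
    return [row for row in outer_rows if (row[column] in lookup) != negate]


# Hypothetical data: sales rows, and the result of a sub-query such as
# `select state from regions where active`.
SALES = [
    {"state": "CA", "amt_paid": 100.0},
    {"state": "NY", "amt_paid": 50.0},
    {"state": "AL", "amt_paid": 75.0},
]
ACTIVE_STATES = ["CA", "AL", "CA"]

if __name__ == "__main__":
    print(filter_in(SALES, "state", ACTIVE_STATES))
    print(filter_in(SALES, "state", ACTIVE_STATES, negate=True))
```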


15 Automatic Join Conversion
- Decision points: Bucketed? Sorted? Small enough?
- Outcomes: Sort Merge Bucket Join, Map Join, Shuffle Join
- When enabled, Hive will automatically pick the join implementation.
- Query hints no longer needed.
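The decision diagram on this slide did not survive extraction, so the following is only one plausible reading of it: prefer a sort-merge-bucket join when both inputs are bucketed and sorted on the join key, otherwise use a map join if the smaller table is small enough, otherwise fall back to a shuffle join. The function name, parameters, and threshold are hypothetical and are not Hive's actual planner logic.

```python
def choose_join(bucketed, sorted_on_key, small_table_bytes, map_join_limit_bytes):
    """Pick a join strategy from table properties (assumed decision order)."""
    if bucketed and sorted_on_key:
        # Matching sorted buckets can be merged without a shuffle.
        return "sort merge bucket join"
    if small_table_bytes <= map_join_limit_bytes:
        # The small side fits in memory as a hash table on every mapper.
        return "map join"
    # General case: repartition both sides by join key.
    return "shuffle join"


if __name__ == "__main__":
    limit = 10 * 1024 * 1024  # hypothetical 10 MB threshold
    print(choose_join(True, True, 5 * 1024**3, limit))
    print(choose_join(False, False, 2 * 1024**2, limit))
    print(choose_join(True, False, 5 * 1024**3, limit))
```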

16 Dimensionally Structured Data
- Extremely common pattern in EDW
- Large fact tables and small dimension tables
- Dimension tables often fit in RAM
- Oftentimes called Star Schema

17 Star Schema Join
Derived from TPC-DS Query 27:
  SELECT col5, avg(col6)
  FROM fact_table
    join dim1 on (fact_table.col1 = dim1.col1)
    join dim2 on (fact_table.col2 = dim2.col1)
    join dim3 on (fact_table.col3 = dim3.col1)
    join dim4 on (fact_table.col4 = dim4.col1)
  GROUP BY col5
  ORDER BY col5
  LIMIT 100;
Dramatic speedup on Hive 0.11.
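When the dimension tables fit in RAM, a star-schema join can be done in a single pass over the fact table by probing one in-memory hash table per dimension. The sketch below shows that map-join idea with two dimensions instead of four. The function name, the row layout, and the assumption that `col5` lives in `dim1` and `col6` in the fact table are all hypothetical; the slide does not say where those columns come from.

```python
from collections import defaultdict


def map_join_avg(fact_rows, dim_tables, group_dim, group_col, value_col):
    """Inner-join fact rows to every dimension via hash lookups, then
    compute avg(value_col) grouped by a dimension attribute.

    dim_tables maps a fact foreign-key column to {dimension key: row}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in fact_rows:
        matched = {}
        for fk, table in dim_tables.items():
            dim_row = table.get(row[fk])
            if dim_row is None:
                break  # inner join: drop fact rows with no match
            matched[fk] = dim_row
        else:
            group = matched[group_dim][group_col]
            sums[group] += row[value_col]
            counts[group] += 1
    return sorted((g, sums[g] / counts[g]) for g in sums)


# Hypothetical data.
DIM1 = {1: {"col5": "x"}, 2: {"col5": "y"}}
DIM2 = {10: {}, 20: {}}
FACT = [
    {"col1": 1, "col2": 10, "col6": 4.0},
    {"col1": 1, "col2": 20, "col6": 6.0},
    {"col1": 2, "col2": 10, "col6": 10.0},
    {"col1": 3, "col2": 10, "col6": 99.0},  # no match in DIM1
    {"col1": 2, "col2": 30, "col6": 1.0},   # no match in DIM2
]

if __name__ == "__main__":
    dims = {"col1": DIM1, "col2": DIM2}
    print(map_join_avg(FACT, dims, "col1", "col5", "col6"))
```

No fact row is ever shuffled or written out between joins, which is the property that makes this pattern attractive when the dimensions are small.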

18 Star Schema Joins: BEFORE

19 Star Schema Joins: AFTER

20 Star Schema Join Performance: 35X improvement (more to come)

21 Large Tables Join: BEFORE
[Figure: stage timeline with axes Start Time and Stage, showing Stage-2 and Stage-3; values not recoverable.]

22 Hive Sort Merge Bucket Join
Bucketing allows Hive to physically co-locate rows within files (sorted or unsorted):
  CREATE EXTERNAL TABLE IF NOT EXISTS test_table (
    Id INT, name String
  )
  PARTITIONED BY (dt STRING, hour STRING)
  CLUSTERED BY (country, continent) SORTED BY (country, continent) INTO n BUCKETS
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
  LOCATION '/home/test_dir';
- Does a join on two large tables share the same sorted bucket key?
- If so: very efficient joins (minimize shuffles)
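The reason shared, sorted buckets help can be sketched as follows: if both tables are split into the same number of buckets by the same key, bucket i of one table only needs to be merged with bucket i of the other, and a merge of two sorted inputs needs no hash table and no shuffle. The function names, the modulo bucketing of integer keys, and the sample rows are assumptions for illustration, not Hive's implementation.

```python
def merge_join(left, right):
    """Inner-join two lists of (key, value) that are sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side, emit their product.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            for _, lval in left[i:i_end]:
                for _, rval in right[j:j_end]:
                    out.append((lk, lval, rval))
            i, j = i_end, j_end
    return out


def bucketize(rows, n_buckets):
    """Split rows into n sorted buckets (assumes integer keys)."""
    buckets = [[] for _ in range(n_buckets)]
    for row in rows:
        buckets[row[0] % n_buckets].append(row)
    return [sorted(b, key=lambda r: r[0]) for b in buckets]


def smb_join(left, right, n_buckets=2):
    """Join bucket i of `left` only against bucket i of `right`."""
    out = []
    for lb, rb in zip(bucketize(left, n_buckets), bucketize(right, n_buckets)):
        out.extend(merge_join(lb, rb))
    return out


if __name__ == "__main__":
    a = [(3, "c"), (1, "a"), (2, "b"), (1, "a2")]
    b = [(1, "x"), (3, "z"), (4, "w"), (1, "y")]
    print(sorted(smb_join(a, b)))
```

In Hive the bucketing and sorting are done once at load time, so at query time only the merge step remains.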

23 Large Tables Join: AFTER
[Figure: stage timeline; values not recoverable.]

24 Sort-Merge-Bucket Join Performance: 45X improvement (more to come)

25 Summary: Early Benchmarking Results
TPC-DS sample queries:
- Query One (left): Star Schema Join, 35X improvement
- Query Two (right): join of two tables too large to fit in memory, 45X speedup
MUCH more to come as we make our way to 100X and beyond!


27 ORCFile: Even Faster
- Query-optimized: splittable, columnar storage file
- Efficient reads: data is broken into large stripes for efficient reads
- Fast filtering: built-in index, min/max, and metadata for fast filtering of blocks; bloom filters if desired
- Efficient compression: decompose complex row types into primitives; run-length encoding => massive compression and efficient comparisons for filtering
- Pre-computation: built-in aggregates per block (min, max, count, sum)
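The fast-filtering and pre-computation points can be illustrated with a toy model: each stripe carries min, max, count and sum for a column, so a range predicate can skip stripes whose range cannot match, and some aggregates can be answered from the statistics alone. The `Stripe` class and function names below are hypothetical and do not reflect the ORC file layout or API.

```python
class Stripe:
    """A block of column values plus its pre-computed statistics."""

    def __init__(self, values):
        self.values = list(values)
        self.min = min(self.values)
        self.max = max(self.values)
        self.count = len(self.values)
        self.sum = sum(self.values)


def scan_between(stripes, lo, hi):
    """Return (matching values, number of stripes actually read) for the
    predicate lo <= value <= hi, skipping stripes via min/max."""
    matches = []
    stripes_read = 0
    for stripe in stripes:
        if stripe.max < lo or stripe.min > hi:
            continue  # statistics prove no row in this stripe can match
        stripes_read += 1
        matches.extend(v for v in stripe.values if lo <= v <= hi)
    return matches, stripes_read


def total_sum(stripes):
    """sum(column) answered from per-stripe aggregates, reading no values."""
    return sum(stripe.sum for stripe in stripes)


if __name__ == "__main__":
    stripes = [Stripe(range(1, 5)), Stripe(range(10, 14)), Stripe(range(20, 24))]
    print(scan_between(stripes, 11, 12))
    print(total_sum(stripes))
```

Skipping works best when the data is at least partly clustered on the filtered column, since overlapping min/max ranges cannot be pruned.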

28 ORCFile: High Compression
[Figure: compression comparison; data set from TPC-DS.]


30 Longer Term Hive Performance
- Rewrite all operations to operate on blocks of 1K+ records, rather than one record at a time
  - Block is an array of Java scalars, not Objects (eliminating Objects compounds GC gains over time)
  - Avoids many function calls and CPU pipeline stalls
  - Sized to fit in L1 cache, avoiding cache misses
- Cost-based optimizer: generate better DAGs based on properties of the data being queried: table size, statistics, histograms, etc.
- Buffer/cache data hotspots
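A structural sketch of the difference between row-at-a-time and block-at-a-time execution, for a query shaped like `select sum(amt) from t where amt > threshold`. It uses Python's `array` module as a stand-in for arrays of Java scalars; the function names and the batch size of 1024 are assumptions. Python will not reproduce the cache and pipeline benefits, so this only shows the shape of the computation.

```python
from array import array

BATCH_SIZE = 1024  # assumed block size ("1K+ records")


def filter_sum_row_at_a_time(rows, threshold):
    """One object per row, one predicate evaluation per row object."""
    total = 0
    for row in rows:
        if row["amt"] > threshold:
            total += row["amt"]
    return total


def filter_sum_vectorized(amounts, threshold):
    """Process a primitive column in fixed-size batches.

    Each batch is filtered into a selection vector (indices of surviving
    rows), and the aggregate then runs only over the selected positions.
    """
    total = 0
    for start in range(0, len(amounts), BATCH_SIZE):
        batch = amounts[start:start + BATCH_SIZE]
        selected = [i for i in range(len(batch)) if batch[i] > threshold]
        total += sum(batch[i] for i in selected)
    return total


if __name__ == "__main__":
    values = [i % 100 for i in range(5000)]
    rows = [{"amt": v} for v in values]      # row-oriented objects
    column = array("q", values)              # one primitive column
    print(filter_sum_row_at_a_time(rows, 50))
    print(filter_sum_vectorized(column, 50))
```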


32 Tez: High Throughput and Low Latency
- Tez generalizes Map-Reduce: simplified execution plans process data more efficiently
- Always-on Tez service: low latency processing for all Hadoop data processing

33 Tez: High Throughput and Low Latency
[Figure: YARN architecture with Clients, Resource Manager, Node Managers, App Masters and Containers. Tez runs in a YARN container. Arrows: MapReduce Status, Job Submission, Node Status, Resource Request.]
Accelerate high throughput AND low latency processing.

34 Tez: Core Idea
- Task with pluggable Input, Processor & Output (Input -> Processor -> Output)
- YARN ApplicationMaster runs a DAG of Tez Tasks
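A toy model of the core idea, not the Tez API: each task has a pluggable input step, processor, and output step, and a small runner (standing in for the ApplicationMaster) executes the tasks in dependency order, handing results along in memory. The class `Task`, the function `run_dag`, and the sample tables are all hypothetical. The example DAG evaluates a join followed by a group-by count as one graph.

```python
from collections import Counter


class Task:
    """A unit of work with pluggable input, processor and output."""

    def __init__(self, name, processor, upstream=(),
                 read_input=None, write_output=None):
        self.name = name
        self.processor = processor
        self.upstream = tuple(upstream)
        # Defaults: pass upstream results straight through, in memory.
        self.read_input = read_input or (lambda *results: results)
        self.write_output = write_output or (lambda result: result)


def run_dag(sink):
    """Run every task that `sink` depends on exactly once, then `sink`."""
    done = {}

    def run(task):
        if task.name not in done:
            upstream_results = [run(t) for t in task.upstream]
            inputs = task.read_input(*upstream_results)
            done[task.name] = task.write_output(task.processor(*inputs))
        return done[task.name]

    return run(sink)


# Hypothetical tables for: select a.state, count(*) from a join b
# on (a.id = b.id) group by a.state
A = [{"id": 1, "state": "CA"}, {"id": 2, "state": "CA"}, {"id": 3, "state": "NY"}]
B = [{"id": 1}, {"id": 1}, {"id": 3}, {"id": 4}]


def hash_join(a_rows, b_rows):
    by_id = {row["id"]: row for row in a_rows}
    return [by_id[row["id"]] for row in b_rows if row["id"] in by_id]


def count_by_state(rows):
    return dict(Counter(row["state"] for row in rows))


scan_a = Task("scan_a", lambda: A)
scan_b = Task("scan_b", lambda: B)
join = Task("join", hash_join, upstream=(scan_a, scan_b))
group_by = Task("group_by", count_by_state, upstream=(join,))

if __name__ == "__main__":
    print(run_dag(group_by))
```

Swapping `write_output` for a function that persists to durable storage would model a job boundary; leaving it in memory models the pipelined case.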

35 Tez: Hive Performance
- Low-level data-processing execution engine on YARN
- Base for MapReduce, Hive, Pig, Cascading, etc.
- Re-usable data processing primitives (ex: sort, merge, intermediate data management)
- Hive SQL can be expressed as a single job
  - Jobs are no longer interrupted (efficient pipeline)
  - Avoid writing intermediate output to HDFS when performance outweighs job re-start (speed and network/disk usage savings)
  - Break MR contract to turn MRMRMR into MRRR (flexible DAG)
- Removes task and job overhead (10-30s savings is huge for a 2s query!)

36 Pig/Hive optimized on Tez
  SELECT a.state, COUNT(*)
  FROM a JOIN b ON (a.id = b.id)
  GROUP BY a.state
[Figure: the MapReduce plan has an I/O synchronization barrier; the Tez plan uses I/O pipelining.]

37 Pig/Hive optimized on Tez
  SELECT a.state, COUNT(*), AVG(c.price)
  FROM a JOIN b ON (a.id = b.id)
       JOIN c ON (a.itemid = c.itemid)
  GROUP BY a.state
[Figure: the MapReduce plan runs Job 1, Job 2 and Job 3, separated by I/O synchronization barriers; the Tez plan runs a single job.]

38 Innovation via Community
- Performance Optimizations: 100X+ faster time to insight
- Deeper Analytical Capabilities
