Parallelism Strategies In The DB2 Optimizer
1 Session: A05 Parallelism Strategies In The DB2 Optimizer
Calisto Zuzarte, IBM Toronto Lab
May 20, :15 a.m. 10:15 a.m.
Platform: DB2 on Linux, Unix and Windows
The Database Partitioning Feature (DPF), or inter-partition environment, and the intra-partition environment are the two main parallelism environments in DB2. This presentation describes how the DB2 optimizer works with these features to process queries in parallel. The goal is to help you understand how to exploit them to get good query performance.
2 Agenda
- Objective
- Inter-Partition (DPF) Parallelism
- Intra-Partition (SMP) Parallelism
- Parallelism in the WebSphere Information Integrator
3 Why Query Performance?
- Databases are getting larger
- Systems are getting complex
- Queries are getting complex
- Users are getting impatient

SELECT prrfnbr, prnbr, pdsdesc,
       COALESCE((SELECT ccpr.cppub
                 FROM contract, cgprrel_1 AS ccpr
                 WHERE cntrfnbr = 1
                   AND cntmenbr = 1
                   AND ccpr.cpmenbr = cntmenbr
                   AND ccpr.cpcgnbr = mcpr.cpcgnbr
                   AND ccpr.cpprnbr = mcpr.cpprnbr
                   AND ccpr.cpprt = mcpr.cpprt
                   AND ccpr.cpcatnbr = cntcatnbr), 0) AS cppub
FROM cgprrel AS mcpr, product, proddesc
WHERE mcpr.cpmenbr = 1
  AND mcpr.cpcgnbr =
  AND mcpr.cpprt = '826'
  AND mcpr.cppub = 1
  AND mcpr.cpcatnbr = 1
  AND prrfnbr = mcpr.cpprnbr
  AND prmenbr = mcpr.cpmenbr
  AND pdprnbr = prrfnbr
  AND pdlang = 'en_gb'
  AND pdmenbr = prmenbr
ORDER BY mcpr.cpseqnbr
4 Query Example
Query the database for total sales to customers over the age of 20, by product and store:

SELECT C.NAME, P.NAME, ST.ADDRESS, SUM(sales_amount)
FROM CUST C, PROD P, SALES S, STORE ST
WHERE C.AGE > 20
  AND C.CUST_ID = S.CUST_ID
  AND ST.STORE_ID = S.STORE_ID
  AND P.PROD_ID = S.PROD_ID
GROUP BY C.NAME, P.NAME, ST.ADDRESS
5 How Can You Help DB2 Get Good Query Performance?

SELECT C.NAME, P.NAME, ST.ADDRESS, SUM(sales_amount)
FROM CUST C, PROD P, SALES S, STORE ST
WHERE C.AGE > 20
  AND C.CUST_ID = S.CUST_ID
  AND ST.STORE_ID = S.STORE_ID
  AND P.PROD_ID = S.PROD_ID
GROUP BY C.NAME, P.NAME, ST.ADDRESS

- Indexes on the various ID columns or C.AGE
- Define a Materialized Query Table
- Collect histogram distribution statistics on C.AGE
- Increase memory for the SORT or BUFFERPOOL...
- What about parallelism?
6 Objective: Query Performance By Improving Parallelism
- Many ways to help improve performance: defining indexes, collecting statistics, adding memory
- Objective of this presentation: provide recommendations so you can improve query performance by exploiting parallelism features in DB2
7 Terminology / Acronym Quiz
True or False:
- DPF = Decimal Floating Point
- SMP = Symmetric multiprocessor
- Parallel processing: talking on the phone, instant messaging, watching TV and doing homework, all at the same time
- DPF improves performance by paralyzing SQL queries
- WebSphere II is the version after WebSphere, like Viper II is the version after Viper
8 Agenda
- Objective
- Inter-Partition (DPF) Parallelism
- Intra-Partition (SMP) Parallelism
- Parallelism in the WebSphere Information Integrator
9 Join Strategies Without DPF
[Figure: alternative plans for joining Cust, Sales and Prod, varying the join order and the join type]
- Join orders
- Join types: Hash Joins, Merge Joins, Nested Loop Joins
10 Database Partitioning Feature
[Figure: three database partitions, each running SORT over JOIN on its own PROD, SALES and STORE data; CUST appears on a single partition]
11 Join Strategies With DPF
Join strategies and partitioning concepts:
- Collocated joins
- Directed joins
- Broadcast joins
- Repartitioned joins
12 Collocated Joins
[Figure: three partitions, each joining its local PROD and SALES rows]
- Join predicate: P.PROD_ID = S.PROD_ID
- Partitioning key: PROD: P.PROD_ID; SALES: S.PROD_ID
- No transfer of data between partitions
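The collocation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three partitions, the modulo function standing in for DB2's hashing, and the sample rows are all hypothetical. It shows that when both tables are distributed on the join column, joining partition by partition gives the same result as a serial join, with no row leaving its partition.

```python
# Hypothetical setup: 3 partitions; modulo stands in for DB2's real hash function.
NUM_PARTITIONS = 3


def partition_of(key: int) -> int:
    """Map a partitioning-key value to a partition number (stand-in hash)."""
    return key % NUM_PARTITIONS


def distribute(rows, key_index):
    """Place each row on the partition chosen by its partitioning-key column."""
    partitions = [[] for _ in range(NUM_PARTITIONS)]
    for row in rows:
        partitions[partition_of(row[key_index])].append(row)
    return partitions


def local_join(prod_rows, sales_rows):
    """Join PROD (prod_id, name) with SALES (prod_id, amount) on prod_id."""
    names = {prod_id: name for prod_id, name in prod_rows}
    return [(names[prod_id], amount)
            for prod_id, amount in sales_rows if prod_id in names]


# Made-up sample data.
prod = [(1, "pen"), (2, "ink"), (3, "pad"), (4, "cap")]
sales = [(1, 10), (2, 20), (2, 5), (4, 7), (3, 1)]

# Both tables are partitioned on PROD_ID, the join column.
prod_parts = distribute(prod, 0)
sales_parts = distribute(sales, 0)

# Each partition joins only its own rows; nothing is sent between partitions.
collocated_result = sorted(
    row
    for p in range(NUM_PARTITIONS)
    for row in local_join(prod_parts[p], sales_parts[p])
)
serial_result = sorted(local_join(prod, sales))
```

Comparing `collocated_result` with `serial_result` confirms that the partition-local joins together produce the full join result.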
13 Directed Joins
[Figure: three partitions; on each, SALES rows pass through a directed table queue (DTQ) before the JOIN with local STORE rows]
- Join predicate: ST.STORE_ID = S.STORE_ID
- Partitioning key: STORE: ST.STORE_ID; SALES: S.PROD_ID
- Each SALES row is directed to the appropriate partition
14 Broadcast Joins
[Figure: CUST rows pass through a broadcast table queue (BTQ) to the JOIN with local SALES rows on each of three partitions]
- Join predicate: C.CUST_ID = S.CUST_ID
- Partitioning key: CUST: single node; SALES: S.PROD_ID
- All CUST rows are broadcast to the other partitions
15 Repartitioned Joins
[Figure: three partitions; on each, both EMP and SALES rows pass through DTQs before the JOIN]
- Join predicate: EMP.STORE_ID = S.STORE_ID
- Partitioning key: EMPLOYEE: EMP.EMP_ID; SALES: S.PROD_ID
- Both SALES and EMPLOYEE rows are repartitioned
16 Join Planning Strategies in DPF
- If a collocated join is possible, use it
- Otherwise, if a directed join is possible, use it
- Otherwise, choose the cheaper plan between a broadcast join and a repartitioned join
- Many more plans are considered with table queues (TQs)
[Figure: hash joins (H) over S, P and C with TQs between operators]
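The decision order on this slide can be written as a small function. This is a simplified sketch, not the optimizer's actual logic: the function name, the representation of partitioning keys as tuples of join-column names, and the two cost inputs are assumptions made for illustration. In particular, it treats a join as collocated when both tables have the same partitioning key and that key is covered by the join columns.

```python
def choose_join_strategy(left_key, right_key, join_cols,
                         broadcast_cost, repartition_cost):
    """Pick a DPF join strategy in the order given on the slide.

    left_key, right_key: tuples of partitioning-key column names, written in
        terms of the join columns they match (an empty tuple means the table
        is not hash partitioned, e.g. it lives on a single node).
    join_cols: set of column names equated by the join predicates.
    broadcast_cost, repartition_cost: hypothetical cost estimates.
    """
    left_ok = bool(left_key) and set(left_key) <= join_cols
    right_ok = bool(right_key) and set(right_key) <= join_cols

    # Both sides already partitioned identically on join columns: no movement.
    if left_ok and right_ok and left_key == right_key:
        return "collocated"
    # One side is partitioned on the join columns: send the other side's rows to it.
    if left_ok or right_ok:
        return "directed"
    # Neither side is usable as-is: compare the two remaining options by cost.
    if broadcast_cost <= repartition_cost:
        return "broadcast"
    return "repartitioned"


# The four examples from the preceding slides (costs are made up).
prod_sales = choose_join_strategy(("PROD_ID",), ("PROD_ID",), {"PROD_ID"}, 1, 1)
store_sales = choose_join_strategy(("STORE_ID",), ("PROD_ID",), {"STORE_ID"}, 1, 1)
cust_sales = choose_join_strategy((), ("PROD_ID",), {"CUST_ID"}, 10, 500)
emp_sales = choose_join_strategy(("EMP_ID",), ("PROD_ID",), {"STORE_ID"}, 900, 500)
```

With these inputs the four calls reproduce the collocated, directed, broadcast and repartitioned cases shown earlier; the last two depend entirely on the assumed costs.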
17 Replicated Tables
[Figure: each of three partitions joins SALES with a local copy of CUST, in place of a BTQ from CUST]
- Consider dimension tables in a star schema
- Avoids repeated data movement
- Faster query performance
18 Repartitioned Tables
[Figure: each of three partitions holds EMP partitioned on ID, a copy of EMP partitioned on ST_ID, and SALES; the diagram also shows the DTQs of the repartitioned join]
- Consider hot columns of a large table partitioned on a different column
- Avoids repeated data movement
- Faster query performance
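19 Parallel Sorts
- Local SORT and merge at the coordinator: each partition sorts its own SALES rows and a merging table queue (MDTQ) combines the sorted streams
- SORT after sending to the coordinator: SALES rows flow through a DTQ and a single SORT runs at the coordinator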
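The two sort strategies can be compared with a short Python sketch. The three lists of values are made-up stand-ins for SALES rows on three partitions, and `heapq.merge` plays the role of the merging table queue; none of this is DB2 code.

```python
import heapq

# Hypothetical sort-key values held by three partitions.
partitions = [[42, 7, 19], [3, 88, 21], [64, 1, 30]]

# Strategy 1: each partition sorts locally (in parallel on a real system),
# and the coordinator only merges the already sorted streams.
locally_sorted = [sorted(rows) for rows in partitions]
merged_at_coordinator = list(heapq.merge(*locally_sorted))

# Strategy 2: rows are shipped unsorted and the coordinator sorts everything.
sorted_at_coordinator = sorted(row for rows in partitions for row in rows)
```

Both strategies return the same ordered result; the difference is where the sorting work happens, which is why the first one spreads the cost across partitions.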
20 Parallel Aggregation
- Partial aggregation during SORT
- Intermediate aggregation when the grouping key covers a subset of the partitioning key
- Final aggregation with the global grouping key
- Aggregation strategy is cost based
- Consider column group statistics on GROUP BY columns
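A minimal sketch of partial aggregation followed by final aggregation, using a SUM grouped by one key. The function names and sample rows are hypothetical, and the sketch skips the intermediate step and the cost-based choice mentioned above.

```python
from collections import Counter


def partial_aggregate(rows):
    """Sum amounts per grouping key using only one partition's rows."""
    totals = Counter()
    for key, amount in rows:
        totals[key] += amount
    return totals


def final_aggregate(partials):
    """Combine per-partition partial sums into the global result."""
    totals = Counter()
    for partial in partials:
        totals.update(partial)  # Counter.update adds values key by key
    return dict(totals)


# Made-up (grouping key, amount) rows on three partitions.
partitions = [
    [("CA", 10), ("US", 5), ("CA", 1)],
    [("US", 2), ("UK", 4)],
    [("CA", 3), ("UK", 6)],
]

partials = [partial_aggregate(rows) for rows in partitions]
result = final_aggregate(partials)

# Rows that must travel to the coordinator with and without partial aggregation.
rows_sent_with_partial = sum(len(p) for p in partials)
rows_sent_without = sum(len(rows) for rows in partitions)
```

Partial aggregation pays off when each partition holds many rows per group, because only one row per group and partition has to be sent on.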
21 Unexpected Data Movement?
- Is data from your large fact table moving to a single-partition dimension table?
- Are there local predicates on the fact table? For example, F.C1 = 5 AND F.C2 = 'ABC'
- DB2 may be underestimating the number of qualifying rows
- Consider replication or appropriate statistics, for example column group statistics on the fact table columns used in equality predicates (C1, C2)
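The underestimation can be illustrated with a toy calculation. This is a simplified, uniformity-based estimator written for illustration, not DB2's actual formula; the table contents are made up so that C1 and C2 are perfectly correlated.

```python
# Hypothetical fact table: 1000 rows where C2 is fully determined by C1.
rows = [(i % 10, f"V{i % 10}") for i in range(1000)]
n = len(rows)

distinct_c1 = len({c1 for c1, _ in rows})
distinct_c2 = len({c2 for _, c2 in rows})
distinct_pairs = len(set(rows))  # what a column group statistic on (C1, C2) captures

# Without column group statistics: the predicates are assumed independent,
# so the selectivities 1/distinct_c1 and 1/distinct_c2 are multiplied.
independent_estimate = n / (distinct_c1 * distinct_c2)

# With a column group statistic: one selectivity for the pair of columns.
column_group_estimate = n / distinct_pairs

# Rows that really satisfy C1 = 5 AND C2 = 'V5'.
actual = sum(1 for c1, c2 in rows if c1 == 5 and c2 == "V5")
```

Here the independence assumption predicts a tenth of the rows that actually qualify, which is the kind of underestimate that can make moving the fact table look cheap.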
22 Fast Communication Manager (FCM)
- Communication is done through FCM
- Significantly more efficient in DB2 9: minimizes context switching and exploits parallelism better
Tuning considerations:
- Pay attention to FCM_NUM_BUFFERS
- Default of 4096 is small for typical warehouse environments; recommend around
- "AUTOMATIC" allows some increase (~25%) and drops back if not used after some time
23 Recommendations - DPF
- Consider collocation for the largest and most frequently joined tables
- Consider replicated tables
- Consider repartitioned tables
- Consider appropriate statistics, for example column group statistics
- Check FCM_NUM_BUFFERS
24 Agenda
- Objective
- Inter-Partition (DPF) Parallelism
- Intra-Partition (SMP) Parallelism
- Parallelism in the WebSphere Information Integrator
25 Intra-Partition Parallelism (SMP)
- Exploits multiple processors on symmetric multiprocessor (SMP) architectures
- Could benefit a single-processor system if it is I/O bound
- Combination of data parallelism and functional parallelism
26 Data Parallelism
- User not required to partition data
- Data dynamically assigned to query tasks
- Assigns a range of pages, rows or keys
- Provides dynamic load balancing
- Supports table and index scans
- Straw partitioning architecture
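One way to picture dynamic assignment of page ranges is a shared work queue from which each subagent draws its next range when it finishes the previous one. The sketch below is an assumption about the general idea only: the page count, range size, number of subagents and the use of Python threads are all hypothetical and say nothing about how DB2 implements it.

```python
import queue
import threading

TOTAL_PAGES = 100    # hypothetical table size in pages
RANGE_SIZE = 8       # hypothetical number of pages handed out per request
NUM_SUBAGENTS = 4

# The shared pool of page ranges still waiting to be scanned.
work = queue.Queue()
for start in range(0, TOTAL_PAGES, RANGE_SIZE):
    work.put((start, min(start + RANGE_SIZE, TOTAL_PAGES)))

# Pages scanned by each subagent; each thread writes only to its own list.
scanned = [[] for _ in range(NUM_SUBAGENTS)]


def subagent(agent_id):
    """Keep drawing the next page range until none are left."""
    while True:
        try:
            start, end = work.get_nowait()
        except queue.Empty:
            return
        scanned[agent_id].extend(range(start, end))


threads = [threading.Thread(target=subagent, args=(i,))
           for i in range(NUM_SUBAGENTS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_pages = sorted(page for pages in scanned for page in pages)
```

Because ranges are handed out on demand rather than fixed up front, a subagent that falls behind simply takes fewer ranges, which is the load-balancing effect the slide refers to. How many pages each subagent ends up with varies from run to run.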
27 Functional Parallelism
- Divides a query operation and assigns tasks to different agent processes
- Single coordinator agent and multiple subagents
- Ensures subagents are equally busy
28 SMP Design Considerations And Concepts
Optimizer design considerations:
- Goal: reduce engineering cost
- Implemented as an optimizer post pass
Run-time design considerations:
- Goal: limit specialized changes
- Local table queues (LTQ)
- DEGREE
29 Optimizer Considerations
- SMP is not deeply integrated in the optimizer
- Flow: SQL query -> main optimizer -> best serial or DPF access plan -> optimizer post pass (SMP) -> some operators parallelized
30 SMP Access Plan

SELECT p.name, p.prod_id, pa.attribute
FROM product p, prodatr pa
WHERE p.prod_id = pa.prod_id;

         RETURN (9)
             |
          LTQ (8)
             |
        MSJOIN (7)
        /        \
  TSCAN (3)    TSCAN (6)
      |            |
  SORT (2)     SORT (5)
      |            |
  TSCAN (1)    TSCAN (4)
      |            |
   PRODUCT      PRODATR

- LTQ (8): results returned via a shared-memory table queue to the coordinator agent
- MSJOIN (7): join processed in parallel by each sub-agent
- TSCAN (3), TSCAN (6): each sub-agent scans a sort partition
- SORT (2), SORT (5): hash-partitioned sorts on prod_id, one partition per sub-agent
- TSCAN (1), TSCAN (4): parallel table scans
31 SMP Parallelism Usage In DB2
- Query operations: sorts, joins, TEMP tables, aggregation
- INSERT, DELETE and UPDATE are not parallelized
- Utilities: index creation, data load, database recovery
32 Recommendations - SMP Parallelism
SMP parallelism is usually not recommended:
- In OLTP environments
- In concurrent multi-user environments with heavy CPU usage
Recommended:
- When CPUs are highly underutilized
- When DPF is not an option
33 Agenda
- Objective
- Inter-Partition (DPF) Parallelism
- Intra-Partition (SMP) Parallelism
- Parallelism in the WebSphere Information Integrator
34 SMP Parallelism and WebSphere II
- Parallelism with query parts on local data
- Nickname access is serialized
- Serialization point is the coordinator
[Figure: an SMP plan (RETURN over NLJN, with TQ over SCAN(T) on one side and SHIP over SCAN(NN) on the other) run by a coordinator process and subagent processes over local data on WS II and remote DBs, alongside a plan RETURN over NLJN with SHIP over SCAN(NN) and SCAN(T)]
35 DPF Federated Server
- Federated server separate from the DPF system: no parallel joins
- DPF with integrated federated server: joins may occur in parallel
[Figure: a coordinator (Coord) over partitioned local data with a separate Fed Server, compared with a combined Coord + Fed Server over partitioned data]
36 WebSphere II Trusted and Fenced Wrappers
- WebSphere II V8.1 and earlier had trusted wrapper technology
- WebSphere II V8.2 added fenced wrappers
37 DPF With Fenced Wrappers
Optimizer has two plan choices:
- Serial processing at the server: small amount of data from the partitioned table
- Parallel processing across appropriate partitions: large amount of data in the partitioned table
[Figure: Coord + Fed Server shown for each choice]
38 DPF Computational Partition Groups
- CPGs enable distribution of nickname data to partitions for parallel joins
- Using CPGs, access to the nickname is still serial but asynchronous
[Figure: Coord + Fed server, shown without and with CPGs]
39 Recommendations - Federation
- Make the DB2 partitioned server a federated server
- Use fenced wrappers: they enable parallelism, improve scalability and provide fault protection to the engine
- Use Computational Partition Groups to improve performance of large nickname-only queries and to reduce potential bottlenecks on the coordinator partition
- Load balance the coordinator partition: rotate the coordinator partition for federated applications, or assign additional resources to a dedicated coordinator partition
40 Summary
Objective of this presentation: provide recommendations so you can improve query performance by exploiting parallelism features in DB2.
- Get better query performance using DPF: consider appropriate collocation, replicated tables, repartitioned tables and appropriate statistics
- If DPF is not an option, consider SMP when you have long-running queries and CPUs are underutilized (not good for multi-user CPU-bound systems)
- Improve query performance in WebSphere II: use DPF on the federated server, fenced wrappers and Computational Partition Groups, and load balance the federation server
41 Appendix
42 DPF Example - TPCH Query

SELECT nation, o_year, SUM(amount) AS sum_profit
FROM (SELECT n_name AS nation,
             year(o_orderdate) AS o_year,
             l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity AS amount
      FROM part, supplier, lineitem, partsupp, orders, nation
      WHERE s_suppkey = l_suppkey
        AND ps_suppkey = l_suppkey
        AND ps_partkey = l_partkey
        AND p_partkey = l_partkey
        AND o_orderkey = l_orderkey
        AND s_nationkey = n_nationkey
        AND p_name LIKE '%coral%') AS profit
GROUP BY nation, o_year
ORDER BY nation, o_year DESC

PARTITIONING
- LINEITEM: L_ORDERKEY
- ORDERS: O_ORDERKEY
- PART: P_PARTKEY
- PARTSUPP: PS_PARTKEY, PS_SUPPKEY
- SUPPLIER: S_SUPPKEY
- NATION: R_REGIONKEY
43 Access Plan
RETURN (1)
  GRPBY (2), 225 rows
    MDTQ (3)
      GRPBY (4), 225 rows
        TBSCAN (5), 225 rows
          SORT (6), 225 rows
            HSJOIN (7)
              DTQ (8)
                HSJOIN (9)
                  TBSCAN (10) on TABLE: TPCD PARTSUPP
                  DTQ (11)
                    HSJOIN (12)
                      TBSCAN (13) on TABLE: TPCD ORDERS
                      HSJOIN (14)
                        TBSCAN (15) on TABLE: TPCD LINEITEM
                        BTQ (16)
                          TBSCAN (17) on TABLE: TPCD PART
              HSJOIN (18)
                TBSCAN (19) on TABLE: TPCD SUPPLIER
                BTQ (20)
                  FETCH (21) on TABLE: TPCD NATION
                    IXSCAN (22) on INDEX: TPCD N_NK
44 DPF Plan Analysis
[Figure: the access plan from the previous slide, RETURN (1) through IXSCAN (22)]
45 Small Table Broadcast
HSJOIN (18)
  TBSCAN (19) on TABLE: TPCD SUPPLIER
  BTQ (20)
    FETCH (21) on TABLE: TPCD NATION
      IXSCAN (22) on INDEX: TPCD N_NK (25 rows)
- Hash joins start with the build phase
- HSJOIN(18) is part of the build phase of HSJOIN(7)
- NATION is partitioned on REGIONKEY
- SUPPLIER is partitioned on SUPPKEY
- The join predicate is on NATIONKEY
- No collocated or directed join is possible
46 Largest Table Filtered Early
HSJOIN (12)
  TBSCAN (13) on TABLE: TPCD ORDERS
  HSJOIN (14)
    TBSCAN (15) on TABLE: TPCD LINEITEM
    BTQ (16)
      TBSCAN (17) on TABLE: TPCD PART
- PART has local predicates
- Notice the cardinality change after the broadcast
- LINEITEM is too huge to move around!
47 Biggest Join Is Collocated
HSJOIN (12)
  TBSCAN (13) on TABLE: TPCD ORDERS
  HSJOIN (14)
    TBSCAN (15) on TABLE: TPCD LINEITEM
    BTQ (16)
      TBSCAN (17) on TABLE: TPCD PART
- This is the most expensive join
- It is important to ensure that this join is collocated
48 Directed Join If No Collocation
HSJOIN (9)
  TBSCAN (10) on TABLE: TPCD PARTSUPP
  DTQ (11) over the join result of ORDERS, LINEITEM and PART
- PARTSUPP is partitioned on (PARTKEY, SUPPKEY), which are also the join columns, so a directed join is perfect
- Note: the join result of LINEITEM, ORDERS and PART is partitioned on ORDERKEY
49 Another Directed Join
HSJOIN (7)
  DTQ (8) over the join result of PARTSUPP, ORDERS, LINEITEM and PART
  HSJOIN (18) over SUPPLIER and NATION
- HSJOIN(7) is based on the SUPPKEY column, so a directed join is good here
- The join result of SUPPLIER and NATION is still partitioned on SUPPKEY
- The join result of PARTSUPP, ORDERS, LINEITEM and PART is partitioned on (PARTKEY, SUPPKEY)
50 GROUP BY And ORDER BY
GROUP BY nation, o_year
ORDER BY nation, o_year desc

RETURN (1)
  GRPBY (2), 225 rows: final aggregation at the coordinator node
    MDTQ (3): rows retain order when merged at the coordinator
      GRPBY (4), 225 rows: intermediate aggregation on each node
        TBSCAN (5), 225 rows
          SORT (6), 225 rows: rows sorted on each node
51 Parallel Lines?
52 Session A05 Parallelism Strategies In The DB2 Optimizer
Calisto Zuzarte, IBM Toronto Lab
More informationMultiple query optimization in middleware using query teamwork
SOFTWARE PRACTICE AND EXPERIENCE Softw. Pract. Exper. 2005; 35:361 391 Published online 21 December 2004 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/spe.640 Multiple query optimization
More informationPostgres-XC PostgreSQL Conference Michael PAQUIER Tokyo, 2012/02/24
Postgres-XC PostgreSQL Conference 2012 Michael PAQUIER Tokyo, 2012/02/24 Agenda Self-introduction Highlights of Postgres-XC Core architecture overview Performance High-availability Release status Copyright
More informationFighting Redundancy in SQL: the For-Loop Approach
Fighting Redundancy in SQL: the For-Loop Approach Antonio Badia and Dev Anand Computer Engineering and Computer Science department University of Louisville, Louisville KY 40292 July 8, 2004 1 Introduction
More informationToward a Progress Indicator for Database Queries
Toward a Progress Indicator for Database Queries Gang Luo Jeffrey F. Naughton Curt J. Ellmann Michael W. Watzke University of Wisconsin-Madison NCR Advance Development Lab {gangluo, naughton}@cs.wisc.edu
More informationAdvanced Query Optimization
Advanced Query Optimization Andreas Meister Otto-von-Guericke University Magdeburg Summer Term 2018 Why do we need query optimization? Andreas Meister Advanced Query Optimization Last Change: April 23,
More information! Parallel machines are becoming quite common and affordable. ! Databases are growing increasingly large
Chapter 20: Parallel Databases Introduction! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems!
More informationChapter 20: Parallel Databases
Chapter 20: Parallel Databases! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems 20.1 Introduction!
More informationChapter 20: Parallel Databases. Introduction
Chapter 20: Parallel Databases! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems 20.1 Introduction!
More informationmanagement systems Elena Baralis, Silvia Chiusano Politecnico di Torino Pag. 1 Distributed architectures Distributed Database Management Systems
atabase Management Systems istributed database istributed architectures atabase Management Systems istributed atabase Management Systems ata and computation are distributed over different machines ifferent
More informationSelf-Tuning Database Systems: The AutoAdmin Experience
Self-Tuning Database Systems: The AutoAdmin Experience Surajit Chaudhuri Data Management and Exploration Group Microsoft Research http://research.microsoft.com/users/surajitc surajitc@microsoft.com 5/10/2002
More informationParallel DBMS. Parallel Database Systems. PDBS vs Distributed DBS. Types of Parallelism. Goals and Metrics Speedup. Types of Parallelism
Parallel DBMS Parallel Database Systems CS5225 Parallel DB 1 Uniprocessor technology has reached its limit Difficult to build machines powerful enough to meet the CPU and I/O demands of DBMS serving large
More informationNew Requirements. Advanced Query Processing. Top-N/Bottom-N queries Interactive queries. Skyline queries, Fast initial response time!
Lecture 13 Advanced Query Processing CS5208 Advanced QP 1 New Requirements Top-N/Bottom-N queries Interactive queries Decision making queries Tolerant of errors approximate answers acceptable Control over
More informationComparison of Database Cloud Services
Comparison of Database Cloud Services Testing Overview ORACLE WHITE PAPER SEPTEMBER 2016 Table of Contents Table of Contents 1 Disclaimer 2 Preface 3 Introduction 4 Cloud OLTP Workload 5 Cloud Analytic
More informationMidterm Review. March 27, 2017
Midterm Review March 27, 2017 1 Overview Relational Algebra & Query Evaluation Relational Algebra Rewrites Index Design / Selection Physical Layouts 2 Relational Algebra & Query Evaluation 3 Relational
More informationSELECT Product.name, Purchase.store FROM Product JOIN Purchase ON Product.name = Purchase.prodName
Announcements Introduction to Data Management CSE 344 Lectures 5: More SQL aggregates Homework 2 has been released Web quiz 2 is also open Both due next week 1 2 Outline Outer joins (6.3.8, review) More
More informationA Examcollection.Premium.Exam.54q
A2090-544.Examcollection.Premium.Exam.54q Number: A2090-544 Passing Score: 800 Time Limit: 120 min File Version: 32.2 http://www.gratisexam.com/ Exam Code: A2090-544 Exam Name: Assessment: DB2 9.7 Advanced
More informationRequest Window: an Approach to Improve Throughput of RDBMS-based Data Integration System by Utilizing Data Sharing Across Concurrent Distributed Queries Rubao Lee, Minghong Zhou, Huaming Liao Research
More informationDB2 Performance Essentials
DB2 Performance Essentials Philip K. Gunning Certified Advanced DB2 Expert Consultant, Lecturer, Author DISCLAIMER This material references numerous hardware and software products by their trade names.
More informationOracle DB-Tuning Essentials
Infrastructure at your Service. Oracle DB-Tuning Essentials Agenda 1. The DB server and the tuning environment 2. Objective, Tuning versus Troubleshooting, Cost Based Optimizer 3. Object statistics 4.
More informationChapter 18: Parallel Databases
Chapter 18: Parallel Databases Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 18: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery
More informationChapter 18: Parallel Databases. Chapter 18: Parallel Databases. Parallelism in Databases. Introduction
Chapter 18: Parallel Databases Chapter 18: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery Parallelism Intraoperation Parallelism Interoperation Parallelism Design of
More informationJayant Haritsa. Database Systems Lab Indian Institute of Science Bangalore, India
Jayant Haritsa Database Systems Lab Indian Institute of Science Bangalore, India Query Execution Plans SQL, the standard database query interface, is a declarative language Specifies only what is wanted,
More informationTuning DBM and DB Configuration Parameters
DB2 for Linux, UNIX, Windows Tuning DBM and DB Configuration Parameters Philip K. Gunning DB2 Master Practitioner Gunning Technology Solutions, LLC 21 September 2005 DB2 is a registered trademark of IBM
More informationAdvanced Oracle SQL Tuning v3.0 by Tanel Poder
Advanced Oracle SQL Tuning v3.0 by Tanel Poder /seminar Training overview This training session is entirely about making Oracle SQL execution run faster and more efficiently, understanding the root causes
More informationDB2 9 for z/os Selected Query Performance Enhancements
Session: C13 DB2 9 for z/os Selected Query Performance Enhancements James Guo IBM Silicon Valley Lab May 10, 2007 10:40 a.m. 11:40 a.m. Platform: DB2 for z/os 1 Table of Content Cross Query Block Optimization
More informationData Manipulation (DML) and Data Definition (DDL)
Data Manipulation (DML) and Data Definition (DDL) 114 SQL-DML Inserting Tuples INSERT INTO REGION VALUES (6,'Antarctica','') INSERT INTO NATION (N_NATIONKEY, N_NAME, N_REGIONKEY) SELECT NATIONKEY, NAME,
More information<Insert Picture Here> Inside the Oracle Database 11g Optimizer Removing the black magic
Inside the Oracle Database 11g Optimizer Removing the black magic Hermann Bär Data Warehousing Product Management, Server Technologies Goals of this session We will Provide a common
More informationIBM CE243G - USING QUEUE REPLICATION
IBM CE243G - USING QUEUE REPLICATION Dauer: 4 Tage Nr.: 37483 Preis: 2.590,00 netto / 3.082,10 inkl. 19% MwSt. Durchführungsart: Präsenztraining Schulungsmethode: presentation, discussion, hands-on exercises,
More informationOracle on HP Storage
Oracle on HP Storage Jaime Blasco EMEA HP/Oracle CTC Cooperative Technology Center Boeblingen - November 2004 2003 Hewlett-Packard Development Company, L.P. The information contained herein is subject
More informationOverview of Implementing Relational Operators and Query Evaluation
Overview of Implementing Relational Operators and Query Evaluation Chapter 12 Motivation: Evaluating Queries The same query can be evaluated in different ways. The evaluation strategy (plan) can make orders
More informationData warehouse and Data Mining
Data warehouse and Data Mining Lecture No. 13 Teradata Architecture and its compoenets Naeem A. Mahoto Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and
More informationColumnstore and B+ tree. Are Hybrid Physical. Designs Important?
Columnstore and B+ tree Are Hybrid Physical Designs Important? 1 B+ tree 2 C O L B+ tree 3 B+ tree & Columnstore on same table = Hybrid design 4? C O L C O L B+ tree B+ tree ? C O L C O L B+ tree B+ tree
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs
More informationTPC-D: Benchmarking for Decision Support
TPC-D: Benchmarking for Decision Support Carrie Ballinger, NCR Parallel Systems I. Introduction: A Child of the Nineties In the 1990 world that the TPC-D development effort was born into, decision support
More informationOracle. Exam Questions 1Z Oracle Database 11g Release 2: SQL Tuning Exam. Version:Demo
Oracle Exam Questions 1Z0-117 Oracle Database 11g Release 2: SQL Tuning Exam Version:Demo 1.You ran a high load SQL statement that used an index through the SQL Tuning Advisor and accepted its recommendation
More informationScalable Access to SAS Data Billy Clifford, SAS Institute Inc., Austin, TX
Scalable Access to SAS Data Billy Clifford, SAS Institute Inc., Austin, TX ABSTRACT Symmetric multiprocessor (SMP) computers can increase performance by reducing the time required to analyze large volumes
More informationDistributed Database Management Systems. Data and computation are distributed over different machines Different levels of complexity
atabase Management Systems istributed database atabase Management Systems istributed atabase Management Systems B M G 1 istributed architectures ata and computation are distributed over different machines
More information