DB2 9 for z/OS Selected Query Performance Enhancements

Session: C13
DB2 9 for z/OS Selected Query Performance Enhancements
James Guo, IBM Silicon Valley Lab
May 10, 2007, 10:40 a.m. - 11:40 a.m.
Platform: DB2 for z/OS

Table of Contents
- Cross Query Block Optimization
- Generalize Sparse Index and In-Memory Data Caching
- More Statistics to Improve Access Path
- Query Parallelism Enhancements
- Summary

Query Performance Problems

Global Query Optimization addresses query performance problems caused when DB2 breaks a query into multiple parts and optimizes each of those parts independently. While each individual part may be optimized to run efficiently, the overall result may be inefficient when the parts are combined. For example, consider the following query:

    SELECT * FROM T1
    WHERE EXISTS (SELECT 1 FROM T2, T3
                  WHERE T2.C2 = T3.C2 AND T2.C1 = T1.C1);

DB2 breaks this query into two parts (the correlated subquery and the outer query) and optimizes each of them independently. The access path for the subquery does not take into account the different ways in which the table in the outer query may be accessed, and vice versa. DB2 may choose a table scan of T1, resulting in much random I/O when accessing T2, whereas a nonmatching index scan of T1 would avoid the random I/O on T2. In addition, DB2 does not consider reordering these two parts: the correlated subquery is always performed after accessing T1 to get the correlation value. If T1 is a large table and T2 is a small table, it may be much more efficient to access T2 first and then T1 (especially if there is no index on T2.C1 but there is an index on T1.C1).

In summary, Global Query Optimization allows DB2 to optimize a query as a whole rather than as independent parts. This is accomplished by allowing DB2 to:
1. Consider the effect of one queryblock on another
2. Consider reordering queryblocks

Functional Description

The purpose of this enhancement is to improve query performance by enhancing the DB2 Optimizer so that more efficient access paths are generated for queries that involve multiple parts. The changes are within the DB2 Optimizer and the DB2 Runtime components.

There is no new function per se. However, DB2 does document in some detail the way in which a query that involves multiple parts is performed. Because the way such a query is performed is no longer fixed by the way the query was coded, the EXPLAIN output is modified to make it easier to tell what the execution sequence is for these types of queries.

Problem Scenario 1: Large non-correlated subquery is materialized

    SELECT * FROM SMALL_TABLE A
    WHERE A.C1 IN (SELECT B.C1 FROM BIG_TABLE B)

- BIG_TABLE is scanned and put into a workfile
- SMALL_TABLE is joined with the workfile

It would be much more efficient if the scan and materialization of BIG_TABLE were avoided:

    SELECT * FROM SMALL_TABLE A
    WHERE EXISTS (SELECT 1 FROM BIG_TABLE B WHERE B.C1 = A.C1)

- Allows matching index access on BIG_TABLE

Subquery Processing

All subqueries are now processed by the DB2 Optimizer differently than before. The new processing is summarized as follows:
- The subquery itself is represented as a virtual table in the FROM clause of the queryblock that contains the predicate with the subquery
- This virtual table may be moved around within the referencing query in order to obtain the most efficient sequence of operations
- Predicates may be derived from the correlation references in the subquery and also from the subquery SELECT list
- These predicates can be applied either to the subquery tables or to the tables containing the correlated columns, depending on the position of the virtual table
- When determining the access path for a subquery, the context in which the subquery occurs is taken into consideration
- When determining the access path for a query that references a subquery, the effect the access path has on the subquery is taken into consideration

Problem Scenario 2: Large outer table is scanned rather than using matching index access

    SELECT * FROM BIG_TABLE A
    WHERE EXISTS (SELECT 1 FROM SMALL_TABLE B WHERE A.C1 = B.C1)

- BIG_TABLE is scanned to obtain the A.C1 value
- SMALL_TABLE gets matching index access

It would be much more efficient to get matching index access on BIG_TABLE:

    SELECT * FROM BIG_TABLE A
    WHERE A.C1 IN (SELECT B.C1 FROM SMALL_TABLE B)

- SMALL_TABLE is scanned and put in a workfile
- Allows matching index access on BIG_TABLE
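
The two problem scenarios rely on the EXISTS form and the IN form of a query returning the same rows, so that the optimizer is free to pick either. The sketch below checks that equivalence on invented toy rows. It uses Python's built-in sqlite3 module purely as a stand-in SQL engine; it says nothing about DB2 access paths, and the PAYLOAD column and all data values are assumptions made for illustration.

```python
import sqlite3

# Toy data: table and column names follow the slides; PAYLOAD and the
# rows themselves are invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BIG_TABLE   (C1 INTEGER, PAYLOAD TEXT);
    CREATE TABLE SMALL_TABLE (C1 INTEGER);
    INSERT INTO BIG_TABLE   VALUES (1,'a'),(2,'b'),(2,'c'),(3,'d'),(4,'e');
    INSERT INTO SMALL_TABLE VALUES (2),(4),(4),(9);
""")

# Correlated form (as coded in Problem Scenario 2).
correlated = """
    SELECT * FROM BIG_TABLE A
    WHERE EXISTS (SELECT 1 FROM SMALL_TABLE B WHERE A.C1 = B.C1)
    ORDER BY A.C1, A.PAYLOAD
"""

# Non-correlated form of the same query.
non_correlated = """
    SELECT * FROM BIG_TABLE A
    WHERE A.C1 IN (SELECT B.C1 FROM SMALL_TABLE B)
    ORDER BY A.C1, A.PAYLOAD
"""

rows_exists = conn.execute(correlated).fetchall()
rows_in = conn.execute(non_correlated).fetchall()
print(rows_exists == rows_in)
```

Both statements select the BIG_TABLE rows whose C1 value appears in SMALL_TABLE, and duplicates in SMALL_TABLE do not duplicate the result in either form.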

Performance Improvements

Improve query performance:
- Consider both correlated and non-correlated forms of a given query
- Consider the inter-queryblock combinations
- Select the form or combination with the lowest overall estimated cost

Enhanced Optimizer Costing

Cost both correlated and non-correlated forms of the query:
- Internally represent the query in both its correlated and non-correlated forms and let the Optimizer cost each
- Defer the final access path choice until the whole query has been costed
- Once the final access path is selected, remove the form (correlated or non-correlated) that was not selected
- Allows the most efficient overall access path to be selected
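
The idea of costing both forms and keeping the cheaper one can be sketched with a deliberately crude model. Everything in the block below is an assumption made for illustration: the `TableStats` fields, the two cost constants, and the cost formulas are invented and do not reflect DB2's actual cost model. The sketch only shows the "cost each form, then choose the lowest" step.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    rows: int        # estimated number of rows
    has_index: bool  # True if a matching index exists on the join column


# Toy cost units, invented for illustration only.
PROBE_COST = 3         # cost of one matching-index probe
SCAN_COST_PER_ROW = 1  # cost of reading one row during a scan


def cost_correlated(outer: TableStats, inner: TableStats) -> int:
    """Scan the outer table and run the subquery once per outer row."""
    per_row = PROBE_COST if inner.has_index else inner.rows * SCAN_COST_PER_ROW
    return outer.rows * SCAN_COST_PER_ROW + outer.rows * per_row


def cost_non_correlated(outer: TableStats, inner: TableStats) -> int:
    """Materialize the subquery once, then access the outer table per workfile row."""
    materialize = inner.rows * SCAN_COST_PER_ROW
    per_row = PROBE_COST if outer.has_index else outer.rows * SCAN_COST_PER_ROW
    return materialize + inner.rows * per_row


def choose_form(outer: TableStats, inner: TableStats):
    """Cost both forms and return the name of the cheaper one plus all costs."""
    costs = {
        "correlated": cost_correlated(outer, inner),
        "non-correlated": cost_non_correlated(outer, inner),
    }
    best = min(costs, key=costs.get)
    return best, costs


# Problem Scenario 2: big outer table, small subquery table.
print(choose_form(TableStats(1_000_000, True), TableStats(100, True))[0])
# Problem Scenario 1: small outer table, big subquery table.
print(choose_form(TableStats(100, False), TableStats(1_000_000, True))[0])
```

With these toy numbers the model prefers the non-correlated form for Problem Scenario 2 and the correlated form for Problem Scenario 1, regardless of how the query was originally coded.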

EXPLAIN Output: Correlated Subquery

- An additional row represents a subquery for which both correlated and non-correlated forms were considered
- TNAME for the subquery is DSNWFQB(nn), where nn is the queryblock number associated with the subquery
- Correlated subquery access:
  - The subquery is the inner table, and the join method is nested-loop join (METHOD=1)
  - A new access type shows correlated subquery access: ACCESSTYPE='O'

The EXPLAIN output in the PLAN_TABLE is modified to show virtual tables that are materialized to a workfile. The table name for a virtual table uses a naming convention similar to that used for MQBs (mini-queryblocks). The name includes an indicator of the queryblock number of the associated subquery (e.g. DSNVT(02)). The table type for virtual tables that are materialized is "W" for "Workfile". Virtual tables that are not materialized are not shown in the EXPLAIN output.

EXPLAIN Output: Non-correlated Subquery

Non-correlated subquery access:
- Always involves a workfile, which can be either the inner table or the outer table
- When the workfile is the outer table, the workfile is scanned (ACCESSTYPE='R')
- When the workfile is the inner table, sparse index access is used to access the workfile (ACCESSTYPE='R', PRIMARY_ACCESSTYPE='T')
- The join method in either case is nested-loop join or hybrid join

Additional Column PARENT_PLANNO

- A new column, PARENT_PLANNO, is added to the PLAN_TABLE. It is used with PARENT_QBLOCKNO (an existing column) to connect a child queryblock to the parent miniplan
- For correlated subqueries, it corresponds to the plan number in the parent queryblock where the correlated subquery is invoked
- For non-correlated subqueries, it corresponds to the plan number in the parent queryblock that represents the workfile for the subquery
- IFCID 22 changes to stay in sync with the EXPLAIN change: a new field holds the value of the PARENT_PLANNO column
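
One way to read the new column is as half of a two-part key: (PARENT_QBLOCKNO, PARENT_PLANNO) in a child row points at the (QBLOCKNO, PLANNO) row of the parent. The sketch below follows that link over a few hand-written rows. The rows are hypothetical and only loosely modeled on PLAN_TABLE (real rows have many more columns), and the use of 0 for "no parent" is an assumption of this example.

```python
# Hypothetical, heavily trimmed PLAN_TABLE rows for a non-correlated subquery:
# queryblock 1 is the outer query, queryblock 2 is the subquery, and
# DSNWFQB(02) is the workfile row that represents the subquery in its parent.
plan_rows = [
    {"QBLOCKNO": 1, "PLANNO": 1, "TNAME": "BIG_TABLE",
     "PARENT_QBLOCKNO": 0, "PARENT_PLANNO": 0},
    {"QBLOCKNO": 1, "PLANNO": 2, "TNAME": "DSNWFQB(02)",
     "PARENT_QBLOCKNO": 0, "PARENT_PLANNO": 0},
    {"QBLOCKNO": 2, "PLANNO": 1, "TNAME": "SMALL_TABLE",
     "PARENT_QBLOCKNO": 1, "PARENT_PLANNO": 2},
]


def parent_step(row, rows):
    """Return the parent miniplan row that a child queryblock row points to,
    or None when no row matches (a top-level queryblock in this toy data)."""
    for candidate in rows:
        if (candidate["QBLOCKNO"] == row["PARENT_QBLOCKNO"]
                and candidate["PLANNO"] == row["PARENT_PLANNO"]):
            return candidate
    return None


child = plan_rows[2]
parent = parent_step(child, plan_rows)
print(child["TNAME"], "is invoked from", parent["TNAME"])
```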

Other Considerations

INSERT, UPDATE and DELETE
INSERT, UPDATE and DELETE statements that contain the types of subqueries discussed previously are handled the same as SELECT statements that contain these subqueries. No special consideration need be given to INSERT, UPDATE and DELETE.

Optimization Hints
Optimization hints are supported. Because the EXPLAIN output is enhanced to show the virtual tables and the position in which each virtual table is accessed, that information can be fed into the DB2 Optimizer as a hint using the existing optimization hints support. For example, users can request that a non-correlated subquery be processed in its correlated form, or that a correlated subquery be processed in its decorrelated form. This allows greater control over how a query is processed without requiring a change to the way the query is coded.

DB2 for z/OS Limits

The maximum number of tables that can be specified in a single SQL statement is 225. However, the generation of virtual tables can cause the total number of tables to exceed 225. This is acceptable as long as the total number of tables after generation of virtual tables does not exceed 512. In the unlikely case that the total exceeds 512, SQLCODE -101 is returned.

Application Programming & Performance

Application Programming
These changes should make DB2 less sensitive to how a particular query is coded in terms of access path selection (the major factor in query performance). It is now less important whether a query is coded as a correlated subquery or as a non-correlated subquery: in many cases, DB2 can consider both forms and select the more efficient one. This also means that if a query was coded a certain way specifically to get a certain access path, that access path may no longer be selected by DB2. Such queries might need to be retuned to resolve query performance issues.

Performance Monitoring and Tuning
The changes to the EXPLAIN output affect its usage in access path analysis and tuning. Users should be aware of these changes and of how to interpret the new EXPLAIN output. In addition, these changes may result in some query performance degradation. Users should be aware of what to look for and how to resolve these types of problems when they are encountered.

Generalize Sparse Index: Before V9

- What is a sparse index?
- What is an in-memory work file?
- Sparse index has been used for non-correlated subqueries starting in V4
- Sparse index or in-memory work file (IMWF) is used in star join

More Sparse Index Usage in V9

DB2 uses a sparse index or IMWF internally for tables that do not have an appropriate index, in order to improve performance:
- Base table
- Materialized view
- Materialized table expression
- Temporary table
- Materialized virtual table

Sparse Index in V9

- A sparse index or IMWF can be used to access the inner table of a nested-loop join; a sort of the new (inner) table is required
- DB2 determines whether to use a sparse index or IMWF at run time, depending on the available storage
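
The slides do not describe the internal layout of a sparse index, so the following is only a generic sketch of the idea under stated assumptions: the inner table has been sorted into a workfile, the index keeps every `step`-th key with its position, and a probe binary-searches the small index and then scans a short stretch of the workfile. The function names and the `(key, payload)` row shape are hypothetical.

```python
from bisect import bisect_left


def build_sparse_index(workfile, step):
    """Keep every `step`-th key of the sorted workfile with its position."""
    return [(workfile[pos][0], pos) for pos in range(0, len(workfile), step)]


def probe(workfile, sparse_index, key):
    """Return all workfile rows whose key equals `key`."""
    index_keys = [k for k, _ in sparse_index]
    # Start from the last index entry whose key is strictly smaller than the
    # search key, so duplicates that begin before an index entry are not missed.
    slot = bisect_left(index_keys, key) - 1
    start = sparse_index[slot][1] if slot >= 0 else 0
    matches = []
    for row in workfile[start:]:
        if row[0] > key:
            break  # the workfile is sorted, so no later row can match
        if row[0] == key:
            matches.append(row)
    return matches


# The sort of the inner ("new") table happens here.
workfile = sorted([(5, 'e'), (1, 'a'), (3, 'c'), (3, 'd'),
                   (8, 'h'), (2, 'b'), (3, 'x')])
sparse_index = build_sparse_index(workfile, step=3)
print(probe(workfile, sparse_index, 3))
```

The index holds far fewer entries than the workfile, which is why it can stay in memory even when the workfile cannot; the price is a short sequential scan after each index lookup.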

Query Performance Improvement

- Avoid sorting a large outer table in a sort-merge join when the inner table does not have an index
- Use a sparse index or IMWF on the inner table of a nested-loop join
- More efficient search in the inner table work file
- Significant I/O reduction
- More exploitation of query parallelism
- Up to 2X query performance improvement observed

Restrictions

A sparse index is not used when:
- The join method is not nested-loop join
- The inner table has an index on the join column
- The subquery is correlated
- The target table is in a MERGE statement
- The predicate is not an equi-join predicate
- The columns of the join predicate have unmatched CCSIDs

Why Use Histogram Statistics?

- Family compatibility: LUW has it
- Customer requirement
- Improve query performance

Scenario 1

Example #1: Sparse and dense ranges
- SALES_DATE BETWEEN 2006-12-10 AND 2006-12-24 returns significantly more rows than a two-week range in March

Queries:
    SELECT * FROM T WHERE T.C1 between a sparse range
    SELECT * FROM T WHERE T.C1 between a dense range

Scenario 2

Example #2: When gaps exist in ranges
- SAP uses INTEGER (or worse, VARCHAR) to store YEAR-MONTH data
- There are 12 values in 200501~200512, but no values in 200513~200600

Queries:
    SELECT * FROM T WHERE T.C1 between a skipped range
    SELECT * FROM T WHERE T.C1 between a non-skipped range

- T.C1 BETWEEN 200512 AND 200601: 90 valid numerics, but only 2 valid dates
- T.C1 BETWEEN 200501 AND 200512: 12 valid numerics, and 12 valid dates
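
The counts quoted in this scenario can be checked with a few lines of arithmetic. The helper names below are hypothetical; the sketch simply counts the integers in each range and how many of them are valid YYYYMM values (month part between 01 and 12).

```python
def integer_span(low, high):
    """Number of integers in the closed range [low, high]."""
    return high - low + 1


def valid_year_months(low, high):
    """Number of integers in [low, high] whose last two digits are a month 01..12."""
    return sum(1 for v in range(low, high + 1) if 1 <= v % 100 <= 12)


for low, high in [(200512, 200601), (200501, 200512)]:
    print(low, high, integer_span(low, high), valid_year_months(low, high))
```

An estimate that treats the column as evenly spread integers would weight the first range about 90/12 times more heavily than the second, even though it can match only two distinct values.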

Scenario 3

Example #3: Nonexistent values outside the [lowkey, highkey] range
- DB2 records only the second highest and second lowest values for a column, so it is hard to detect an out-of-range value

Query:
    SELECT * FROM T WHERE T.C1 = a nonexistent value

All three examples:
- Fact: the C1 column cardinality is 10,000 (or bigger)
- Today: all 10,000 (or more) distinct values must be collected in frequency statistics

Syntax: RUNSTATS TABLESPACE [syntax diagram]

Syntax: RUNSTATS INDEX [syntax diagram]

Equal-Depth Histogram

RUNSTATS produces an equal-depth histogram: each interval (range) has about the same number of rows (not the same number of values).
- The maximum number of intervals is 100
- The same value stays in the same interval
- The NULL value has its own interval
- Gaps may be skipped between intervals
- An interval may contain a single value only
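
The properties listed above can be illustrated with a small greedy construction. This is a minimal sketch, not the RUNSTATS algorithm, which the slides do not describe; the function name, the dictionary keys, and the rule for closing an interval are all assumptions. It does respect the stated properties: all rows of one value stay together, NULLs get their own interval, and interval bounds are actual values, so gaps between intervals are skipped.

```python
from collections import Counter


def equal_depth_histogram(values, num_quantiles):
    """Greedy equal-depth histogram over `values`; None stands for NULL."""
    non_null = [v for v in values if v is not None]
    null_rows = len(values) - len(non_null)
    # Reserve one interval for NULLs when any are present.
    buckets = max(1, num_quantiles - (1 if null_rows else 0))
    target = len(non_null) / buckets  # desired rows per interval

    intervals = []
    low = high = None
    rows = cardf = 0
    for value, count in sorted(Counter(non_null).items()):
        if rows == 0:
            low = value      # interval starts at an actual value
        high = value
        rows += count        # all rows of one value stay in one interval
        cardf += 1
        # Close the interval once it is deep enough, but keep the last
        # bucket open so the interval count never exceeds `buckets`.
        if rows >= target and len(intervals) < buckets - 1:
            intervals.append({"low": low, "high": high,
                              "cardf": cardf, "rows": rows})
            rows = cardf = 0
    if rows:
        intervals.append({"low": low, "high": high,
                          "cardf": cardf, "rows": rows})
    if null_rows:
        intervals.append({"low": None, "high": None,
                          "cardf": 1, "rows": null_rows})
    return intervals


data = [1] * 4 + [2] * 4 + [3] * 2 + [4] * 2 + [5] * 4 + [None]
hist = equal_depth_histogram(data, num_quantiles=5)
for interval in hist:
    print(interval)
```

In this toy input the values 1, 2 and 5 each fill an interval on their own, while 3 and 4 share one, so every non-NULL interval holds four rows even though the number of distinct values per interval differs.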

Example

Table T of TRANSACTIONS, column C1 of TRANSACTION_DATE as DATE type.

QuantileNo  Lowvalue    Highvalue   Cardf  FrequencyF
1           2006-01-01  2006-01-20  20     2.1%
2           2006-01-21  2006-02-10  9      1.8%
3           2006-03-01  2006-03-15  10     2.0%
4           2006-03-15  2006-03-15  1      5%
5           2006-04-01  2006-05-01  30     1.7%
...
24          Null        Null        1      0.0005%

Catalog

- New columns are added to SYSCOLDIST and SYSKEYTGTDIST: QUANTILENO, LOWVALUE, HIGHVALUE
- All three new columns are updatable
- The new columns are also added to four more tables: SYSCOLDIST_HIST, SYSCOLDISTSTATS, SYSKEYTGTDIST_HIST, SYSKEYTGTDISTSTATS

RUNSTATS

- RUNSTATS TABLESPACE: on a column or column groups
  - A sort is needed; if frequency statistics are also specified, they share the same sort
- RUNSTATS INDEX
  - For an index with key columns of mixed order, histogram statistics can be collected only on the prefix columns with the same order
  - If the specified key columns for histogram statistics are of mixed order, warning message DSNU633 is issued
- REORG TABLESPACE and LOAD do NOT support HISTOGRAM statistics

Access Path Selection

Use histogram statistics to evaluate predicate selectivity:
- RANGE/LIKE/BETWEEN predicates: all fully qualified intervals, plus interpolation of partially qualified intervals
- Also helps EQ, IS NULL, IN-list and COL op COL predicates
  - EQ/IS NULL/IN-list: a single-value interval that matches the search literal, or interpolation within the covering interval
  - COL op COL: pair up two histogram intervals that satisfy the operator op
- Histograms improve predicate selectivity estimation
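
The "fully qualified intervals plus interpolation" rule for a BETWEEN predicate can be sketched as follows. This is an illustrative reading of that one bullet, not DB2's implementation: it assumes numeric interval bounds, frequencies expressed as fractions of all rows, and simple linear interpolation inside a partially covered interval. The function name and the tuple layout are hypothetical.

```python
def between_selectivity(histogram, lo, hi):
    """Estimate the fraction of rows with lo <= value <= hi.

    `histogram` is a list of (low, high, frequency) tuples with numeric
    bounds; `frequency` is the fraction of all rows in that interval.
    """
    total = 0.0
    for low, high, freq in histogram:
        if hi < low or lo > high:
            continue                      # interval does not qualify at all
        if lo <= low and high <= hi:
            total += freq                 # fully qualified interval
        else:
            # Partially qualified: assume rows are spread evenly inside the
            # interval. Here high > low always holds, because a single-value
            # interval that overlaps the range is fully qualified.
            overlap = min(hi, high) - max(lo, low)
            total += freq * overlap / (high - low)
    return total


# Toy histogram with a gap between 20 and 31 (no rows there).
hist = [(1, 10, 0.25), (11, 20, 0.25), (31, 40, 0.5)]
print(between_selectivity(hist, 1, 20))     # two fully qualified intervals
print(between_selectivity(hist, 21, 30))    # range falls entirely in the gap
print(between_selectivity(hist, 31, 35.5))  # half of the last interval
```

Because the gap is not covered by any interval, a range that falls inside it contributes nothing, which is the behavior the gap scenario above calls for.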

Performance

- Use histogram statistics to improve query performance
- Collect histogram statistics for RANGE/LIKE/BETWEEN predicates
- Up to 2X elapsed time and/or CPU time improvement observed for several queries
- Better join sequence reduces data, index and work file getpages

Query Parallelism Enhancements

- Partition on the inner table of a nested-loop join if the outer table is a materialized work file
- Enable more query parallelism for a correlated subquery when the query is decorrelated
- Enhanced query parallelism costing
- More balanced partitions to reduce query elapsed time

Summary

Significant query performance enhancements from:
- Cross query block optimization
- More usage of sparse index and IMWF
- More statistics to improve access path
- More query parallelism

Session: C13
DB2 for z/OS Selected Query Performance Enhancements
James Guo, IBM Silicon Valley Lab
guojw@us.ibm.com