Chapter 9. Cardinality Estimation. How Many Rows Does a Query Yield? Architecture and Implementation of Database Systems Winter 2010/11

Similar documents
Module 9: Selectivity Estimation

Cost Models. the query database statistics description of computational resources, e.g.

DB2 for LUW Advanced Statistics with Statistical Views. John Hornibrook Manager DB2 for LUW Query Optimization Development

Query Optimization Overview

Architecture and Implementation of Database Systems (Winter 2015/16)

Principles of Data Management. Lecture #13 (Query Optimization II)

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Overview of Query Processing. Evaluation of Relational Operations. Why Sort? Outline. Two-Way External Merge Sort. 2-Way Sort: Requires 3 Buffer Pages

Review. Relational Query Optimization. Query Optimization Overview (cont) Query Optimization Overview. Cost-based Query Sub-System

Optimization Overview

QUERY OPTIMIZATION E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 QUERY OPTIMIZATION

Database Applications (15-415)

Evaluation of relational operations

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline

Query Evaluation! References:! q [RG-3ed] Chapter 12, 13, 14, 15! q [SKS-6ed] Chapter 12, 13!

R & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:

Overview of Implementing Relational Operators and Query Evaluation

Database Systems. Announcement. December 13/14, 2006 Lecture #10. Assignment #4 is due next week.

DBMS Query evaluation

CS 245 Midterm Exam Solution Winter 2015

Review. Support for data retrieval at the physical level:

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing

MIS Database Systems.

Chapter 12: Query Processing

Chapter 13: Query Processing

Data Modeling and Databases Ch 9: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich

CMPT 354: Database System I. Lecture 7. Basics of Query Optimization

! Parallel machines are becoming quite common and affordable. ! Databases are growing increasingly large

Chapter 20: Parallel Databases

Chapter 20: Parallel Databases. Introduction

External Sorting Sorting Tables Larger Than Main Memory

Database Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building

Relational Query Optimization

CS 245 Midterm Exam Winter 2014

15-415/615 Faloutsos 1

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for

Chapter 13: Query Processing Basic Steps in Query Processing

Database Systems SQL SL03

CMSC424: Database Design. Instructor: Amol Deshpande

Chapter 18: Parallel Databases

Chapter 18: Parallel Databases. Chapter 18: Parallel Databases. Parallelism in Databases. Introduction

Data about data is database Select correct option: True False Partially True None of the Above

Something to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact:

Outline. Database Management and Tuning. Outline. Join Strategies Running Example. Index Tuning. Johann Gamper. Unit 6 April 12, 2012

Data Modeling and Databases Ch 10: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich

From SQL-query to result Have a look under the hood

Architecture and Implementation of Database Systems (Summer 2018)

BIS Database Management Systems.

Database Systems SQL SL03

Architecture and Implementation of Database Systems Query Processing

Schema for Examples. Query Optimization. Alternative Plans 1 (No Indexes) Motivating Example. Alternative Plans 2 With Indexes

Chapter 12: Query Processing

Database Applications (15-415)

Database System Concepts

Query Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors:

Chapter 17: Parallel Databases

ATYPICAL RELATIONAL QUERY OPTIMIZER

Principles of Data Management. Lecture #9 (Query Processing Overview)

Chapter 12: Indexing and Hashing. Basic Concepts

Query Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors:

Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007

Chapter 13: Query Optimization

Relational Model: History

Review: Query Evaluation Steps. Example Query: Logical Plan 1. What We Already Know. Example Query: Logical Plan 2.

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Column Stores vs. Row Stores How Different Are They Really?

Chapter 12: Indexing and Hashing

File Structures and Indexing

Evaluation of Relational Operations: Other Techniques. Chapter 14 Sayyed Nezhadi

CSIT5300: Advanced Database Systems

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.

Hash-Based Indexing 165

Datenbanksysteme II: Caching and File Structures. Ulf Leser

Mahathma Gandhi University

Query Processing & Optimization. CS 377: Database Systems

Basant Group of Institution

Outline. Database Tuning. Join Strategies Running Example. Outline. Index Tuning. Nikolaus Augsten. Unit 6 WS 2014/2015

Database Applications (15-415)

Principles of Database Management Systems

Database Applications (15-415)

An SQL query is parsed into a collection of query blocks optimize one block at a time. Nested blocks are usually treated as calls to a subroutine

7. Query Processing and Optimization

CS330. Query Processing

Disks & Files. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Relational Query Optimization

IV Statistical Modelling of MapReduce Joins

Lassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Evaluation of Relational Operations

Overview of Query Evaluation. Overview of Query Evaluation

Chapter 8. Implementing the Relational Algebra. Architecture and Implementation of Database Systems Winter 2008/09

Parser. Select R.text from Report R, Weather W where W.image.rain() and W.city = R.city and W.date = R.date and R.text.

Advanced Database Systems

Query Processing Strategies and Optimization

CMSC424: Database Design. Instructor: Amol Deshpande

Query Processing with Indexes. Announcements (February 24) Review. CPS 216 Advanced Database Systems

Query and Join Op/miza/on 11/5

SQL/MX UPDATE STATISTICS Enhancements

Transcription:

Chapter 9 How Many Rows Does a Query Yield? Architecture and Implementation of Database Systems Winter 2010/11 Wilhelm-Schickard-Institut für Informatik Universität Tübingen 9.1

Web Forms Applications SQL Interface SQL Commands Executor Parser Transaction Manager Lock Manager Operator Evaluator Files and Access Methods Buffer Manager Disk Space Manager Optimizer Recovery Manager DBMS data files, indices,... Database 9.2

A relational query optimizer performs a phase of cost-based plan search to identify the presumably cheapest alternative among a a set of equivalent execution plans ( Chapter on Query Optimization). Since page I/O cost dominates, the estimated cardinality of a (sub-)query result is crucial input to this search. typically measured in pages or rows. estimates are also valuable when it comes to buffer right-sizing before query evaluation starts (e.g., allocate B buffer pages and determine blocking factor b for external sort). 9.3

Estimating Query Result There are two principal approaches to query cardinality estimation: 1 Database Profile. Maintain statistical information about numbers and sizes of tuples, distribution of attribute values for base relations, as part of the database catalog (meta information) during database updates. Calculate these parameters for intermediate query results based upon a (simple) statistical model during query optimization. Typically, the statistical model is based upon the uniformity and independence assumptions. Both are typically not valid, but they allow for simple calculations limited accuracy. In order to improve accuracy, the system can record histograms to more closely model the actual value distributions in relations. 9.4

Estimating Query Result 2 Sampling Techniques. Gather the necessary characteristics of a query plan (base relations and intermediate results) at query execution time: Run query on a small sample of the input. Extrapolate to the full input size. It is crucial to find the right balance between sample size and the resulting accuracy. These slides focus on 1. 9.5

Keep profile information in the database catalog. Update whenever SQL DML commands are issued (database updates): Typical database profile for relation R R N R s(r) V(A, R) number of records in relation R number of disk pages allocated for these records average record size number of distinct values of attribute A. possibly many more 9.6

: IBM DB2 Excerpt of IBM DB2 catalog information for a TPC-H database 1 db2 => SELECT TABNAME, CARD, NPAGES 2 db2 (cont.) => FROM SYSCAT.TABLES 3 db2 (cont.) => WHERE TABSCHEMA = TPCH ; 4 TABNAME CARD NPAGES 5 -------------- -------------------- -------------------- 6 ORDERS 1500000 44331 7 CUSTOMER 150000 6747 8 NATION 25 2 9 REGION 5 1 10 PART 200000 7578 11 SUPPLIER 10000 406 12 PARTSUPP 800000 31679 13 LINEITEM 6001215 207888 14 8 record(s) selected. Note: Column CARD R, column NPAGES N R. 9.7

Database Profile: In order to obtain tractable cardinality estimation formulae, assume one of the following: Uniformity & independence (simple, yet rarely realistic) All values of an attribute uniformly appear with the same probability. Values of different attributes are independent of each other. 9.8

Database Profile: In order to obtain tractable cardinality estimation formulae, assume one of the following: Uniformity & independence (simple, yet rarely realistic) All values of an attribute uniformly appear with the same probability. Values of different attributes are independent of each other. Worst case (unrealistic) No knowledge about relation contents at all. In case of a selection σ p, assume all records will satisfy predicate p. (May only be used to compute upper bounds of expected cardinality.) 9.8

Database Profile: In order to obtain tractable cardinality estimation formulae, assume one of the following: Uniformity & independence (simple, yet rarely realistic) All values of an attribute uniformly appear with the same probability. Values of different attributes are independent of each other. Worst case (unrealistic) No knowledge about relation contents at all. In case of a selection σ p, assume all records will satisfy predicate p. (May only be used to compute upper bounds of expected cardinality.) Perfect knowledge (unrealistic) Details about the exact distribution of values are known. Requires huge catalog or prior knowledge of incoming queries. (May only be used to compute lower bounds of expected cardinality.) 9.8

for σ (Equality Predicate) Query: Q σ A=c (R) Selectivity sel(a = c) 1/V(A, R) Uniformity Q Record size s(q) Value Distribution V(A, Q) sel(a = c) R s(r) { 1, for A = A, c(v(a, R), Q ), otherwise. with (# of distinct colors obtained by drawing r balls from a bag of balls of m colors) 1 : r, for r < m/2, c(m, r) = (r + m)/3, for m/2 r < 2m, m, for r 2m 1 Selection without replacement : c(m, r) = m (1 (1 1/m) r ). 9.9

Selectivity Estimation for σ (Other Predicates) Equality between attributes (Q σ A=B (R)): Approximate selectivity by sel(a = B) = 1/ max(v(a, R), V(B, R)). (Assumes that each value of the attribute with fewer distinct values has a corresponding match in the other attribute.) Independence Range selections (Q = σ A>c (R)): In the database profile, maintain the minimum and maximum value of attribute A in relation R, Low(A, R) and High(A, R). Approximate selectivity by Uniformity sel(a > c) = High(A, R) c, Low(A, R) c High(A, R) High(A, R) Low(A, R) 0, otherwise 9.10

for π For Q π L (R), estimating the number of result rows is difficult (L = A 1, A 2,..., A n : list of projection attributes): Q π L (R) Q Record size s(q) V(A, R), R, if L = A if keys of R L R, no dup. elim. min ( R, A i L V(A i, R) ), otherwise Independence A i L s(a i) Val. Dist. V(A i, Q) V(A i, R) for A i L 9.11

for, \, Q R S Q R + S s(q) = s(r) = s(s) schemas of R,S identical V(A, Q) V(A, R) + V(A, S) Q R \ S Q R S max(0, R S ) Q R s(q) = s(r) = s(s) V(A, Q) V(A, R) Q = R S s(q) = s(r) { + s(s) V(A, Q) = V(A, R), V(A, S), for A R for A S 9.12

for estimation for the general join case is challenging. A special, yet very common case: foreign-key relationship between input relations R and S: Establish a foreign key relationship (SQL) 1 CREATE TABLE R (A INTEGER NOT NULL, 2... 3 PRIMARY KEY (A)); 4 CREATE TABLE S (..., 5 A INTEGER NOT NULL, 6... 7 FOREIGN KEY (A) REFERENCES R); Q R R.A=S.A S The foreign key constraint guarantees π A (S) π A (R). Thus: Q = S. 9.13

for Q R R.A=S.B S Q = R S V(A, R), R S V(B, S), s(q) = s(r) + s(s) π B(S) π A (R) π A(R) π B (S) V(A, Q) { V(A, R), V(A, S), if A attribute in R otherwise 9.14

In realistic database instances, values are not uniformly distributed in an attribute s active domain (actual values found in a column). To keep track of this non-uniformity for an attribute A, maintain a histogram to approximate the actual distribution: 1 Divide the active domain of A into adjacent intervals by selecting boundary values b i. 2 Collect statistical parameters for each interval between boundaries, e.g., # of rows r with b i 1 < r.a b i, or # of distinct A values in interval (b i 1, b i ]. The histogram intervals are also referred to as buckets. ( Y. Ioannidis: The History of (Abridged), Proc. VLDB 2003) 9.15

in IBM DB2 Histogram maintained for a column in a TPC-H database 1 SELECT SEQNO, COLVALUE, VALCOUNT 2 FROM SYSCAT.COLDIST 3 WHERE TABNAME = LINEITEM 4 AND COLNAME = L_EXTENDEDPRICE 5 AND TYPE = Q ; 6 SEQNO COLVALUE VALCOUNT 7 ----- ----------------- -------- 8 1 +0000000000996.01 3001 9 2 +0000000004513.26 315064 10 3 +0000000007367.60 633128 11 4 +0000000011861.82 948192 12 5 +0000000015921.28 1263256 13 6 +0000000019922.76 1578320 14 7 +0000000024103.20 1896384 15 8 +0000000027733.58 2211448 16 9 +0000000031961.80 2526512 17 10 +0000000035584.72 2841576 18 11 +0000000039772.92 3159640 19 12 +0000000043395.75 3474704 20 13 +0000000047013.98 3789768 21. Catalog table SYSCAT.COLDIST also contains information like the n most frequent values (and their frequency), the number of distinct values in each bucket. may even be manipulated manually to tweak optimizer decisions. 9.16

Two types of histograms are widely used: 1. All buckets have the same width, i.e., boundary b i = b i 1 + w, for some fixed w. 2. All buckets contain the same number of rows (i.e., their width is varying). Equi-depth histograms ( 2 ) are able to adapt to data skew (high uniformity). The number of buckets is the tuning knob that defines the tradeoff between estimation quality (histogram resolution) and histogram size: catalog space is limited. 9.17

Example (Actual value distribution) Column A of SQL type INTEGER (domain {..., -2, -1, 0, 1, 2,... }). Actual non-uniform distribution in relation R: 9 8 8 7 6 5 4 3 3 3 2 2 2 1 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 9.18

Divide active domain of attribute A into B buckets of equal width. The bucket width w will be High(A, R) Low(A, R) + 1 w = B Example (Equi-width histogram (B = 4)) 19 27 13 9 8 8 7 5 6 5 4 3 3 3 2 2 2 1 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Maintain sum of value frequencies in each bucket (in addition to bucket boundaries b i ). 9.19

: Equality Selections Example (Q σ A=5 (R)) 27 19 13 9 8 8 7 5 6 5 4 3 3 3 2 2 2 1 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Value 5 is in bucket [5, 8] (with 19 tuples) Assume uniform distribution within the bucket: Q = 19/w = 19/4 5. Actual: Q = 1 What would be the cardinality under the uniformity assumption (no histogram)? 9.20

: Range Selections Example (Q σ A>7 AND A 16 (R)) 27 19 13 9 8 8 7 5 6 5 4 3 3 3 2 2 2 1 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Query interval (7, 16] covers buckets [9, 12] and [13, 16]. Query interval touches [5, 8]. Q = 27 + 13 + 19/4 45. Actual: Q = 48 What would be the cardinality under the uniformity assumption (no histogram)? 9.21

Histogram: Construction To construct an equi-width histogram for relation R, attribute A: 1 Compute boundaries b i from High(A, R) and Low(A, R). 2 Scan R once sequentially. 3 While scanning, maintain B running tuple frequency counters, one for each bucket. If scanning R in step 2 is prohibitive, scan small sample R sample R, then scale frequency counters by R / R sample. To maintain the histogram under insertions (deletions): 1 Simply increment (decremeent) frequency counter in affected bucket. 9.22

Divide active domain of attribute A into B buckets of roughly the same number of tuples in each bucket, depth d of each bucket will be d = R B. Example (Equi-depth histogram (B = 4, d = 16)) 16 16 16 16 9 8 8 7 6 5 4 3 3 3 2 2 2 1 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Maintain depth (and bucket boundaries b i ). 9.23

Example (Equi-depth histogram (B = 4, d = 16)) 16 16 16 16 9 8 8 7 6 5 4 3 3 3 2 2 2 1 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Intuition: High value frequencies are more important than low value frequencies. Resolution of histogram adapts to skewed value distributions. 9.24

vs. Example (Histogram on customer age attribute (B = 8, R = 5,600)) 1300 1100 1000 800 600 500 200 100 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80 700 700 700 700 700 700 700 700 0-20 21-29 30-36 37-41 44-48 49-52 55-59 60-80 Equi-depth histogram invests bytes in the densely populated customer age region between 30 and 59. 9.25

: Equality Selections Example (Q σ A=5 (R)) 16 16 16 16 9 8 8 7 6 5 4 3 3 3 2 2 2 1 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Value 5 is in first bucket [1, 7] (with d = 16 tuples) Assume uniform distribution within the bucket: Q = d/7 = 16/7 2. (Actual: Q = 1) 9.26

: Range Selections Example (Q σ A>5 AND A 16 (R)) 16 16 16 16 9 8 8 7 6 5 4 3 3 3 2 2 2 1 1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Query interval (5, 16] covers buckets [8, 9], [10, 11] and [12, 16] (all with d = 16 tuples). Query interval touches [1, 7]. (Actual: Q = 59) Q = 16 + 16 + 16 + 2/7 16 53. 9.27

: Construction To construct an equi-depth histogram for relation R, attribute A: 1 Compute depth d = R /B. 2 Sort R by sort criterion A. 3 b 0 = Low(A, R), then determine the b i by dividing the sorted R into chunks of size d. Example (B = 4, R = 64) 1 d = 64 /4 = 16. 2 Sorted R.A: 1,2,2,3,3,5,6,6,6,6,6,6,7,7,7,7,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,10,10,... 3 Boundaries of d-sized chunks in sorted R: 1,2,2,3,3,5,6,6,6,6,6,6,7,7,7,7,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,10,10,... }{{}}{{} b 1 =7 b 2 =9 9.28

A (Mis-)Estimation Scenario Because exact cardinalities and estimated selectivity information is provided for base tables only, the DBMS relies on projected cardinalities for derived tables. In the case of foreign key joins, IBM DB2 promotes selectivity factors for one join input to the join result, for example. Example (Selectivity promotion; K is key of S, π A (R) π K (S)) R R.A=S.K (σ B=10 (S)) If sel(b = 10) = x, then assume that the join will yield x R rows. Whenever the value distribution of A in R does not match the distribution of B in S, the cardinality estimate may be severly off. 9.29

A (Mis-)Estimation Scenario Example (Excerpt of a data warehouse) Dimension table STORE: STOREKEY STORE_NUMBER CITY STATE DISTRICT } 63 rows Dimension table PROMOTION: PROMOKEY PROMOTYPE PROMODESC PROMOVALUE Fact table DAILY_SALES: STOREKEY CUSTKEY PROMOKEY SALES_PRICE } 35 rows } 754 069 426 rows Let the tables be arranged in a star schema: The fact table references the dimension tables, the dimension tables are small/stable, the fact table is large/continously update on each sale. are maintained for the dimension tables. 9.30

A (Mis-)Estimation Scenario Query against the data warehouse Find the number of those sales in store 01 (18 of the overall 63 locations) that were the result of the sales promotion of type XMAS ( star join ): 1 SELECT COUNT(*) 2 FROM STORE d1, PROMOTION d2, DAILY_SALES f 3 WHERE d1.storekey = f.storekey 4 AND d2.promokey = f.promokey 5 AND d1.store_number = 01 6 AND d2.promotype = XMAS The query yields 12,889,514 rows. The histograms lead to the following selectivity estimates: sel(store_number = 01 ) = 18 /63 (28.57 %) sel(promotype = XMAS ) = 1 /35 (2.86 %) 9.31

A (Mis-)Estimation Scenario Estimated cardinalities and selected plan 1 SELECT COUNT(*) 2 FROM STORE d1, PROMOTION d2, DAILY_SALES f 3 WHERE d1.storekey = f.storekey 4 AND d2.promokey = f.promokey 5 AND d1.store_number = 01 6 AND d2.promotype = XMAS Plan fragment (top numbers indicates estimated cardinality):... 6.15567e+06 IXAND ( 8) /------------------+------------------\ 2.15448e+07 2.15448e+08 NLJOIN NLJOIN ( 9) ( 13) /---------+--------\ /---------+--------\ 1 2.15448e+07 18 1.19694e+07 FETCH IXSCAN FETCH IXSCAN ( 10) ( 12) ( 14) ( 16) /---+---\ /---+---\ 35 35 7.54069e+08 18 63 7.54069e+08 IXSCAN TABLE: DB2DBA INDEX: DB2DBA IXSCAN TABLE: DB2DBA INDEX: DB2DBA ( 11) PROMOTION PROMO_FK_IDX ( 15) STORE STORE_FK_IDX 35 63 INDEX: DB2DBA INDEX: DB2DBA PROMOTION_PK_IDX STOREX1 9.32

IBM DB2: To provide database profile information (estimate cardinalities, value distributions,... ) for derived tables: 1 Define a view that precomputes the derived table (or possibly a small sample of it, IBM DB2: 10 %), 2 use the view result to gather and keep statistics, then delete the result. Statistical views 1 CREATE VIEW sv_store_dailysales AS 2 (SELECT s.* 3 FROM STORE s, DAILY_SALES ds 4 WHERE s.storekey = ds.storekey); 5 6 CREATE VIEW sv_promotion_dailysales AS 7 (SELECT p.* 8 FROM PROMOTION p, DAILY_SALES ds 9 WHERE p.promokey = ds.promokey); 10 11 ALTER VIEW sv_store_dailysales ENABLE QUERY OPTIMIZATION; 12 ALTER VIEW sv_promotion_dailysales ENABLE QUERY OPTIMIZATION; 13 14 RUNSTATS ON TABLE sv_store_dailysales WITH DISTRIBUTION; 15 RUNSTATS ON TABLE sv_promotion_dailysales WITH DISTRIBUTION; 9.33

with Estimated cardinalities and selected plan after reoptimization 1.04627e+07 IXAND ( 8) /------------------+------------------\ 6.99152e+07 1.12845e+08 NLJOIN NLJOIN ( 9) ( 13) /---------+--------\ /---------+--------\ 18 3.88418e+06 1 1.12845e+08 FETCH IXSCAN FETCH IXSCAN ( 10) ( 12) ( 14) ( 16) /---+---\ /---+---\ 18 63 7.54069e+08 35 35 7.54069e+08 IXSCAN TABLE:DB2DBA INDEX: DB2DBA IXSCAN TABLE: DB2DBA INDEX: DB2DBA DB2DBA ( 11) STORE STORE_FK_IDX ( 15) PROMOTION PROMO_FK_IDX 63 35 INDEX: DB2DBA INDEX: DB2DBA STOREX1 PROMOTION_PK_IDX Note new estimated selectivities after join: Selectivity of PROMOTYPE = XMAS now only 14.96 % (was: 2.86 %) Selectivity of STORE_NUMBER = 01 now 9.27 % (was: 28.57 %) 9.34