Effect of Stats on Two Columns Optimizer Statistics on tables and indexes are vital. Arup Nanda

Similar documents
Oracle 11g Db Stats Enhancements Inderpal S. Johal. Inderpal S. Johal, Data Softech Inc.

Oracle 10g Dbms Stats Gather Table Stats Examples

TECHNOLOGY: Testing Performing Through Changes

Oracle Database 11gR2 Optimizer Insights

Oracle Sql Tuning- A Framework

Die Wundertüte DBMS_STATS: Überraschungen in der Praxis

Pitfalls & Surprises with DBMS_STATS: How to Solve Them

Tuning with Statistics

Should You Drop Indexes on Exadata?

Real-World Performance Training SQL Performance

Arup Nanda Longtime Oracle DBA (and now DMA)

Exadata for Oracle DBAs. Arup Nanda Longtime Oracle DBA (and now DMA)

.. Spring 2008 CSC 468: DBMS Implementation Alexander Dekhtyar..

Networking for the Cloud DBA. Arup Nanda Longtime Oracle DBA And Explorer of New Things

Database statistics gathering: Synopsis

Interpreting Explain Plan Output. John Mullins

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

When Object Statistics aren't Enough

Oracle Optimizer: What s New in Oracle Database 12c? Maria Colgan Master Product Manager

7. Query Processing and Optimization

Slide 2 of 79 12/03/2011

Query Optimization, part 2: query plans in practice

Cost Based Optimizer CBO: Configuration Roadmap

Microservices using Python and Oracle. Arup Nanda Longtime Oracle DBA And Explorer of New Things

Ensuring Optimal Performance. Vivek Sharma. 3 rd November 2012 Sangam 2012

D B M G Data Base and Data Mining Group of Politecnico di Torino

Partitioning WWWH What, When, Why & How

<Insert Picture Here> Inside the Oracle Database 11g Optimizer Removing the black magic

Oracle 12c Features You Should Know. Arup Nanda Longtime Oracle DBA

Managing Performance Through Versioning of Statistics

D B M G. SQL language: basics. Managing tables. Creating a table Modifying table structure Deleting a table The data dictionary Data integrity

Top 10 Features in Oracle 12C for Developers and DBA s Gary Bhandarkar Merck & Co., Inc., Rahway, NJ USA

Real-World Performance Training SQL Performance

Resolving Latch Contention

Data Engineering for Data Science

THE DBMS_STATS PACKAGE

Answer: Reduce the amount of work Oracle needs to do to return the desired result.

MANAGING COST-BASED OPTIMIZER STATISTICS FOR PEOPLESOFT DAVID KURTZ UKOUG PEOPLESOFT ROADSHOW 2018

Estimating Cardinality: Use of Jonathan Lewis CBO methodology

Advanced indexing methods Usage and Abusage. Riyaj Shamsudeen Ora!nternals

Exploring Best Practices and Guidelines for Tuning SQL Statements

T-sql Check If Index Exists Information_schema

SQLSaturday Sioux Falls, SD Hosted by (605) SQL

Querying Data with Transact SQL

3. Getting data out: database queries. Querying

Tuna Helper Proven Process for SQL Tuning. Dean Richards Senior DBA, Confio Software

Oracle Performance Tuning. Overview of performance tuning strategies

ORACLE DATABASE 12C INTRODUCTION

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Querying Data with Transact-SQL

Infrastructure at your Service. In-Memory-Pläne für den 12.2-Optimizer: Teuer oder billig?

Querying Data with Transact-SQL (20761)

What do you mean the Oracle Optimizer won't use my Index? John Mullins

MANAGING DATA(BASES) USING SQL (NON-PROCEDURAL SQL, X401.9)

Improving Database Performance: SQL Query Optimization

CSC Web Programming. Introduction to SQL

Location Agnostic Data

Appendix: Application of the Formal Principle for the Analysis of Performance Problems After an Oracle Migration

Banner Performance on Oracle 10g

Chapter 7. Introduction to Structured Query Language (SQL) Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel

Latches Demystified. What is a Latch. Longtime Oracle DBA. Arup Nanda. From Glossary in Oracle Manuals:

SQL Data Manipulation Language. Lecture 5. Introduction to SQL language. Last updated: December 10, 2014

Interview Questions on DBMS and SQL [Compiled by M V Kamal, Associate Professor, CSE Dept]

Database Foundations. 6-3 Data Definition Language (DDL) Copyright 2015, Oracle and/or its affiliates. All rights reserved.

When should an index be used?

Data Warehouse Tuning. Without SQL Modification

Oracle 11g Optimizer Statistics Inderpal S. Johal. Inderpal S. Johal, Data Softech Inc.

Top 5 Issues that Cannot be Resolved by DBAs (other than missed bind variables)

An Introduction to DB2 Indexing

Advanced Oracle SQL Tuning v3.0 by Tanel Poder

Adaptive Cursor Sharing: An Introduction

Unit 27 Web Server Scripting Extended Diploma in ICT

Seminar: Presenter: Oracle Database Objects Internals. Oren Nakdimon.

Course Modules for MCSA: SQL Server 2016 Database Development Training & Certification Course:

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 9 - Query optimization

April 22-26, 2012 Mandalay Bay Convention Center Las Vegas, Nevada, USA. How Oracle Locking Works Session 889 Arup Nanda Longtime Oracle DBA

UNIT 4 DATABASE SYSTEM CATALOGUE

Querying Data with Transact-SQL

Query Optimizer MySQL vs. PostgreSQL

"Charting the Course... MOC C: Querying Data with Transact-SQL. Course Summary

Independent consultant. Oracle ACE Director. Member of OakTable Network. Available for consulting In-house workshops. Performance Troubleshooting

Referring to Fig 1, suppose the user wants to select the rows of all the customers who are angry. The following query would make it happen:

Oracle 9i Application Development and Tuning

Microsoft Querying Data with Transact-SQL - Performance Course

QUERY TRANSFORMATION Part 1

20 Essential Oracle SQL and PL/SQL Tuning Tips. John Mullins

Query Optimizer MySQL vs. PostgreSQL

RAC for Beginners. Arup Nanda Longtime Oracle DBA (and a beginner, always)

Top 7 Plan Stability Pitfalls & How to Avoid Them. Neil Chandler Chandler Systems Ltd UK

DB2 UDB: App Programming - Advanced

Components of a DMBS

Independent consultant. Oracle ACE Director. Member of OakTable Network. Available for consulting In-house workshops. Performance Troubleshooting

Database Programming with SQL

Upgrading from Oracle Database 9i to 10g: What to expect from the Optimizer. An Oracle White Paper July 2008

(a) (14 points) Find all people who are friends with Clara Oswin Oswald both on Facebook and Twitter since Return their names.

Querying Data with Transact-SQL

Semantic Errors in Database Queries

An Oracle White Paper April Best Practices for Gathering Optimizer Statistics

Optimizer with Oracle Database 12c Release 2 O R A C L E W H I T E P A P E R J U N E

Relational Database Index Design and the Optimizers

Transcription:

Stats with Intelligence Arup Nanda Longtime Oracle DBA Effect of Stats on Two Columns Optimizer Statistics on tables and indexes are vital for the optimizer to compute optimal execution plans If there are stats on two different columns used in the query, how does the optimizer decide? It takes the selectivity it of each column, and multiplies that to get the selectivity for the query. 2

Example Two columns Month of Birth: selectivity = 1/12 Zodiac Sign: selectivity = 1/12 What will be the selectivity of a query? Where zodiac sign = Pisces And month of birth = January Problem: According to the optimizer it will be 1/12 + 1/12 In reality, it will be 0, size the combination is not possible 3 Multi-column Intelligence If the Optimizer knew about these combinations, it would have been able to choose the proper p path How would you let the optimizer learn about these? In Oracle 10g, we saw a good approach SQL Profiles which allowed data to be considered for execution plans but was not a complete approach it still lacked a dynamism applicability in all circumstances In 11g, there is an ability to provide this information to the optimizer Multi-column stats 4

An Example Table BOOKINGS Index on (HOTEL_ID, RATE_CODE) What will be plan for the following? select min(book_txn) from bookings where hotel_id = 10 and rate_code = 23 HOTEL_ID RATE_CODE COUNT(1) ---------- ---------- ---------- 10 11 444578 10 12 50308 20 22 100635 20 23 404479 vals.sql 5 The Plan Here is the plan Id Operation Name Rows Bytes Cost (%CPU) Time 0 SELECT STATEMENT 1 10 769 (3) 00:00:10 1 SORT AGGREGATE 1 10 * 2 TABLE ACCESS FULL BOOKINGS 199K 1951K 769 (3) 00:00:10 Predicate Information (identified by operation id): PLAN_ TABLE_ OUTPUT --------------------------------------------------- 2 - filter("rate_code"=23 AND "HOTEL_ID"=10)) It didn t choose index scan expl1.sql The estimated number of rows are 199K, or about 20%; so full table scan was favored over index scan 6

Solution Create Extended Stats in the related columns HOTEL_ID and RATE_CODE var ret varchar2(2000) begin :ret := dbms_ stats.create_ extended_ stats( 'ARUP', 'BOOKINGS','(HOTEL_ID, RATE_CODE)' ); end; / print ret The variable ret shows the name of the extended statistics xstats.sql 7 Then Collect Stats Normally begin dbms_stats.gather_table_stats t th t bl t t ( ownname => 'ARUP', tabname => 'BOOKINGS', estimate_percent=> 100, method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY', cascade => true ); end; / stats.sql 8

The Plan Now After extended stats, the plan looks like this: ---------------- Id Operation Name Rows Bytes Cost (%CPU) Time ---------------- 0 SELECT STATEMENT 1 10 325 (1) 00:00:04 1 SORT AGGREGATE 1 10 2 TABLE ACCESS BY INDEX ROWID BOOKINGS 23997 234K 325 (1) 00:00:04 * 3 INDEX RANGE SCAN IN_BOOKINGS_01 23997 59 (0) 00:00:01 ---------------- Note: No of Rows is now more accurate As a result, the index scan was chosen expl1.sql 9 Extended Stats Extended stats store the correlation of data among the columns The correlation helps optimizer decide on an execution path that takes into account the data Execution plans are more accurate Under the covers, extended stats create an invisible virtual column Stats on the columns collects stats on this virtual column as well 10

10053 Trace Single Table Cardinality Estimation for BOOKINGS[BOOKINGS] Column (#2): NewDensity:0.247422, OldDensity:0.000000 BktCnt:1000000, PopBktCnt:1000000, PopValCnt:2, NDV:2 Column (#3): NewDensity:0.025295, OldDensity:0.000000 BktCnt:1000000, PopBktCnt:1000000, PopValCnt:4, NDV:4 Column (#5): NewDensity:0.025295, OldDensity:0.000000 BktCnt:1000000, PopBktCnt:1000000, PopValCnt:4, NDV:4 ColGroup (#1, VC) SYS_STU4JHE7J4YQ3ZLDXSW5L1O8KX Col#: 2 3 CorStregth: 2.00 ColGroup Usage:: PredCnt: 2 Matches Full: Using density: 0.025295 of col #5 as selectivity of unpopular value pred 11 Extended Stats This hidden virtual column shows up in column statistics select column_name, density, num_distinct from user_ tab_ col_ statistics where table_name = 'BOOKINGS' COLUMN _ NAME DENSITY NUM _ DISTINCT ------------------------------ ---------- ------------ BOOKING_ID.000001 1000000 HOTEL_ID.0000005 2 RATE_CODE.0000005 4 BOOK_TXN.002047465 2200 SYS_STU4JHE7J4YQ3ZLDXSW5L1O8KX STU4JHE7J4YQ3ZLDXSW5L1O8KX.0000005 4 12

Checking for Extended Stats To check the presence of extended stats, check the view dba_stat_extensions. select extension_name, name extension from dba_stat_extensions where table_ name='bookings'; Output: EXTENSION_NAME EXTENSION ------------------------------ ------------------------ SYS_STU4JHE7J4YQ3ZLDXSW5L1O8KX ("HOTEL_ID","RATE_CODE") check.sql 13 Deleting Extended Stats If you want, you can drop the extended stats, you can use the dbms_stats package, specifically the procedure drop_exteneded_stats begin dbms_stats.drop_extended_stats ( ownname => 'ARUP', tabname => 'BOOKINGS',' extension => '("HOTEL_ID","RATE_CODE")' ); end; drop.sql 14

Another way You can collect the extended stats using the normal dbms_statsstats as well: begin dbms_stats.gather_table_stats ( ownname => 'ARUP', tabname => 'BOOKINGS', estimate_percent => 100, method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (HOTEL_ID,RATE_CODE)', cascade => true ); end; / startx.sql 15 The Case on Case Sensitivity A table of CUSTOMERS with 1 million rows LAST_NAME field has the values McDonald 20% MCDONALD 10% McDONALD 10% mcdonald 10% They make up 50% of the rows, with the variation of the same name. When you issue a query like this: select * from customers where upper(last_name) = 'MCDONALD' 16

Normal Plan No of rows wrongly estimated The plan looks like this: Id Operation Name Rows Bytes Cost (%CPU) Time 0 SELECT STATEMENT 10000 498K 2140 (2) 00:00:26 * 1 TABLE ACCESS FULL CUSTOMERS 10000 498K 2140 (2) 00:00:26 Predicate Information (identified d by operation id): --------------------------------------------------- PLAN_TABLE_OUTPUT ----------------------------------------------------------------------------- 1 - filter(upper("last_name")='mcdonald') expl2.sql 17 Extended Stats You collect the stats for the UPPER() function begin dbms_stats.gather_table_stats ( ownname => 'ARUP', tabname => 'CUSTOMERS', method_opt => 'for all columns size skewonly for columns (upper(last_name))' ); end; statsx_cust.sql 18

With Extended Stats No of rows The plan is now: No of rows correctly estimated Id Operation Name Rows Bytes Cost (%CPU) Time 0 SELECT STATEMENT 500K 33M 2140 (2) 00:00:26 * 1 TABLE ACCESS FULL CUSTOMERS 500K 33M 2140 (2) 00:00:26 Predicate Information (identified by operation id): --------------------------------------------------- expl2.sql l PLAN_TABLE_OUTPUT 1 - filter("customers"."sys SYS_STUJ6BPFDTE396EPTURAB2DBI5"='MCDONALD') Extended stats name come here 19 Alternatives Remember, the extended stats create a virtual column hidden from you You can have the same functionality as extended stats by defining virtual columns Advantage You can have a column name of your choice You can index it, if needed You can partition it You can create Foreign Key constraints on it 20

Restrictions Has to be 11.0 or higher Not for SYS owned tables Not on IOT, clustered tables, GTT or external tables Can t be on a virtual column An Expression can t contain a subquery must have 1 columns A Column Group no of columns should be 32 and 2 can t contain expressions can t have the same column repeated 21 Summary Normally the optimizer does not know the correlation between the columns e.g. no one born in January can have a sun sign of Pisces Therefore, perform an index scan if that combination is passed as predicate Extended d statistics ti ti enable the optimizer i to know that relationship More information to the optimizer i results in better plans 22

Thank You! Download Scripts: proligence.com/pres/sangam11 My Blog: arup.blogspot.com My Tweeter: arupnanda 23