Slide 2 of 79 12/03/2011


Doug Burns

Introduction Simple Fundamentals Statistics on Partitioned Objects The Quality/Performance Trade-off Aggregation Scenarios Alternative Strategies Incremental Statistics Conclusions and References Slide 2 of 79

Who am I? Why am I talking? Setting Expectations Slide 3 of 79

Possibly a question some of us will be asking ourselves at 8:30 am tomorrow after tonight's party. I am Doug. Doug I am. Actually I am Douglas or, if you're Scottish, Dougie or Doogie. I'm not from round here. You will have probably noticed that already. See Twitter @doug_conference for lots of whining about my 21 hour journey. Slide 4 of 79


1986 Zilog Z80A (3.5MHz) 32KB Usable RAM Yes, Cary, we used profiles! Slide 9 of 79

Partitioned objects are a given when working with large databases. Maintaining statistics on partitioned objects is one of the primary challenges of the DW designer/developer/DBA. There are many options that vary between versions, but the fundamental challenges are the same: a trade-off between statistics quality and collection effort. People keep getting it wrong! Slide 10 of 79

What I will and won't include: No Histograms, No Sampling Sizes, No Indexes, No Detail. Level of depth: paper. WeDoNotUseDemos. A lot to get through! Questions. Slide 11 of 79

Simple Fundamentals Slide 12 of 79

The CBO evaluates potential execution plans using rules and formulae embedded in the code. Some control is available through configuration parameters, hints and statistics. Statistics either describe the content of data objects (Object Statistics), e.g. tables, indexes and clusters, or describe system characteristics (System Statistics). Slide 13 of 79

The CBO uses statistics to estimate row source cardinalities: how many rows do we expect a specific operation to return? This is the primary driver in selecting the best operations to perform and their order. Inaccurate or missing statistics are the most common cause of sub-optimal execution plans. Hard work on designing and implementing appropriate statistics maintenance will pay off across the system. Slide 14 of 79

Statistics on Partitioned Objects Slide 15 of 79

[Diagram: TEST_TAB1 at three levels. Global: the whole table. Partition: range partitioned by date (P_20110201, P_20110202). Subpartition: list subpartitioned by source system (Moscow, London, Others).] Slide 16 of 79

Global: describe the entire table or index and all of its underlying partitions and subpartitions as a whole. Important /NO. Partition: describe individual partitions and potentially the underlying subpartitions as a whole. Important /NO. Subpartition: describe individual subpartitions. Implicitly, Slide 17 of 79

If a statement accesses multiple partitions the CBO will use Global Statistics. If a statement is able to limit access to a single partition, then the partition statistics can be used. If a statement accesses a single subpartition, then subpartition statistics can be used. However, prior to 10.2.0.4, subpartition statistics are rarely used. For most applications you will need both Global and Partition stats for the CBO to operate effectively Slide 18 of 79
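
One way to check which of these levels actually has statistics is to query the data dictionary. The query below is a minimal sketch, assuming the TEST_TAB1 example table is owned by the connected user; it uses the USER_TAB_STATISTICS view (available from 10g).

```sql
-- List the statistics held at table, partition and subpartition level.
-- Assumption: TEST_TAB1 is owned by the connected user.
SELECT object_type,          -- TABLE, PARTITION or SUBPARTITION
       partition_name,
       subpartition_name,
       num_rows,             -- NULL when no statistics exist at this level
       global_stats,         -- YES = gathered at this level, NO = aggregated or missing
       last_analyzed
FROM   user_tab_statistics
WHERE  table_name = 'TEST_TAB1'
ORDER  BY partition_name NULLS FIRST, subpartition_name NULLS FIRST;
```

A NULL NUM_ROWS at any level means the CBO has no statistics to use for that level.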

The Quality/Performance Trade-off Slide 19 of 79

TEST_TAB1 P_20110201 P_20110202 Moscow Moscow London Data loaded for Moscow / 20110202 Others Slide 20 of 79

TEST_TAB1 P_20110201 P_20110202 Moscow Moscow London Potentially Stale Statistics Others Slide 21 of 79

GRANULARITY | Statistics Gathered
ALL | Global, Partition and Subpartition
AUTO | Determines granularity based on partitioning type. This is the default
DEFAULT | Gathers global and partition-level stats. This option is deprecated, and while currently supported, it is included in the documentation for legacy reasons only. You should use 'GLOBAL AND PARTITION' for this functionality.
GLOBAL | Global
GLOBAL AND PARTITION | Global and Partition (but not subpartition) stats
PARTITION | Partition (specify PARTNAME for a specific partition. Default is all partitions.)
SUBPARTITION | Subpartition (specify PARTNAME for a specific subpartition. Default is all subpartitions.)
Slide 22 of 79

TEST_TAB1 P_20110201 P_20110202 Moscow Moscow London Others dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'SUBPARTITION', partname => 'P_20110202_MOSCOW'); Slide 23 of 79

TEST_TAB1 P_20110201 P_20110202 Moscow Moscow London Others dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'ALL'); Slide 24 of 79

TEST_TAB1 P_20110201 P_20110202 Moscow Moscow London Others dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'GLOBAL'); Slide 25 of 79

TEST_TAB1 P_20110201 P_20110202 Moscow London Others Moscow dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'DEFAULT', partname => 'P_20110202_MOSCOW'); dbms_stats.gather_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', granularity => 'GLOBAL AND PARTITION', partname => 'P_20110202_MOSCOW'); Slide 26 of 79

To address the high cost of collecting Global Stats, Oracle provides another option: Aggregated or Approximate Global Stats. Only gather stats on the lower levels of the object (Partition on partitioned tables, Subpartition on composite-partitioned tables). DBMS_STATS will aggregate the underlying statistics to generate approximate global statistics at higher levels. Important: GLOBAL_STATS=NO. Slide 27 of 79

TEST_TAB1 GLOBAL_STATS=NO NUM_ROWS = 11 GRANULARITY => 'SUBPARTITION' P_20110201 GLOBAL_STATS=NO P_20110202 GLOBAL_STATS=NO NUM_ROWS = 8 8 rows inserted for Moscow 20110202 MOSCOW LONDON MOSCOW NUM_ROWS = 5 Slide 28 of 79

TEST_TAB1 GLOBAL_STATS=NO NUM_ROWS = 11 19 P_20110201 GLOBAL_STATS=NO P_20110202 GLOBAL_STATS=NO NUM_ROWS = 8 16 Stats gathered on subpartition MOSCOW LONDON MOSCOW NUM_ROWS = 5 11 Slide 29 of 79

TEST_TAB1 STATUS NDV = 1 STATUS H/L = P/P NDV = Number of Distinct Values in STATUS H/L = Highest and Lowest P_20110201 STATUS NDV = 1 STATUS H/L = P/P P_20110202 STATUS NDV = 1 STATUS H/L = P/P MOSCOW LONDON MOSCOW STATUS NDV = 1 STATUS H/L = P/P STATUS NDV = 1 STATUS H/L = P/P STATUS NDV = 1 STATUS H/L = P/P Slide 30 of 79

TEST_TAB1 STATUS NDV = 1 4 STATUS H/L = P/P P/U P_20110201 STATUS NDV = 1 STATUS H/L = P/P P_20110202 STATUS NDV = 1 3 STATUS H/L = P/P P/U New STATUS=U appeared MOSCOW LONDON MOSCOW STATUS NDV = 1 STATUS H/L = P/P STATUS NDV = 1 STATUS H/L = P/P STATUS NDV = 1 2 STATUS H/L = P/P P/U Slide 31 of 79

You have a choice. Either gather True Global Stats: more accurate NDVs, but requires a high-cost full table scan (which will get progressively slower and more expensive as tables grow). Maybe an occasional activity? Or gather True Partition Stats and Aggregated Global Stats: accurate row counts and column High/Low values, but wildly inaccurate NDVs; requires low-cost partition scan activity plus aggregation. Slide 32 of 79
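
To see how far aggregated NDVs have drifted, the column statistics at each level can be compared directly. This is a sketch only: it assumes TEST_TAB1 is owned by the connected user and that STATUS is a character column (the P and U values in the example suggest so), which is why the RAW low and high values are decoded with UTL_RAW.CAST_TO_VARCHAR2.

```sql
-- Compare NDV and High/Low values for STATUS at table and partition level.
-- Assumptions: TEST_TAB1 is owned by the connected user; STATUS is VARCHAR2/CHAR.
SELECT 'TABLE'                              AS stats_level,
       CAST(NULL AS VARCHAR2(30))           AS partition_name,
       num_distinct,
       global_stats,                        -- NO indicates aggregated values
       UTL_RAW.CAST_TO_VARCHAR2(low_value)  AS low_val,
       UTL_RAW.CAST_TO_VARCHAR2(high_value) AS high_val
FROM   user_tab_col_statistics
WHERE  table_name  = 'TEST_TAB1'
AND    column_name = 'STATUS'
UNION ALL
SELECT 'PARTITION',
       partition_name,
       num_distinct,
       global_stats,
       UTL_RAW.CAST_TO_VARCHAR2(low_value),
       UTL_RAW.CAST_TO_VARCHAR2(high_value)
FROM   user_part_col_statistics
WHERE  table_name  = 'TEST_TAB1'
AND    column_name = 'STATUS';
```

USER_SUBPART_COL_STATISTICS can be queried in the same way for the subpartition level.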

Aggregation Scenarios Slide 33 of 79

Take care if you decide to use Aggregated Global Stats. Several implicit rules govern the aggregation process. I have seen every issue I'm about to describe in the past 18 months, working on systems with people who are usually pretty smart. Slide 34 of 79

Scenario 1: Aggregated Global Stats at Table-level. Subpartition Stats gathered at subpartition-level as part of new subpartition load process. Emergency hits when someone tries to INSERT data for which there is no valid subpartition. Solution: quickly add a new partition and gather stats on the new subpartition. Slide 35 of 79

TEST_TAB1 GLOBAL_STATS=NO NUM_ROWS = 11 P_20110201 GLOBAL_STATS=NO NUM_ROWS = 11 MOSCOW NUM_ROWS = 11 Slide 36 of 79

TEST_TAB1 GLOBAL_STATS=NO NUM_ROWS IS? What will number of rows be? P_20110201 GLOBAL_STATS=NO NUM_ROWS = 11 New subpartition with no stats yet P_20110202 GLOBAL_STATS=NO NUM_ROWS IS? New data inserted and stats gathered MOSCOW LONDON MOSCOW NUM_ROWS = 11 GLOBAL_STATS=NO NUM_ROWS = NULL Slide 37 of 79

TEST_TAB1 GLOBAL_STATS=NO NUM_ROWS IS NULL Aggregated global stats invalidated P_20110201 GLOBAL_STATS=NO NUM_ROWS = 11 MOSCOW No partition stats as not all subpartitions have stats LONDON P_20110202 GLOBAL_STATS=NO NUM_ROWS IS NULL MOSCOW NUM_ROWS = 11 GLOBAL_STATS=NO NUM_ROWS = NULL Slide 38 of 79

TEST_TAB1 GLOBAL_STATS=NO NUM_ROWS IS 14... and fixes aggregated global stats P_20110201 GLOBAL_STATS=NO NUM_ROWS = 11 P_20110202 GLOBAL_STATS=NO NUM_ROWS IS 3... updates aggregated stats on partition MOSCOW LONDON MOSCOW NUM_ROWS = 11 Gathering stats on all subpartitions... NUM_ROWS = 0 Slide 39 of 79

Scenario 2: Aggregated Global Stats at Table-level. Partition Stats gathered at Partition-level as part of new partition load process. Performance of several queries is horrible and poor NDVs at the Table-level are identified as root cause. Solution: gather Global Stats quickly! Slide 40 of 79

TEST_TAB1 GLOBAL_STATS=NO P_20110201 GLOBAL_STATS=NO MOSCOW Slide 41 of 79

TEST_TAB1 Global Stats gathered P_20110201 GLOBAL_STATS=NO MOSCOW Slide 42 of 79

TEST_TAB1 NUM_ROWS =? What will new New partition & number of subpartitions with rows be? stats gathered P_20110201 GLOBAL_STATS=NO P_20110202 GLOBAL_STATS=NO NUM_ROWS = 8 MOSCOW LONDON MOSCOW NUM_ROWS = 5 Slide 43 of 79

TEST_TAB1 P_20110201 GLOBAL_STATS=NO P_20110202 GLOBAL_STATS=NO NUM_ROWS = 8 MOSCOW LONDON MOSCOW NUM_ROWS = 5 Slide 44 of 79

Scenario 3: Aggregated Global Stats at Table-level. Statistics are gathered on a temporary Load Table. The Load Table is exchanged with a partition of the target table. The objective is to minimise activity on the target table and ensure that stats are available on the partition immediately on exchange. Slide 45 of 79

TEST_TAB1 GLOBAL_STATS=NO P_20110201 GLOBAL_STATS=NO Temporary Load Table with stats MOSCOW LOAD_TAB1 NUM_ROWS = 10 Slide 46 of 79

TEST_TAB1 GLOBAL_STATS=NO New Partition & Subpartition without stats P_20110201 GLOBAL_STATS=NO P_20110202 GLOBAL_STATS=NO NUM_ROWS IS NULL MOSCOW LONDON LOAD_TAB1 GLOBAL_STATS=NO NUM_ROWS IS NULL NUM_ROWS = 10 Slide 47 of 79

TEST_TAB1 GLOBAL_STATS=NO NUM_ROWS =? All subpartitions have stats, so what happened to Global Stats? P_20110201 GLOBAL_STATS=NO Data and stats appear at partition exchange P_20110202 GLOBAL_STATS=NO NUM_ROWS =? MOSCOW LONDON NUM_ROWS = 10 LOAD_TAB1 GLOBAL_STATS=NO NUM_ROWS IS NULL Slide 48 of 79

TEST_TAB1 GLOBAL_STATS=NO No statistics aggregation! P_20110201 GLOBAL_STATS=NO P_20110202 GLOBAL_STATS=NO NUM_ROWS IS NULL MOSCOW LONDON NUM_ROWS = 10 Slide 49 of 79

Hidden parameter _minimal_stats_aggregation is used to minimise the impact of the statistics aggregation process. Default is TRUE, which means minimise aggregation. Partition exchange will not trigger the aggregation process! Solutions: change the hidden parameter (speak to Support), or Exchange-then-Gather (another good reason for this later). Slide 50 of 79
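
Exchange-then-Gather can be sketched as below. The object names follow the earlier diagrams; the assumption here is that the load table is exchanged at subpartition level and that the target subpartition is named P_20110202_MOSCOW. Because the gather runs against the target table after the exchange, DBMS_STATS itself is responsible for any aggregation to the higher levels.

```sql
-- Step 1: swap the loaded segment into the target table.
-- Assumption: LOAD_TAB1 matches the structure of the TEST_TAB1 subpartition.
ALTER TABLE test_tab1
  EXCHANGE SUBPARTITION p_20110202_moscow WITH TABLE load_tab1;

-- Step 2: gather on the target so that aggregation runs as part of the gather,
-- rather than relying on statistics carried across by the exchange.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => 'TESTUSER',
    tabname     => 'TEST_TAB1',
    partname    => 'P_20110202_MOSCOW',
    granularity => 'SUBPARTITION');
END;
/
```

The trade-off is that the gather now runs against the live target table instead of the load table.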

Wildly inaccurate NDVs, which will impact Execution Plans. Take care with the aggregation process. Do not use aggregated statistics unless you really don't have time to gather true Global Stats. But the problem is, what if your table is so damn big that you can never manage to update those Global Stats? Slide 51 of 79

Alternative Strategies Slide 52 of 79

If stats collection is such a nightmare, perhaps we shouldn't bother gathering stats at all? Dynamic Sampling could be used: gather no stats manually and, when statements are parsed, Oracle will execute queries against objects to generate temporary stats on-the-fly. I would not recommend this as a system-wide strategy. Remember what happened when stats were missing in earlier examples! Recurring overhead for every query. Either expensive or low quality stats. Slide 53 of 79

Gathering stats takes time and resources. The resulting stats describe your data to help the CBO determine optimal execution plans. If you know your data well enough to know the appropriate stats, why not just set them manually and avoid the collection overhead? There are plenty of appropriate DBMS_STATS procedures. Not a new idea, and discussed in several places on the net (including the JL chapter in the latest Oak Table book). Slide 54 of 79
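
As a minimal sketch of setting statistics by hand, DBMS_STATS.SET_TABLE_STATS can populate the row count, block count and average row length of a partition. The numbers below are purely illustrative and are not taken from the presentation; column statistics would need setting separately (for example with DBMS_STATS.SET_COLUMN_STATS).

```sql
BEGIN
  DBMS_STATS.SET_TABLE_STATS(
    ownname  => 'TESTUSER',
    tabname  => 'TEST_TAB1',
    partname => 'P_20110202',
    numrows  => 1000000,   -- illustrative value: expected rows in the partition
    numblks  => 20000,     -- illustrative value: expected blocks
    avgrlen  => 120);      -- illustrative value: average row length in bytes
END;
/
```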

Positives: Very fast and low resource method for setting statistics on new partitions. Potential improvements to plan stability when accessing time-period partitions that are filled over time. Negatives: You need to know your data well, particularly any time periodicity. You need to develop your own code implementation. You could undermine the CBO's ability to use more appropriate execution plans as data changes over time. Does not eliminate the difficulty in maintaining accurate Global Statistics, although these could be set manually too. Slide 55 of 79

Extending the concept of setting statistics manually: instead of trying to work out what the appropriate statistics are for a new partition, copy the statistics from another partition. The previous partition (increasing volumes?). A golden template partition (plan stability?). A prior partition to reflect the periodicity of your data: the second Tuesday from last month, Tuesday from last week, the 8th of last month. Supported from 10.2.0.4. Slide 56 of 79

TEST_TAB1 P_20110201 MOSCOW dbms_stats.copy_table_stats('TESTUSER', 'TEST_TAB1', srcpartname => 'P_20110201', dstpartname => 'P_20110202'); dbms_stats.copy_table_stats('TESTUSER', 'TEST_TAB1', srcpartname => 'P_20110201_MOSCOW', dstpartname => 'P_20110202_MOSCOW'); Slide 57 of 79

TEST_TAB1 P_20110201 P_20110202 MOSCOW MOSCOW Slide 58 of 79

The previous example doesn't work on an unpatched 10.2.0.4 when copying stats between partitions on a composite partitioned object (one with subpartitions):
SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20110201', dstpartname => 'P_20110202');
BEGIN dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20110201', dstpartname => 'P_20110202'); END;
*
ERROR at line 1:
ORA-06533: Subscript beyond count
ORA-06512: at "SYS.DBMS_STATS", line 17408
ORA-06512: at line 1
Slide 59 of 79

Bug number 8318020. Merge Label Request 8866627 (fixes a variety of stats-related bugs). Patchset 10.2.0.5. Upgrade to 11.2.0.2. Slide 60 of 79

TEST_TAB1 REPORTING_DATE High/Low = 20110201 P_20110201 P_20110202 REPORTING_DATE High/Low = 20110201 Slide 61 of 79

TEST_TAB1 REPORTING_DATE High/Low = 20110201 P_20110201 REPORTING_DATE High/Low = 20110201 P_20110202 REPORTING_DATE High/Low = 20110201 Slide 62 of 79

We might reasonably expect Oracle to understand the implicit High/Low values of a partition key. Merge Label Request 8866627. Patchset 10.2.0.5. Upgrade to 11.2. The wider issue here is that High/Low values (other than Partition Key columns and NDVs) will simply be copied. Are you sure that's what you want? Slide 63 of 79

TEST_TAB1 P_20110201 P_20110202 OTHERS OTHERS Slide 64 of 79

ORA-03113 / 07445 while copying list partition statistics. Core dump in qospminmaxpartcol. I initially thought this was because the OTHERS subpartition was the last one I copied stats for. It is because it is a DEFAULT list subpartition. Bug number 10268597. Still in 10.2.0.5 and 11.2.0.2. Marked as fixed in 11.2.0.3 and 12.1.0.0. Slide 65 of 79

Positives: Very fast and low resource method for setting statistics on new partitions. Potential improvements to plan stability when accessing time-period partitions that are filled over time. Negatives: Bugs and related patches, although better using 10.2.0.5 or 11.2. Does not eliminate the difficulty in maintaining accurate Global Statistics. Does not work well with composite partitioned tables. Does not work in current releases with List Partitioning where there is a DEFAULT partition. Slide 66 of 79

'APPROX_GLOBAL AND PARTITION': new 10.2 GRANULARITY option as an alternative to GLOBAL AND PARTITION. Uses the aggregation process, but can replace gathered global statistics. If the aggregation process is unavailable, e.g. because there are missing partition statistics, it falls back to GLOBAL AND PARTITION. All the same NDV issues as with aggregated stats, so you should use it alongside an occasional Global Stats gather process. Slide 67 of 79
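
A call using this option might look like the sketch below. The granularity value 'APPROX_GLOBAL AND PARTITION' is the one named on the conclusions slide; the owner, table and partition names are those of the running example.

```sql
BEGIN
  -- Gather partition statistics and let DBMS_STATS derive the global
  -- statistics by aggregation instead of scanning the whole table.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => 'TESTUSER',
    tabname     => 'TEST_TAB1',
    partname    => 'P_20110202',
    granularity => 'APPROX_GLOBAL AND PARTITION');
END;
/
```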

Incremental Statistics Slide 68 of 79

What's the problem with the process for aggregating NDVs? Oracle knows the number of distinct values in the other partitions but not what those values were. This might seem counter-intuitive: Oracle must have known what the values were when stats were gathered, but they are not stored anywhere. Aggregation is a destructive process. The Incremental Statistics feature tracks the distinct values, stored as synopses in WRI$_OPTSTAT_SYNOPSIS_HEAD$ and WRI$_OPTSTAT_SYNOPSIS$. Slide 69 of 79

Prerequisites: the INCREMENTAL setting for the partitioned table is TRUE (set using DBMS_STATS.SET_TABLE_PREFS). The PUBLISH setting for the partitioned table is TRUE (which is the default setting anyway). The user specifies (both defaults) ESTIMATE_PERCENT => AUTO_SAMPLE_SIZE and GRANULARITY => 'AUTO'. Slide 70 of 79

Gather initial statistics using the default settings. Oracle will gather statistics at all appropriate levels using one-pass distinct sampling and store initial synopses. As partitions are added or stats become stale, keep gathering using AUTO granularity and Oracle will: gather missing or stale partition stats; update synopses for those partitions; merge the synopses with synopses for higher levels of the same object, maintaining all Global Stats along the way. An intelligent and accurate aggregation process. Slide 71 of 79
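
The prerequisites and the gather itself can be sketched as follows for the running example, assuming an 11g database where the INCREMENTAL preference exists. The gather call deliberately passes no ESTIMATE_PERCENT or GRANULARITY, so the defaults apply (unless those preferences have been changed elsewhere, which is an assumption to verify on a real system).

```sql
-- Switch the table to incremental statistics maintenance.
BEGIN
  DBMS_STATS.SET_TABLE_PREFS('TESTUSER', 'TEST_TAB1', 'INCREMENTAL', 'TRUE');
END;
/

-- Confirm the two preferences that the feature depends on.
SELECT DBMS_STATS.GET_PREFS('INCREMENTAL', 'TESTUSER', 'TEST_TAB1') AS incremental,
       DBMS_STATS.GET_PREFS('PUBLISH',     'TESTUSER', 'TEST_TAB1') AS publish
FROM   dual;

-- Gather with default settings; repeat the same call after each load.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'TESTUSER',
    tabname => 'TEST_TAB1');
END;
/
```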

Amit Poddar's excellent paper and presentation from an earlier Hotsos Symposium. Robin Moffatt's blog post. Synopses can take a lot of space in SYSAUX. Aggregation seems hopelessly slow in older releases, probably because WRI$_OPTSTAT_SYNOPSIS$ is not partitioned (it is in 11.2.0.2). Incremental Stats looks like the solution to our problems, if you have the time to gather using defaults. Slide 72 of 79

Conclusions and References Slide 73 of 79

Aggregated NDVs are very low quality. DBMS_STATS will only update aggregated stats when stats have been gathered appropriately on all underlying structures. DBMS_STATS will never overwrite properly gathered Global Stats with aggregated results, unless you use 'APPROX_GLOBAL AND PARTITION'. APPROX_GLOBAL stats otherwise suffer from the same problems as any other aggregated stats. If aggregation fails because of missing partition stats, you will suddenly be using GLOBAL AND PARTITION. Slide 74 of 79

Dynamic Sampling is almost certainly not the answer to your problems. The default setting of _minimal_stats_aggregation implies that you should normally use exchange-then-gather. If you are using Incremental Stats you must use exchange-then-gather anyway. Slide 75 of 79

Try the Oracle default options first, particularly on 11.2 and up. If you do not have time to gather using the default granularity, gather the best statistics you can as data is loaded and gather proper global statistics later. DBMS_STATS is constantly evolving, so you should try to be on the latest patchsets with all relevant one-off patches applied. Checking stats means checking all levels, including the GLOBAL_STATS column, NUM_DISTINCT and High/Low values. Slide 76 of 79

Design a strategy. Develop any surrounding code. Stick to the strategy. Always gather stats using the wrapper code. Lock and unlock stats programmatically to prevent human errors ruining the strategy. Slide 77 of 79
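
A wrapper along these lines is one possible shape for that strategy. It is a sketch only: the procedure name gather_part_stats and its parameters are hypothetical, and the gather options are left at their defaults. Statistics stay locked except while the wrapper is running, so an ad-hoc gather outside the wrapper fails on the lock instead of silently changing the stats.

```sql
-- Hypothetical wrapper: the only sanctioned way to gather stats on the table.
CREATE OR REPLACE PROCEDURE gather_part_stats (
  p_owner    IN VARCHAR2,
  p_table    IN VARCHAR2,
  p_partname IN VARCHAR2
) AS
BEGIN
  -- Unlock just for the duration of the gather.
  DBMS_STATS.UNLOCK_TABLE_STATS(ownname => p_owner, tabname => p_table);

  DBMS_STATS.GATHER_TABLE_STATS(
    ownname  => p_owner,
    tabname  => p_table,
    partname => p_partname);

  -- Re-lock so that nothing outside the wrapper can change the stats.
  DBMS_STATS.LOCK_TABLE_STATS(ownname => p_owner, tabname => p_table);
EXCEPTION
  WHEN OTHERS THEN
    -- Make sure a failed gather does not leave the stats unlocked.
    DBMS_STATS.LOCK_TABLE_STATS(ownname => p_owner, tabname => p_table);
    RAISE;
END gather_part_stats;
/
```

It could then be called as `exec gather_part_stats('TESTUSER', 'TEST_TAB1', 'P_20110202')` at the end of each load.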

Optimiser Development Group blog. Greg Rahn's blog. Amit Poddar's paper. Jonathan Lewis' chapter in the latest Oak Table book. Lots of others in the references section of the paper. Slide 78 of 79

Doug Burns dougburns@yahoo.com http://oracledoug.com/stats.docx