Kerry Osborne Enkitec Chris Wones - dunnhumby E My PX Goes to 11

Similar documents
Exadata. Presented by: Kerry Osborne. February 23, 2012

Real World Exadata. Presented by: Kerry Osborne. March 7, 2012

Exadata X3 in action: Measuring Smart Scan efficiency with AWR. Franck Pachot Senior Consultant

Automatic Parallel Execution Presented by Joel Goodman Oracle University EMEA

Oracle. Exam Questions 1Z Oracle Database 11g Release 2: SQL Tuning Exam. Version:Demo

Workload Management for an Operational Data Warehouse Oracle Database Jean-Pierre Dijcks Sr. Principal Product Manager Data Warehousing

Adaptive Optimization. Presented by: Kerry Osborne Red Gate Webinar, Nov. 2013

SQL Plan Management. on 12c Kerry Osborne OakTable World, 2013

Oracle DB-Tuning Essentials

Exadata Implementation Strategy

Parallel Execution with Oracle Database 18c Fundamentals ORACLE WHITE PAPER FEBRUARY 2018

Enkitec Exadata Implementation & POC Success Stories

Oracle Exadata Implementation Strategy HHow to Implement Exadata In-house

<Insert Picture Here> Controlling resources in an Exadata environment

<Insert Picture Here> Controlling resources in an Exadata environment

Harnessing Power of Parallelism in an Oracle Data Warehouse

Key to A Successful Exadata POC

Was ist dran an einer spezialisierten Data Warehousing platform?

Oracle Exadata Performance

Deployment Best Practices for PPM Operational Reporting

Advanced Oracle SQL Tuning v3.0 by Tanel Poder

Oracle Autonomous Database

Top 5 Issues that Cannot be Resolved by DBAs (other than missed bind variables)

Does Exadata Need Performance Tuning? Jože Senegačnik, Oracle ACE Director, Member of OakTable DbProf d.o.o. Ljubljana, Slovenia

Background. $VENDOR wasn t sure either, but they were pretty sure it wasn t their code.

Oracle Exadata X7. Uwe Kirchhoff Oracle ACS - Delivery Senior Principal Service Delivery Engineer

Slide 2 of 79 12/03/2011

Oracle Exadata: Strategy and Roadmap

Top 7 Plan Stability Pitfalls & How to Avoid Them. Neil Chandler Chandler Systems Ltd UK

Copyright 2018, Oracle and/or its affiliates. All rights reserved.

Exadata Implementation Strategy

SQL Gone Wild: Taming Bad SQL the Easy Way (or the Hard Way) Sergey Koltakov Product Manager, Database Manageability

Oracle Database New Performance Features

Applying a Blockcentric Approach to Oracle Tuning. Daniel W. Fink

Effec%ve Use of Oracle s 12c Database Opera%on Monitor

Real-World Performance Training Exadata and Database In-Memory

Oracle 1Z Oracle Database 11g Release 2- SQL Tuning. Download Full Version :

SQL Transla+on Framework

OpenWorld 2018 SQL Tuning Tips for Cloud Administrators

Lecture 12. Lecture 12: The IO Model & External Sorting

Business Analytics in the Oracle 12.2 Database: Analytic Views. Event: BIWA 2017 Presenter: Dan Vlamis and Cathye Pendley Date: January 31, 2017

Identifying and Fixing Parameter Sniffing

CIS 45, The Introduction. What is a database? What is data? What is information?

Oracle 1Z0-515 Exam Questions & Answers

Speech 2 Part 2 Transcript: The role of DB2 in Web 2.0 and in the IOD World

TIM 50 - Business Information Systems

Background. Let s see what we prescribed.

Security analytics: From data to action Visual and analytical approaches to detecting modern adversaries

Learning Objectives : This chapter provides an introduction to performance tuning scenarios and its tools.

(RAPID) Landing Page Building. A Practical Guide Presented by Thrive Themes

Interpreting Explain Plan Output. John Mullins

Actual4Test. Actual4test - actual test exam dumps-pass for IT exams

Cache Fusion Demystified By Arup Nanda

[Contents. Sharing. sqlplus. Storage 6. System Support Processes 15 Operating System Files 16. Synonyms. SQL*Developer

Eine für Alle - Oracle DB für Big Data, In-memory und Exadata Dr.-Ing. Holger Friedrich

Db2 9.7 Create Table If Not Exists >>>CLICK HERE<<<

Oracle Database 18c New Performance Features

Under the Hood of Oracle Database Appliance. Alex Gorbachev

Querying Data with Transact-SQL

A Unit of SequelGate Innovative Technologies Pvt. Ltd. All Training Sessions are Completely Practical & Real-time

Recent Innovations in Data Storage Technologies Dr Roger MacNicol Software Architect

<Insert Picture Here> Managing Oracle Exadata Database Machine with Oracle Enterprise Manager 11g

DESIGNING FOR PERFORMANCE SERIES. Smokin Fast Queries Query Optimization

Eternal Story on Temporary Objects

Oracle: From Client Server to the Grid and beyond

Hybrid Columnar Compression (HCC) on Oracle Database 18c O R A C L E W H IT E P A P E R FE B R U A R Y

Real-World Performance Training SQL Performance

Oracle Diagnostics Pack For Oracle Database

Oracle Exadata. Smart Database Platforms - Dramatic Performance and Cost Advantages. Juan Loaiza Senior Vice President Oracle Database Systems

Leveraging Customer Behavioral Data to Drive Revenue the GPU S7456

Certification Exam Preparation Seminar: Oracle Database SQL

2

<Insert Picture Here> Inside the Oracle Database 11g Optimizer Removing the black magic

B. Using Data Guard Physical Standby to migrate from an 11.1 database to Exadata is beneficial because it allows you to adopt HCC during migration.

Oracle Optimizer: What s New in Oracle Database 12c? Maria Colgan Master Product Manager

Optimizing tempdb Performance

Pitfalls & Surprises with DBMS_STATS: How to Solve Them

Should You Drop Indexes on Exadata?

<Insert Picture Here> Looking at Performance - What s new in MySQL Workbench 6.2

Oracle Database In-Memory What s New and What s Coming

Open Connect: Starting from a Greenfield (a mostly Layer 0 talk) Dave Temkin 06/01/2015

(10393) Database Performance Tuning Hands-On Lab

Oracle #1 RDBMS Vendor

Consolidate with Oracle Exadata

An Oracle White Paper June Exadata Hybrid Columnar Compression (EHCC)

Real World Web Scalability. Ask Bjørn Hansen Develooper LLC

An Oracle White Paper April 2010

Session 1079: Using Real Application Testing to Successfully Migrate to Exadata - Best Practices and Customer Case Studies

Custom Performance Reporting Changes in Oracle 10g. Brian Doyle BEZ Systems VP, Product Service

White Paper. How the Meltdown and Spectre bugs work and what you can do to prevent a performance plummet. Contents

Oracle Database: Introduction to SQL Ed 2

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. reserved. Insert Information Protection Policy Classification from Slide 8

Designing dashboards for performance. Reference deck

Introduction to Statistics in SQL Server. Andy Warren

OracleMan Consulting

Evolving To The Big Data Warehouse

<Insert Picture Here> DBA s New Best Friend: Advanced SQL Tuning Features of Oracle Database 11g

MySQL Database Scalability

Jyotheswar Kuricheti

How to find and fix your. application performance problem. Cary Millsap Method R Corporation and Accenture Enkitec Group

Transcription:

Kerry Osborne Enkitec Chris Wones - dunnhumby E4 2014 My PX Goes to 11

About Me (Kerry) Working with Oracle since V2 (1982) Working with Exadata since V2 (2010) Co-author of Expert Oracle Exadata & Pro Oracle SQL I Work for Enkitec!

About Me (Chris) Working with Oracle since 1992 Working with Exadata since 2012 I work for dunnhumby! I have a party boat!

Obligatory Marketing Slide Enkitec is Awesome!

Obligatory Marketing Slide dunnhumby is also Awesome!

What s the Point? Big Data Warehouse Running on SAND DB (column oriented) Conversion to Exadata (X3-8) Pay Down Technical Debt Wanted consolidation from many data marts Needed Strong Vendor (Oracle) and partner (Enkitec) Security, Innovation, and Performance (SIP)

Teamwork Joint Effort dunnhumby and Enkitec Already had good relationship Both teams good technically dunnhumby provided domain knowledge Enkitec provided outside view Team members: dunnhumby: David, Michael, Jay, Springers, Rashmi, Enkitec: Tanel, Karen, Alex, Kerry, Oracle: Maria, Tom, Hermann, Sue,

Original Project Goals Verify Exadata Configuration was Optimal Individual Report Performance Overall Report Throughput Recommend Additional Hardware

Original Project Goals Configuration Just wanted make sure configuration was optimal As you know, Oracle is very configurable!

Original Project Goals Individual Report Performance 90% of 10% sample reports finish within 15 minutes (queue time + SQL run-time + serialisation time) 90% of 100% sample reports finish within 50 minutes (queue time + SQL run-time + serialisation time)

Original Project Goals Report Throughput Be able to run 1,000 reports per day 977 was highest volume ever run on old system Test Cycle was 12 hours

Original Project Goals Recommend Additional Hardware I ve never been asked ahead of time to recommend an alternative in case of failure. Probably should have been scared But we re always optimistic!

Some Numbers to Start With DB - 11.2.0.3 cellsrv 11.2.3.2.1 (version before auto flash cache scans) Best 12 hour throughput test was 224 Biggest table was 4TB+ Many other tables in 0.5-1TB range Platform was 4 node cluster (2 Exadata X3-8 s) Reports were heavily parallelized Using HCC Heavily Using Smart Scans Heavily Write Back Cache On Top SQL was mostly CTAS Some SQL Doing 100M+ gets and 150M+ pio

Initial Challenges - business Restatement of card to household relationship Product filter groups unknown Date ranges not always standard Segmentations defined using request criteria

Initial Challenges - technical Extremely Large Data Set No Pre-Aggregation Throughput was way lower than expectations Temp Usage was biggest bottle neck Variability of DOP was biggest frustration Code was still being modified

It Was a Little Out of Control! 3 Pronged Approach: Get DOP Settled Down - consistent DOP - eliminate downgrades Remove Bottle Necks - start with Temp - move to next one Improve Code - to do less work Note that we focused on throughput

Digression Tale w/ Multiple Story Lines Initial Architecture / Configuration Bottleneck Removal / Avoidance Coding Changes Oracle Features

Initial PX Setup parallel_adaptive_multi_user FALSE parallel_degree_limit 24 parallel_degree_policy AUTO parallel_force_local TRUE parallel_max_servers 1280 parallel_min_servers 96 parallel_servers_target 960 parallel_threads_per_cpu 1 X3-8 - 80 cores (for most of the project we had it capped at about 65%)

Initial DOP Control AutoDOP Tables Defined with DEFAULT Tables Defined with specific Degrees Hints used in Statements Alter Session used to control PX RESMGR used to control PX Downgrades Occurring?

DOP Control Too Many Knobs? Yes there are a lot of knobs! AutoDOP is an effort to fix this!

Evaluating Potential DOP Setups Goal No. 1 Improve stability Needed to decide quickly Came Up with 3 Options

Potential Setup #1 - Classic Policy = Manual Control with Hints (or alter session) Turn Everything Else Off Tried and True Has Worked in Past on Big Systems PX queuing = True

Digression PX Queuing I m a Big Fan! Unfortunately it s Linked to autodop Fortunately there is a separate parameter for it Unfortunately it s a hidden parameter Fortunately Some Brave Souls at Oracle Have Documented It Seems to be some debate inside Oracle about _parallel_statement_queuing In Memory Parallel can short circuit smart scans making performance unpredictable

Potential Setup #2 - LTD Policy = Limited Big Tables Degree = Default Turn Everything Else Off PX queuing = True

Potential Setup #3 - Jetson Policy = Auto Turn Everything Else Off It s the Wave of the Future But it s Not Widely Accepted And It s Not Well Documented But Some Day We ll All Have Jetpacks! Note that sekng policy=auto turns on In Memory PX as well Note also that Auto mode can kick in when policy = manual

We Picked Classic! parallel_degree_policy = MANUAL parallel_force_local = FALSE _parallel_statement_queuing = TRUE Mitigating Factors Time was short

Further DOP Challenges Still Too Much of a Good Thing! - Eliminate PX for SQL accessing < 1M rows - Reduce DOP in general - Differentiate between query and create on CTAS Still Lot s of Downgrades -Confusing because setting should have eliminated them -Forensics difficult -SQL Monitor has data but it ages out quickly

*^$%#% Downgrades! Setup -parallel_adaptive_multi_user=false -Queuing On (target set below max) Shouldn t have downgrades (unless single stmt asks for more than max) (or bugs) Difficulty in tracking slaves/execution started capturing v$pq_sesstat after each statement Realized CTE s and Unions causing multiple DFO s - resulting in multiple sets of slaves (bug) Note that resmgr only limits DOP, not number of slaves (in 11g)

*^$%#% Downgrades! Using gv$px_session (via Tanel s handy px.sql script): 248 s not too bad, but there we re many that were over 1000

*^$%#% Downgrades! Using v$pq_sesstat: 1152 = 96x12, if target is 1000 and max is 1200, 2 of these would run and 2 nd would be downgraded

Bottlenecks But back to Winterfell... Spilling to Temp - added as much PGA memory as possible - see Alex s presentation, Hotsos 2014 - increased PX for worst offenders - proposed moving to ZFS - see Alex s presentation, E4 2014 - of course doing less work is best - changes to SQL most effective tool The fastest way to do anything is not to do it. ~ Cary Millsap

Bottlenecks Contention - Overloading CPUs - AMM - seg$ - HWM Brokering - Temp file header (enq: SS) - CF enque - etc

Coding Now Back to King s Landing... Limit DOP Re-write to eliminate CTE s and Union Alls - To avoid multiple sets of slaves - To eliminate multiple passes through the same table Use EXIST clause rather than joining Separate Joins and Aggregations Use Advanced Grouping (Cube, Rollup, Grouping Sets) Experiment with NDV Synopsis Reconsider Pre-Aggregation Grouping via chunking

Oracle Additions Now Back to the Wall... Aggressive Bloom Filters General Education on PX and Resmgr NDV hyperloglog algorythm - approx_count_distinct

Results (to date) As of March - 1700+ reports in 12 hours = 3400 / 24 hours - ~3.5X the initial throughput goal In Production Now As of last week (from AWR) the system was spending > 50% dbtime on cell smart scan ~ 20-30% - read / write temp < 10% CPU Customer Feedback Holy &^%!

Things to Come LTD and Maybe even Jetson In Memory (12c) - X3-8 has 2 TB RAM Move temp to ZFS and free up some PGA NDV enhancements (PL/SQL functions too slow) Apply 11.2.3.3.0 adds flash cache for full scans Pre-aggregation?

Questions Email: chris.wones@us.dunnhumby.com Blog: www.bzzagent.com/blog/ Twitter: @chriswones Email: kerry.osborne@enkitec.com Blog: kerryosborne.oracle-guy.com Twitter: @kerryoracleguy