FUN WITH ANALYTIC FUNCTIONS UTOUG TRAINING DAYS 2017

ABOUT ME
Born and raised here in UT
In IT for 10 years, DBA for the last 6
Databases and Data are my hobbies, I'm rather quite boring
This isn't why you're here though

ANALYTIC FUNCTIONS SAY WHAT?
Analytic Functions compute a value based upon a subset of the rows in a query result
The subset is referred to as the partition
Unrelated to table partitioning
The best way to understand these functions is to compare them to standard Aggregate functions (SUM, MIN, MAX, etc.)

AGGREGATE VS. ANALYTIC
[Slide figure, three panels: The Data, Aggregate AVG, Analytic Function AVG]
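The comparison on this slide can be sketched with two queries. This is only an illustration and assumes the standard SCOTT.EMP sample table that the rest of the deck uses.

```sql
-- Aggregate AVG: one row per department; the employee detail is collapsed.
SELECT deptno,
       AVG(sal) AS avg_sal
FROM   scott.emp
GROUP  BY deptno;

-- Analytic AVG: every employee row is kept, and each row carries
-- the average of its own department (its partition).
SELECT ename,
       deptno,
       sal,
       AVG(sal) OVER (PARTITION BY deptno) AS avg_sal_by_deptno
FROM   scott.emp;
```

The first query returns one row per department, the second one row per employee.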

41 FLAVORS
41 different Analytic Functions
Positional (FIRST, LAST, ROW_NUMBER, LEAD, LAG, RANK, etc.)
Statistical (CORR, REGR_ functions, NTILE, STDDEV, etc.)
Aggregate (SUM, AVG, MIN, MAX, etc.)
Pattern Matching (Find patterns, like V shaped dips in stock ticker data)
LISTAGG

SAMPLES!
Samples based on SCOTT schema
View -> Snippets

THE SYNTAX
It's not as complicated as it looks

QUICK EXAMPLES
FUNCTION(<field a>) OVER (PARTITION BY <field b>)
[Slide figure, two panels: The Data, Analytic Function AVG]
    select ename, job, deptno,
           avg(sal) over (partition by deptno) avg_sal_by_deptno,
           sal,
           sal/(avg(sal) over (partition by deptno)) pct_of_average
    from scott.emp
    order by deptno desc;

MIX N MATCH
    select ename, job, deptno,
           avg(sal) over (partition by deptno) avg_sal_by_deptno,
           sal,
           sal/(avg(sal) over (partition by deptno)) pct_of_average
    from scott.emp
    order by deptno desc;

    select ename, job, deptno,
           min(sal) over (partition by deptno) min_sal_by_deptno,
           sal,
           sal/(min(sal) over (partition by deptno)) pct_of_min
    from scott.emp
    order by deptno desc;

REAL LIFE

C-LEVEL ASKS EASY QUESTION
Can you tell me the order that accounts were opened in?
Can you give me an ordinal number (1st, 2nd, 3rd)?
    row_number() over (partition by acct order by acct_open_date)

WHAT ABOUT WHEN TWO SUB ACCOUNTS ARE OPENED ON THE SAME DAY, CAN YOU MAKE THOSE BE THE SAME?
Original Query
    row_number() over (partition by acct order by acct_open_date)
    dense_rank() over (partition by acct order by acct_open_date)
    rank() over (partition by acct order by acct_open_date)
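To see how the three functions treat ties, here is a small self-contained sketch. The `sub_accounts` rows are made up for illustration; only the `acct` and `acct_open_date` column names come from the slides.

```sql
-- Hypothetical sample data: sub accounts B and C are opened on the same day.
WITH sub_accounts (acct, sub_acct, acct_open_date) AS (
  SELECT 100, 'A', DATE '2016-01-05' FROM dual UNION ALL
  SELECT 100, 'B', DATE '2016-03-10' FROM dual UNION ALL
  SELECT 100, 'C', DATE '2016-03-10' FROM dual UNION ALL
  SELECT 100, 'D', DATE '2016-07-01' FROM dual
)
SELECT acct,
       sub_acct,
       acct_open_date,
       ROW_NUMBER() OVER (PARTITION BY acct ORDER BY acct_open_date) AS rn,
       RANK()       OVER (PARTITION BY acct ORDER BY acct_open_date) AS rnk,
       DENSE_RANK() OVER (PARTITION BY acct ORDER BY acct_open_date) AS dense_rnk
FROM   sub_accounts
ORDER  BY acct, acct_open_date, sub_acct;
```

ROW_NUMBER always hands out 1, 2, 3, 4, breaking the tie between B and C arbitrarily. RANK gives 1, 2, 2, 4 (it skips 3 after the tie), and DENSE_RANK gives 1, 2, 2, 3 (no gap).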

CAN YOU TELL ME HOW LONG IT TAKES BETWEEN ONE ACCOUNT AND ANOTHER?
LEAD and LAG
    lag(acct_open_date) over (partition by acct order by acct_open_date)
    acct_open_date - lag(acct_open_date) over (partition by acct order by acct_open_date)
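A minimal sketch of the LAG calculation, again on made-up `sub_accounts` rows. In Oracle, subtracting one DATE from another yields the number of days between them.

```sql
-- Hypothetical sample data for one account.
WITH sub_accounts (acct, sub_acct, acct_open_date) AS (
  SELECT 100, 'A', DATE '2016-01-01' FROM dual UNION ALL
  SELECT 100, 'B', DATE '2016-01-15' FROM dual UNION ALL
  SELECT 100, 'C', DATE '2016-02-01' FROM dual
)
SELECT acct,
       sub_acct,
       acct_open_date,
       -- open date of the previous sub account in the same account
       LAG(acct_open_date) OVER (PARTITION BY acct ORDER BY acct_open_date) AS prev_open_date,
       -- days elapsed since the previous sub account was opened
       acct_open_date
         - LAG(acct_open_date) OVER (PARTITION BY acct ORDER BY acct_open_date) AS days_since_prev
FROM   sub_accounts
ORDER  BY acct, acct_open_date;
```

The first row of each partition has no previous row, so both derived columns are NULL there; the next two rows show 14 and 17 days.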

WHAT SHE REALLY WANTED
I just need the sequence patterns, in general
This uses LISTAGG

LISTAGG
    LISTAGG(<string to concatenate>, <concatenator>) within group (order by <field>)
    LISTAGG(job, ' -> ') within group (order by hiredate)
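As a sketch of how the LISTAGG expression above fits into a full query, assuming the standard SCOTT.EMP table: used with GROUP BY it acts as an aggregate, and with an OVER clause it acts as an analytic function.

```sql
-- Aggregate form: one row per department, jobs listed in hire order.
SELECT deptno,
       LISTAGG(job, ' -> ') WITHIN GROUP (ORDER BY hiredate) AS job_sequence
FROM   scott.emp
GROUP  BY deptno;

-- Analytic form: every employee row is kept and carries
-- the full job sequence of its department.
SELECT ename,
       deptno,
       LISTAGG(job, ' -> ') WITHIN GROUP (ORDER BY hiredate)
         OVER (PARTITION BY deptno) AS job_sequence
FROM   scott.emp;
```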

NOT GOOD ENOUGH
Analytic Functions can't go in a GROUP BY Clause
Can you order those by how common each pattern is?
Sure?
    SELECT DISTINCT
           listagg(acct_description, ' -> ') WITHIN GROUP (order by ACCT_OPEN_DATE),
           count(distinct listagg(acct_description,' -> ') WITHIN GROUP (order by ACCT_OPEN_DATE)) pattern_observance_count

DON'T PUT YOUR AF'S WHERE THEY DON'T BELONG
An analytic function is not allowed in the WHERE clause:
    select deptno,
           avg(sal) over (partition by deptno) avg_sal_by_deptno,
           sal,
           sal/(avg(sal) over (partition by deptno)) pct_of_average
    from scott.emp
    where sal/(avg(sal) over (partition by deptno)) > 1
    order by deptno desc;
Use a subquery to get around this
    select deptno, avg_sal_by_deptno, sal, pct_of_average
    from (
      select deptno,
             avg(sal) over (partition by deptno) avg_sal_by_deptno,
             sal,
             sal/(avg(sal) over (partition by deptno)) pct_of_average
      from scott.emp
      order by deptno desc
    )
    where pct_of_average >= 1
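The same subquery trick answers the earlier "how common is each pattern" question. The sketch below is one possible approach, not necessarily the query used in the talk; the `accounts` rows and the `open_pattern` alias are made up for illustration.

```sql
-- Hypothetical sample data: accounts 1 and 2 follow the same opening pattern.
WITH accounts (acct, acct_description, acct_open_date) AS (
  SELECT 1, 'Savings',  DATE '2016-01-01' FROM dual UNION ALL
  SELECT 1, 'Checking', DATE '2016-02-01' FROM dual UNION ALL
  SELECT 2, 'Savings',  DATE '2016-03-01' FROM dual UNION ALL
  SELECT 2, 'Checking', DATE '2016-04-01' FROM dual UNION ALL
  SELECT 3, 'Checking', DATE '2016-05-01' FROM dual
)
SELECT open_pattern,
       COUNT(*) AS pattern_observance_count
FROM (
  -- inner query: build one pattern string per account
  SELECT acct,
         LISTAGG(acct_description, ' -> ')
           WITHIN GROUP (ORDER BY acct_open_date) AS open_pattern
  FROM   accounts
  GROUP  BY acct
)
-- outer query: the pattern is now an ordinary column, so it can be grouped
GROUP  BY open_pattern
ORDER  BY pattern_observance_count DESC;
```

With these rows, 'Savings -> Checking' is counted twice and 'Checking' once.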

GETTING ROLLED
Can you tell me the transactions an account has done?
Can you sum the Amounts?

NO, COULD YOU SUM UP THE AMOUNTS FOR EACH MONTH, BUT DON'T HIDE THE TRANSACTION DETAILS?
[Slide figure, two panels: Original Data, sum(amount)]
    sum(amount) over (partition by trunc(business_date,'mm'), acct_num) monthly_total

COULD YOU BREAK IT OUT BY THE TYPE OF TRANSACTION IT WAS? DEBIT VS. CREDIT?
    sum(amount) over (partition by trunc(business_date,'mm'), acct_num) monthly_total
    sum(amount) over (partition by trunc(business_date,'mm'), acct_num, tran_type) monthly_total
Different partition => different total
Same partition => same total
Nulls treated together

COULD YOU MAKE A ROLLING SUM TOO, BROKEN OUT THE SAME WAY?
    sum(amount) over (partition by trunc(business_date,'mm'), acct_num, tran_type) monthly_total,
    sum(amount) over (partition by trunc(business_date,'mm'), acct, suffix, tran_type
                      order by acct_seq_num) rolling_monthly_total

PERFECT, BUT COULD YOU EXCLUDE THE CURRENT TRANSACTION FROM THE ROLLING MONTHLY TOTAL?
    sum(amount) over (partition by trunc(business_date,'mm'), acct_num, tran_type) monthly_total,
    sum(amount) over (partition by trunc(business_date,'mm'), acct, suffix, tran_type
                      order by acct_seq_num) rolling_monthly_total,
    sum(amount) over (partition by trunc(business_date,'mm'), acct, suffix, tran_type
                      order by acct_seq_num
                      ROWS BETWEEN UNBOUNDED PRECEDING and 1 PRECEDING) roll_mnthly_tot_excl_cur_tran
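The three totals can be compared side by side in a self-contained sketch. The `transactions` rows are made up, and the partition is simplified to `acct_num` and `tran_type`; note that a ROWS clause needs an ORDER BY in the same window.

```sql
-- Hypothetical sample data: three debits and one credit in January 2017.
WITH transactions (acct_num, tran_type, business_date, acct_seq_num, amount) AS (
  SELECT 1, 'DEBIT',  DATE '2017-01-03', 1,  10 FROM dual UNION ALL
  SELECT 1, 'DEBIT',  DATE '2017-01-10', 2,  20 FROM dual UNION ALL
  SELECT 1, 'DEBIT',  DATE '2017-01-20', 3,  30 FROM dual UNION ALL
  SELECT 1, 'CREDIT', DATE '2017-01-15', 4, 100 FROM dual
)
SELECT acct_num,
       tran_type,
       acct_seq_num,
       amount,
       -- no ORDER BY: the whole partition is summed on every row
       SUM(amount) OVER (PARTITION BY TRUNC(business_date,'mm'), acct_num, tran_type)
         AS monthly_total,
       -- ORDER BY only: running total up to and including the current row
       SUM(amount) OVER (PARTITION BY TRUNC(business_date,'mm'), acct_num, tran_type
                         ORDER BY acct_seq_num)
         AS rolling_monthly_total,
       -- explicit frame: running total of the rows before the current one
       SUM(amount) OVER (PARTITION BY TRUNC(business_date,'mm'), acct_num, tran_type
                         ORDER BY acct_seq_num
                         ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
         AS roll_mnthly_tot_excl_cur_tran
FROM   transactions
ORDER  BY tran_type, acct_seq_num;
```

For the three DEBIT rows this gives monthly totals of 60, 60, 60, rolling totals of 10, 30, 60, and rolling totals excluding the current transaction of NULL, 10, 30. The first row of each partition gets NULL in the last column because no rows precede it.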

ROWS AND RANGE SUB PARTITIONS
    ROWS BETWEEN UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING
    ROWS BETWEEN UNBOUNDED PRECEDING and X PRECEDING
ROWS is number of Rows
RANGE is a numeric or date range
PRECEDING is before the current row
FOLLOWING is after the current row
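The difference between ROWS and RANGE shows up when the ordering values have gaps. A minimal sketch on made-up `sales` rows: with a DATE ordering column, a numeric RANGE offset is measured in days.

```sql
-- Hypothetical sample data with a gap between Jan 2 and Jan 5.
WITH sales (sale_date, amount) AS (
  SELECT DATE '2017-01-01', 10 FROM dual UNION ALL
  SELECT DATE '2017-01-02', 20 FROM dual UNION ALL
  SELECT DATE '2017-01-05', 30 FROM dual UNION ALL
  SELECT DATE '2017-01-06', 40 FROM dual
)
SELECT sale_date,
       amount,
       -- physical offset: the current row plus the one row before it
       SUM(amount) OVER (ORDER BY sale_date
                         ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)  AS sum_prev_row,
       -- logical offset: rows dated within 1 day before the current row's date
       SUM(amount) OVER (ORDER BY sale_date
                         RANGE BETWEEN 1 PRECEDING AND CURRENT ROW) AS sum_prev_day
FROM   sales
ORDER  BY sale_date;
```

The two columns agree on every row except Jan 5: ROWS reaches back to the Jan 2 row (20 + 30 = 50), while RANGE finds nothing dated Jan 4 and returns 30.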

SIMPLE EXAMPLE
    lead(row_number) over (partition by 'X' order by row_number) next_number,
    first_value(row_number) over (partition by 'X' order by row_number
                                  rows between 2 FOLLOWING and 3 FOLLOWING) number_after_the_next_number,
    sum(row_number) over (partition by 'X' order by row_number
                          rows between 1 FOLLOWING and 2 FOLLOWING) sum_of_next_2_nums,
    sum(row_number) over (partition by 'X' order by row_number
                          rows between 1 FOLLOWING and UNBOUNDED FOLLOWING) sum_nums_from_this_to_the_end,
    sum(row_number) over (partition by 'X' order by row_number
                          rows between 1 PRECEDING and 1 FOLLOWING) sum_nums_1_before_to_1_after
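A runnable version of a few of these frames, as a sketch over the numbers 1 to 5. The `nums` row source and the column name `n` are assumptions standing in for whatever table the slide used.

```sql
-- Generate the numbers 1..5 as a stand-in data set.
WITH nums AS (
  SELECT LEVEL AS n FROM dual CONNECT BY LEVEL <= 5
)
SELECT n,
       LEAD(n) OVER (ORDER BY n) AS next_number,
       SUM(n)  OVER (ORDER BY n
                     ROWS BETWEEN 1 FOLLOWING AND 2 FOLLOWING) AS sum_of_next_2_nums,
       SUM(n)  OVER (ORDER BY n
                     ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_nums_1_before_to_1_after
FROM   nums
ORDER  BY n;

-- n  next_number  sum_of_next_2_nums  sum_nums_1_before_to_1_after
-- 1  2            5                   3
-- 2  3            7                   6
-- 3  4            9                   9
-- 4  5            5                   12
-- 5  (null)       (null)              9
```

Frames that run past the edge of the partition simply contain fewer rows; an empty frame yields NULL.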

FILLING HOLES
Can you tell me what a drawer's end of day totals are each day?
Lots of missing days
How can we fill in those gaps?

LET'S GET THE NEXT USED DATE ON EACH ROW
    lead(branch_date) over (partition by branch_code, cashbox_id order by branch_date) next_used_date
Let's fix this null

AF'S CAN BE USED ALMOST ANYWHERE
    case
      when lead(branch_date) over (partition by branch_code, cashbox_id order by branch_date) is null
      then branch_date
      else lead(branch_date) over (partition by branch_code, cashbox_id order by branch_date)
    end next_used_date,
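As an alternative sketch, LEAD accepts an offset and a default value as second and third arguments, which gives the same result as the CASE expression as long as `branch_date` itself is never NULL. The `drawer_totals` rows are made up for illustration.

```sql
-- Hypothetical sample data for one cash drawer.
WITH drawer_totals (branch_code, cashbox_id, branch_date) AS (
  SELECT 'B1', 1, DATE '2016-11-15' FROM dual UNION ALL
  SELECT 'B1', 1, DATE '2016-11-19' FROM dual
)
SELECT branch_code,
       cashbox_id,
       branch_date,
       -- offset 1 = next row; the third argument is returned when there is no next row
       LEAD(branch_date, 1, branch_date)
         OVER (PARTITION BY branch_code, cashbox_id ORDER BY branch_date) AS next_used_date
FROM   drawer_totals
ORDER  BY branch_date;
```

The last row of each partition gets its own `branch_date` as `next_used_date` instead of NULL.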

NULLS FIXED!
[Slide figure, two panels: Before, After]
But we still have gaps

JOIN THIS TO A CALENDAR
    SELECT to_date('20161101','yyyymmdd') + ROWNUM - 1 calendar_date
    FROM (
      SELECT 1 just_a_column
      FROM dual
      CONNECT BY LEVEL <= (10000)
    )
Begin Date: 20161101, written as to_date('20161101','yyyymmdd')
(10000): Some big number larger than how far you want to go back. This would calculate out the End Date

JOINING TO A CALENDAR
    WHERE calendar_date BETWEEN branch_date and next_used_date-1
20161115 is between 20161115 and (20161116 - 1)
The 20th is missing, but 20161120 is between 20161119 and (20161121 - 1)
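Putting the pieces together, the gap fill might look like the sketch below. The `drawer_totals` rows, the `end_of_day_total` column, and the 30-day calendar are made up. The NVL on the last row is also an assumption of this sketch (it lets the final row match its own date) rather than the talk's exact handling.

```sql
WITH drawer_totals (branch_code, cashbox_id, branch_date, end_of_day_total) AS (
  -- Hypothetical sample data: Nov 17, 18 and 20 are missing.
  SELECT 'B1', 1, DATE '2016-11-15', 500 FROM dual UNION ALL
  SELECT 'B1', 1, DATE '2016-11-16', 450 FROM dual UNION ALL
  SELECT 'B1', 1, DATE '2016-11-19', 700 FROM dual UNION ALL
  SELECT 'B1', 1, DATE '2016-11-21', 650 FROM dual
),
with_next AS (
  -- put the next used date on each row
  SELECT t.*,
         LEAD(branch_date) OVER (PARTITION BY branch_code, cashbox_id
                                 ORDER BY branch_date) AS next_used_date
  FROM   drawer_totals t
),
calendar AS (
  -- one row per day, starting at the begin date
  SELECT DATE '2016-11-01' + LEVEL - 1 AS calendar_date
  FROM   dual
  CONNECT BY LEVEL <= 30
)
SELECT c.calendar_date,
       w.branch_code,
       w.cashbox_id,
       w.end_of_day_total
FROM   with_next w
JOIN   calendar c
  ON   c.calendar_date BETWEEN w.branch_date
                           AND NVL(w.next_used_date, w.branch_date + 1) - 1
ORDER  BY c.calendar_date;
```

Each source row is repeated for every calendar day up to the day before the next used date, so the four input rows become seven consecutive days (Nov 15 through Nov 21), with missing days carrying the most recent total.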

FILLED GAPS THANKS TO AN AF
[Slide figure, two panels: Before, After]

HOW BIG IS THAT CANYON?
Department wanted to know details of accounts going negative
They wanted to know how deep and how wide the canyon was when looking at a daily history of account balances
[Slide figure: line graph of account balance, y-axis from 1500 down to -2000, annotated "How wide?", "Start Time?", "End Time?" and "How deep?"]

USE PATTERN MATCHING (12C)
[Slide figure, two panels: The Data, The Result]
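The slide's query is not in the transcript, so the following is only a sketch of how the canyon question could be expressed with MATCH_RECOGNIZE (Oracle 12c and later). The `daily_balances` rows, the pattern variable `neg`, and the measure names are all assumptions.

```sql
-- Hypothetical sample data: the balance is negative from Jan 2 through Jan 4.
WITH daily_balances (acct, bal_date, balance) AS (
  SELECT 1, DATE '2017-01-01',   500 FROM dual UNION ALL
  SELECT 1, DATE '2017-01-02',  -200 FROM dual UNION ALL
  SELECT 1, DATE '2017-01-03', -1500 FROM dual UNION ALL
  SELECT 1, DATE '2017-01-04',  -300 FROM dual UNION ALL
  SELECT 1, DATE '2017-01-05',  1000 FROM dual
)
SELECT *
FROM   daily_balances
MATCH_RECOGNIZE (
  PARTITION BY acct
  ORDER BY bal_date
  MEASURES
    FIRST(neg.bal_date) AS start_date,      -- where the canyon begins
    LAST(neg.bal_date)  AS end_date,        -- where it ends
    COUNT(*)            AS days_negative,   -- how wide
    MIN(neg.balance)    AS deepest_balance  -- how deep
  ONE ROW PER MATCH
  PATTERN (neg+)                            -- one or more consecutive negative days
  DEFINE neg AS balance < 0
);
```

For the sample rows this should return a single match for account 1, running from Jan 2 to Jan 4, three days wide and -1500 deep.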

THINGS YOU CAN DO WITH IT:
Find V, W and other patterns in Stock Prices
Find timeframes of high database use
Group clicks in web logs into sessions
Detect traversal patterns of Finite State Machines
We won't go much deeper but look into these, they're neat!

NOT COMPLICATED, JUST INVOLVED
Used wherever you can put data into a line graph, i.e. data is a log of events
Lots of great resources:
Ask Tom - http://www.oracle.com/technetwork/issue-archive/2013/13-nov/o63asktom-2034271.html
GitHub - https://github.com/oracle/analytical-sql-examples/tree/master/pattern-matching
Burleson - http://www.dba-oracle.com/t_sql_match_recognize.htm
YouTube has some good demos too

AF PERFORMANCE?
Keep an eye on performance; these do lots of sorts
Try to use indexes, and filter your data before applying analytic functions
Sometimes AFs can improve performance, other times they can reduce it
Tom Kyte says: In general, analytics are great for answering "really big" questions or questions against "small sets"
https://asktom.oracle.com/pls/apex/f?p=100:11:0::::p11_question_id:1137250200346660664

QUESTIONS?