Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Lab 3 starts next week. No problem set out this week.

Transcription:

CS 133: Databases, Spring 2017, Lec 13, 3/02, Prof. Beth Trushkowsky

Administrivia
- Lab 3 starts next week
- No problem set out this week
- Updated grutoring hours
- Take poll on Piazza for my office hours timing!

Goals for Today
- Reason about the stages of query optimization
- Understand how to estimate the cost of a full query plan
  - Pipelining vs. materialization
  - Intermediate result sizes

Cost-based Query Sub-System
[Diagram components: Queries; Query Parser; Query Optimizer (Plan Generator, Plan Cost Estimator); Catalog Manager (Schema, Statistics); Query Plan Evaluator]
- What plans are considered?
- How is the cost of a plan estimated?
- Ideally: find the best query plan
- Reality: avoid the worst plans!

Query Optimization Overview
- Query converted to relational algebra expression
- Relational algebra converted to tree, joins as branches
  - Left branch is the "outer" relation
- Operators can also be applied in different order!
- Each operator has implementation choices; choosing forms the physical plan

  SELECT S.[...]
  FROM R, S
  WHERE R.sid = S.sid AND R.[...] AND S.rating > 5

[Relational algebra expression and plan tree; only fragments survive: π(...), σ(... rating > 5 ...)]

Query Optimizer algorithm
Goal: given a query, the optimizer wants to
- Decide which query plans to consider
- Compare plans and choose the "best" one (best = shortest time to run)

How about this algorithm?
- Step 1: enumerate the space of all possible plans
- Step 2: run each query plan, measure its runtime
- Step 3: choose the plan that ran the fastest!

Algorithm
- Step 1: consider a set of possible plans
- Step 2: estimate cost for each plan
- Step 3: choose the plan with lowest cost

Estimating Cost
- Don't want to execute a plan to figure out its run-time!
- Instead estimate cost of the plan
  - Use cost as a proxy for run-time
- Cost of a plan = sum of the costs for each operator in the plan
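The three-step algorithm and the "cost of a plan = sum of operator costs" rule can be sketched in a few lines. This is only an illustration: the `Operator` class, the function names, and the I/O numbers are hypothetical and do not come from the slides or from any particular DBMS.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operator:
    """One node of a physical plan (hypothetical structure for illustration)."""
    name: str
    est_io: int                       # estimated I/Os charged to this operator alone
    children: List["Operator"] = field(default_factory=list)


def plan_cost(op: Operator) -> int:
    """Step 2: cost of a plan = sum of the estimated costs of its operators."""
    return op.est_io + sum(plan_cost(child) for child in op.children)


def choose_plan(candidates: List[Operator]) -> Operator:
    """Step 3: pick the candidate with the lowest estimated cost."""
    return min(candidates, key=plan_cost)


# Step 1: two made-up candidate plans; the numbers are placeholders, not measurements.
plan_a = Operator("join", 500, [Operator("scan R", 100), Operator("scan S", 50)])
plan_b = Operator("join", 120, [Operator("scan S", 50), Operator("scan R", 100)])

best = choose_plan([plan_a, plan_b])
print(plan_cost(plan_a), plan_cost(plan_b), best is plan_b)
```

Nothing is executed against data here: the optimizer only compares estimates, which is the point of using cost as a proxy for run-time.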

Reasoning about operator cost
For questions below, assume:
- Each relation is 5 pages and stored as a heap file, no indexes
- Buffer pool has 4 frames
- Join algorithm is page-nested-loop-join (PNLJ)
- Order by operator uses general external merge-sort

1. (Review) What is the cost in I/Os for this plan, ignoring cost of final output?
   [Plan: join of A and B]
2. Now what about the cost of this plan? What information are you missing?
   [Plan: ORDER BY (A.foo) on top of the join of A and B]

Pipelined vs. Materialized
A query plan operator's output could be generated in either materialized or pipelined fashion
- Materialized
  - Output of an operator written back to disk as a temporary file before its parent reads it in
- Pipelining ("on-the-fly")
  - Output of operator immediately given to parent as input

Pipelining
- Parent and child operators executing concurrently
- Iterator model
  - Parent calls next() on child/children
  - (As needed) child calls next() on its child/children
- Savings compared to materialization
  - No write I/O cost for child's output
  - No read I/O cost for parent's input
- Algorithms of operators must support pipelining for this to work

Exercise 2: Pipelining
Use Page-Nested-Loop joins for the join algorithm
Some examples:
- (A join B) join C
  - Pipelined
- C join (A join B)
  - Since (A join B) is the inner relation for the second join, need to materialize it
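The iterator model described above can be illustrated with a toy pipeline. This is a minimal sketch under my own assumptions: the class names `Scan`, `Select`, and `Project`, the use of `None` as an end-of-input marker, and the in-memory table are all invented for illustration and are not taken from the course labs.

```python
class Scan:
    """Leaf operator: hands out rows of an in-memory table one at a time."""

    def __init__(self, rows):
        self._rows = iter(rows)

    def next(self):
        return next(self._rows, None)    # None signals end of input


class Select:
    """Filters its child's output without writing it to a temporary file."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def next(self):
        while True:
            row = self.child.next()      # parent pulls from child on demand
            if row is None or self.predicate(row):
                return row


class Project:
    """Keeps only the requested columns of each row it pulls from its child."""

    def __init__(self, child, columns):
        self.child = child
        self.columns = columns

    def next(self):
        row = self.child.next()
        return None if row is None else {c: row[c] for c in self.columns}


def run(root):
    """Drive the plan by calling next() on the root until it is exhausted."""
    out = []
    while (row := root.next()) is not None:
        out.append(row)
    return out


table_a = [{"id": 1, "foo": 7}, {"id": 2, "foo": 3}, {"id": 3, "foo": 9}]
plan = Project(Select(Scan(table_a), lambda r: r["foo"] > 5), ["id"])
result = run(plan)
print(result)
```

Each row flows from `Scan` through `Select` to `Project` before the next row is read, so no intermediate result is ever stored. An operator that must rescan its input, such as the inner side of a page-nested-loop join, cannot consume its child this way, which is why the inner (A join B) in Exercise 2 has to be materialized.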

Schema for Examples
[...] (sid: integer, [...]: string, rating: integer, age: real)
[...] (sid: integer, bid: integer, day: date, rname: string)

[...] (the relation with bid):
- Each record is 40 bytes long
- 100 records per page
- 1000 pages
- Suppose there are 100 boats (uniformly distributed)

[...] (the relation with rating):
- Each record is 50 bytes long
- 80 records per page
- 500 pages
- Suppose there are 10 ratings (uniformly distributed 1-10)

Motivating Example

  SELECT S.[...]
  FROM R, S
  WHERE R.sid = S.sid AND R.[...] AND S.rating > 5

[Query plan diagram not recoverable]
- Cost: 500 + 500*1000 I/Os
- By no means the worst plan!
- Misses several opportunities: selections could have been "pushed" earlier, no use is made of any available indexes...
- Goal of optimization: to find more efficient plans that compute the same answer.

Alternative Plans: Push SELECTs (No Indexes)
[Plan diagrams not recoverable; surviving annotations: "rating > 5", "(Scan & Write to temp T)"]
- 500,500 I/Os
- 250,500 I/Os
- 250,500 I/Os
- 4010 I/Os = 500 + 1000 + 10 + (250*10)
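The I/O totals above can be re-derived with a little arithmetic. The sketch below is my reading of the plans, not something the slides spell out: it assumes S is the 500-page relation with rating, R is the 1000-page relation with bid, the selection on R is an equality on bid (the predicate text is lost in the transcript), and every join is a page-nested-loop join with S as the outer relation.

```python
S_PAGES = 500     # relation with rating (80 records per page)
R_PAGES = 1000    # relation with bid (100 records per page)

# rating > 5 keeps 5 of the 10 uniformly distributed ratings.
S_SELECTED_PAGES = S_PAGES * 5 // 10       # 250 pages

# Assumed equality selection on bid keeps 1 of 100 uniformly distributed boats.
T_PAGES = R_PAGES // 100                   # 10 pages in temp T


def pnlj_io(outer_pages, inner_pages):
    """Page-nested-loop join: read the outer once, scan the inner once per outer page."""
    return outer_pages + outer_pages * inner_pages


# Plan 1: join first, apply both selections afterwards.
baseline = pnlj_io(S_PAGES, R_PAGES)

# Plan 2: rating > 5 applied on the fly while scanning the outer, so only
# 250 pages' worth of outer tuples drive scans of the inner.
push_rating = S_PAGES + S_SELECTED_PAGES * R_PAGES

# Plan 3: also scan R once, write the selected tuples to temp T,
# and use T as the inner relation.
with_temp = S_PAGES + R_PAGES + T_PAGES + S_SELECTED_PAGES * T_PAGES

print(baseline, push_rating, with_temp)
```

Pushing the selection on the inner relation without materializing it does not help a page-nested-loop join, because the full 1000 pages are still scanned once per outer page. That is consistent with 250,500 appearing twice in the list above.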

Exercise 3-4: Estimate I/O cost
[Plan diagrams not recoverable; surviving annotation: "(Scan & Write to temp T)"]
- 6000 I/Os = 1000 + (10*500)
- 4250 I/Os = 1000 + 500 + 250 + (10*250)

Alternative Plan: Indexes
Suppose we have these indexes:
- Clustered Alt 1 hash index on bid of [...]
- Unclustered Alt 2 hash index on sid of [...]
[Plan annotations: (Use Alt 1 hash index on bid), (Index Nested Loops on sid)]

Getting [...] with [...]:
- Using index, we get 100,000/100 boats = 1000 records on 1000/100 = 10 pages
- Cost: selection on [...] (10 I/Os); then, for each tuple, get [one] matching tuple (1000*1.2) = 1210 I/Os.
- Join column sid is a key for [...]!
- Due to index on sid, decide not to push down rating > 5

Query Blocks: Units of Optimization

  Outer block:
    SELECT S.[...]
    FROM S
    WHERE S.age IN
  Nested block:
      (SELECT MAX(S2.age)
       FROM S2
       GROUP BY S2.rating)

- An SQL query is parsed into a collection of query blocks, and these are optimized one block at a time.
- Inner blocks are usually treated as subroutines
- Computed:
  - once per query (for uncorrelated sub-queries)
  - or once per outer tuple (for correlated sub-queries)
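Returning to the cost figures earlier in this section (Exercises 3-4 and the index plan), the totals can be checked with the same kind of arithmetic. The interpretation is an assumption on my part: R (1000 pages, with bid) is now the outer relation with its selection applied on the fly, S (500 pages, with rating) is the inner, and the per-probe cost of 1.2 I/Os is taken directly from the slide.

```python
R_PAGES = 1000          # relation with bid
S_PAGES = 500           # relation with rating
R_SELECTED_PAGES = 10   # 1000 matching records at 100 records per page
R_SELECTED_TUPLES = 1000
S_TEMP_PAGES = 250      # rating > 5 keeps half of the 500 pages

# Exercise plan A: scan R, and for each of the 10 qualifying pages scan all of S.
outer_selected = R_PAGES + R_SELECTED_PAGES * S_PAGES

# Exercise plan B: additionally scan S once, write rating > 5 to a temp file,
# and use the 250-page temp file as the inner relation.
inner_materialized = R_PAGES + S_PAGES + S_TEMP_PAGES + R_SELECTED_PAGES * S_TEMP_PAGES

# Index plan: the clustered hash index on bid reads only the 10 matching pages,
# then one hash-index probe on sid (1.2 I/Os each, per the slide) per tuple.
PROBE_IO = 1.2
index_plan = R_SELECTED_PAGES + round(R_SELECTED_TUPLES * PROBE_IO)

print(outer_selected, inner_materialized, index_plan)
```

The index plan wins because it never scans either relation in full: it touches 10 pages of one relation and then does a bounded number of probes into the other.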

The System R aka "Selinger-style" Query Optimizer
- Impact:
  - Inspired most optimizers in use today
  - Works well for small-medium complexity queries (< 10 joins)
- Cost estimation:
  - Very inexact, but works ok in practice.
  - Statistics, maintained in system catalogs, used to estimate cost of operations and result sizes.
  - Considers a simple combination of CPU and I/O costs.
- Plan Space: Too large, must be pruned!

Pat Selinger: https://www-03.ibm.com/ibm/history/witexhibit/wit_fellows_selinger.html

Statistics and cardinality estimation
- Catalogs typically contain at least:
  - # tuples (NTuples) and # pages (NPages) per relation
  - and for each index:
    - # distinct key values (NKeys).
    - low/high key values (Low/High).
    - Index height (Height) for each tree index.
    - Index size (NPages) (e.g., leaf pages for tree).
- Statistics in catalogs updated periodically.
  - Updating whenever data changes is too expensive; lots of approximation anyway, so slight inconsistency ok.

Size Estimation and Reduction Factors
Consider a query block:

  SELECT attribute list
  FROM relation list
  WHERE term1 AND ... AND termk

- Reduction factor (RF) associated with each term reflects the impact of the term in reducing result size
  - RF is also called "selectivity"
- How to predict size of output?
  - Need to know/estimate input size
  - Need to know/estimate RFs
  - Need to know/assume how terms are related

Result Size Estimation for Selections
Result cardinality (for conjunctive terms) = # input tuples * product of all RFs
Assumptions:
1. Values are uniformly distributed and terms are independent!
2. In System R, stats only tracked for indexed attributes (modern systems have removed this restriction)

  Term          Reduction Factor
  col = value   1 / NKeys(I)
  col > value   (High(I) - value) / (High(I) - Low(I))

Note: in System R, if missing indexes, assume RF = 1/10
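The reduction-factor rules above translate directly into code. The following is a sketch only: the function names are hypothetical, exact fractions are used to avoid rounding noise, and the example statistics (40,000 input tuples, 10 distinct values, key range 1 to 10) are illustrative assumptions rather than the answer to any exercise.

```python
from fractions import Fraction
from math import prod

DEFAULT_RF = Fraction(1, 10)    # System R fallback when no index statistics exist


def rf_equals(nkeys=None):
    """RF for `col = value`: 1 / NKeys(I), or the default without an index."""
    return Fraction(1, nkeys) if nkeys else DEFAULT_RF


def rf_greater(value, low=None, high=None):
    """RF for `col > value`: (High(I) - value) / (High(I) - Low(I)); integer inputs."""
    if low is None or high is None:
        return DEFAULT_RF
    return Fraction(high - value, high - low)


def result_cardinality(input_tuples, rfs):
    """Conjunctive terms, assumed independent: # input tuples * product of all RFs."""
    return input_tuples * prod(rfs, start=Fraction(1))


# Illustrative statistics, assumed for this example only.
eq_rf = rf_equals(nkeys=10)                 # col = value with 10 distinct keys
gt_rf = rf_greater(5, low=1, high=10)       # col > 5 on a key range of 1..10
no_stats_rf = rf_equals()                   # no index, so fall back to 1/10

estimate = result_cardinality(40_000, [eq_rf, no_stats_rf])
print(eq_rf, gt_rf, estimate)
```

Multiplying the factors is only valid under the independence assumption stated above. Correlated terms would make the product an underestimate of the true result size.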

Exercise 5
- RF = 16/40 * 1/10 = 1/25
- Result size: 20 pages or 1600 tuples

Result Size Estimation for Joins
For equi-join of R and S, range of result sizes (in # of tuples)?
- If R and S have no join attribute values in common?
- If join attributes are a key for S?
  - And if the join attributes are also a foreign key in R?

General case: join attributes a in common, a key for neither
- Assumption: the set of distinct R.a values is contained in S.a
- Idea: each tuple of R has a 1/NKeys(S) chance of joining with a tuple in S
    NTuples(R) * NTuples(S) / NKeys(S)
- Reversing above assumption yields
    NTuples(S) * NTuples(R) / NKeys(R)
- (use smaller of two if different)
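The general-case join estimate can be written as a small helper. This is a sketch with a hypothetical function name, and the statistics in the example are made up to resemble a key/foreign-key join. They are not taken from an exercise.

```python
from fractions import Fraction


def estimate_join_size(ntuples_r, ntuples_s, nkeys_r, nkeys_s):
    """Estimate the size of an equi-join of R and S on attribute a.

    Computes both directional estimates and keeps the smaller one.
    """
    # Assuming the distinct R.a values are contained in S.a.
    r_in_s = Fraction(ntuples_r * ntuples_s, nkeys_s)
    # Assuming the reverse containment.
    s_in_r = Fraction(ntuples_s * ntuples_r, nkeys_r)
    return min(r_in_s, s_in_r)


# Made-up statistics: a is a key of S (every S tuple has a distinct value)
# and R references it with 100 distinct values.
size = estimate_join_size(ntuples_r=100_000, ntuples_s=40_000,
                          nkeys_r=100, nkeys_s=40_000)
print(size)
```

With a as a key of S, the estimate collapses to NTuples(R): each R tuple matches exactly one S tuple, which agrees with the key and foreign-key case raised in the questions above.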