Module 9: Query Optimization
|
|
- Julius Byrd
- 5 years ago
- Views:
Transcription
1 Module 9: Query Optimization Module Outline Web Forms Applications SQL Interface 9.1 Outline of Query Optimization 9.2 Motivating Example 9.3 Equivalences in the relational algebra 9.4 Heuristic optimization 9.5 Explosion of search space 9.6 Dynamic programming strategy (System R) Transaction Manager Lock Manager Plan Executor Operator Evaluator Concurrency Control SQL Commands Files and Index Structures Buffer Manager Disk Space Manager Parser Optimizer You are here! Query Processor Recovery Manager DBMS Index Files Data Files System Catalog Database 275
2 9.1 Outline of Query Optimization The success of relational database technology is largely due to the systems ability to automatically find evaluation plans for declaratively specified queries. Given some (SQL) query Q, the system 1 parses and analyzes Q, 2 derives a relational algebra expression E that computes Q, 3 transforms and simplifies E, and 4 annotates the operators in E with access methods and operator algorithms to obtain an evaluation plan P. Discussed here: Task 3 is often called algebraic (or re-write) query optimization, while task 4 is also called non-algebraic (or cost-based) query optimization. 276
3 9.2 Motivating Example From query to plan Example: List the airports from which flights operated by Swiss (airline code LX) fly to any German (DE) airport. Airport : code country name FRA DE Frankfurt ZHR CH Zurich MUC DE Munich. Flight : from to airline FRA ZHR LX ZHR MUC LX FRA MUC US. SQL query Q: SELECT FROM WHERE f.from Flight f, Airport a f.to = a.code AND f.airline = LX AND a.country = DE 277
4 From query to plan... SQL query Q: SELECT FROM WHERE f.from Flight f, Airport a f.to = a.code AND f.airline = LX AND a.country = DE Relational algebra expression E that computes Q: from airline= LX country= DE Airport to=code Flight 278
5 From query to plan... Relational algebra expression E that computes Q: from airline= LX country= DE Airport to=code Flight One (of many) plan(s) P to evaluate Q: from scan airline= LX country= DE scan Airport heap scan to=code NL- Flight index scan on to 279
6 9.3 Equivalences in the relational algebra Two relational algebra expressions E 1, E 2 are equivalent if on every legal database instance the two expressions generate the same set of tuples. Note: the order of tuples is irrelevant Such equivalences are denoted by equivalence rules of the form E 1 E 2 (such a rule may be applied by the system in both directions, ). We know those equivalence rules from the course Information Systems. 280
7 Some equivalence rules 1 Conjunctive selections can be deconstructed into a sequence of individual selections: p 1 p 2 (E) p 1 (p 2 (E)) 2 Selection operations are commutative: p 1 (p 2 (E)) p 2 (p 1 (E)) 3 Only the last projection in a sequence of projections is needed, the others can be omitted: L 1 (L 2 ( L n (E) )) L 1 (E) 4 Selections can be combined with Cartesian products and joins: i) ii) p(e 1 E 2 ) E 1 p E 2 p(e 1 q E 2 ) E 1 p q E 2 281
8 Pictorial description of 4 i): p E 1 E 2 p E 1 E 2 282
9 5 Join operations are commutative: E 1 p E 2 E 2 p E 1 6 i) Natural joins (equality of common attributes) are associative: (E 1 E 2 ) E 3 E 1 (E 2 E 3 ) ii) Generals joins are associative in the following sense: (E 1 p E 2 ) q r E 3 ) E 1 p q (E 2 r E 3 ) where predicate r involves attributes of E 2, E 3 only. 7 Selection distributes over joins in the following ways: i) If predicate p involves attributes of E 1 only: p(e 1 q E 2 ) p(e 1 ) q E 2 ii) If predicate p involves only attributes of E 1 and q involves only attributes of E 2 : p q(e 1 r E 2 ) p(e 1 ) r q(e 2 ) (this is a consequence of rules 7 (a) and 1 ). 283
10 8 Projection distributes over join as follows: L 1 L 2 (E 1 p E 2 ) L 1 (E 1 ) p L 2 (E 2 ) if p involves attributes in L 1 L 2 only and L i contains attributes of E i only. 9 The set operations union and intersection are commutative: E 1 E 2 E 2 E 1 E 1 E 2 E 2 E 1 10 The set operations union and intersection are associative: (E 1 E 2 ) E 3 E 1 (E 2 E 3 ) (E 1 E 2 ) E 3 E 1 (E 2 E 3 ) 11 The selection operation distributes over, and \: p(e 1 E 2 ) p(e 1 ) p(e 2 ) p(e 1 E 2 ) p(e 1 ) p(e 2 ) p(e 1 \ E 2 ) p(e 1 ) \ p(e 2 ) Also: p(e 1 E 2 ) p(e 1 ) E 2 p(e 1 \ E 2 ) p(e 1 ) \ E 2 284
11 (this does not apply for ) 12 The projection operation distributes over : L(E 1 E 2 ) L(E 1 ) L(E 2 ) 285
12 9.4 Heuristic optimization Query optimizers use the equivalence rules of relational algebra to improve the expected performance of a given query in most cases. The optimization is guided by the following heuristics: (a) Break apart conjunctive selections into a sequence of simpler selections (rule 1 preparatory step for (b)). (b) Move down the query tree for the earliest possible execution (rules 2, 7, 11 reduce number of tuples processed). (c) Replace pairs by (rule 4 (a) avoid large intermediate results). (d) Break apart and move as far down the tree as possible lists of projection attributes, create new projections where possible (rules 3, 8, 12 reduce tuple widths early). (e) Perform the joins with the smallest expected result first. 286
13 Heuristic optimization: example SQL query Q: SELECT FROM WHERE AND AND p.ticketno Flight f, Passenger p, Crew c f.flightno = g.flightno AND f.flightno = c.flightno f.date = AND f.to = FRA p.name = c.name AND c.job = Pilot ( What would be a natural language formulation of Q?) 287
14 SELECT FROM WHERE AND AND p.ticketno Flight f, Passenger p, Crew c f.flightno = g.flightno AND f.flightno = c.flightno f.date = AND f.to = FRA p.name = c.name AND c.job = Pilot Canonical relational algebra expression (reflects the semantics of the SQL SELECT- FROM-WHERE block directly): p.ticketno f.flightno=p.flightno f.flightno=c.flightno c.job= Pilot Flight f Crew c Passenger p 288
15 Heuristic optimization: example 1 Break apart conjunctive selection to prepare push-down of selections: p.ticketno f.flightno=p.flightno f.flightno=c.flightno f.date= f.to= FRA p.name=c.name Flight f c.job= Pilot Crew c Passenger p 289
16 Heuristic optimization: example 2 Push down selection as far as possible (but no further!): p.ticketno f.flightno=c.flightno p.name=c.name f.flightno=p.flightno c.job= Pilot Crew c Passenger p f.to= FRA f.date= Flight f 290
17 Heuristic optimization: example 3 Re-unite sequences of selections into single conjunctive selections: p.ticketno f.flightno=c.flightno p.name=c.name f.flightno=p.flightno c.job= Pilot f.to= FRA f.date= Passenger p Crew c Flight f 291
18 Heuristic optimization: example 4 Introduce projections to reduce tuple widths: p.ticketno f.flightno=c.flightno p.name=c.name f.flightno=p.flightno c.flightno,c.name f.flightno c.job= Pilot Crew c f.to= FRA f.date= p.ticketno,p.flightno,p.name Flight f Passenger p 292
19 Heuristic optimization: example 5 Combine cartesian products and selections into joins: p.ticketno f.flightno f.flightno=p.flightno f.flightno=c.flightno p.name=c.name c.flightno,c.name c.job= Pilot Crew c f.to= FRA f.date= p.ticketno,p.flightno,p.name Flight f Passenger p 293
20 Heuristic optimization: example 6 Relation Passenger presumably is the largest relation, re-order the joins (associativity of general joins, rule 6 ii)): p.ticketno f.flightno f.flightno=c.flightno f.flightno=p.flightno p.name=c.name c.flightno,c.name p.ticketno,p.flightno,p.name Passenger p f.to= FRA f.date= c.job= Pilot Flight f Crew c 294
21 Choosing an evaluation plan When the optimizer annotates the resulting algebra expression E it needs to consider the interaction of the chosen operator algorithms/access methods. Choosing the cheapest (in terms of I/O) algorithm for each operation independently may not yield overall cheapest plan P. Example: merge join may be costlier than nested loops join (operands need to be sorted first), but yields output in sorted order (good for subsequent duplicate elimination, selection, grouping,... ) We need to consider all possible plans and then choose the best one in a cost-based fashion. 295
22 9.5 Explosion of search space Consider finding the best join order for the query R 1 R 2 R 3 R 4 Several join tree shapes (due to associativity, commutativity of ): R 1 R 2... R 1 R 2 bushy R 3 R 4... R 1 R 2 R 4 R 3 left-deep # of different join orders for an n-way join: R 3 R 4 right-deep (2n 2)! (n 1)! (n = 7 : , n = 10 : ) 296
23 Restricting the search space Fact: query optimization will not be able to find the overall best plan. Instead: optimizers tries to avoid the really bad plans (I/O cost of different plans may differ substantially!) Restrict the search space: consider left-deep join orders only (left is outer relation, right is inner): R 1 R 2 R 4 R 3 Left-deep trees may be evaluated in a fully-pipelined fashion (inner input is relation), intermediate results need not be written to temporary files, (Block) NL- may profit from available indexes on inner relation. Number of possible left-deep join orders for n-way join is n! 297
24 Single relation plans Optimizer enumerates (generates) all possible plans to assess their cost. If query involves a single relation R only: Single relation plans: Consider each available method (e.g., heap scan, (un)clustered index scan) to access the tuples of a single relation R i. Keep the access method involving the least estimated cost. 298
25 Cost estimates for single relation plans (System R style) IBM System R ( 1970s): first successful relational database system, introduced most of the query optimization techniques still in use today. Pragmatic yet successful cost model for access methods on rel. R: Access method Cost { Height(I) + 1 if I is B + tree access primary key index I 2.2 if I is hash index clustered index I matching predicate p ( I + R ) sel(p) 1 unclustered index I matching predicate p sequential scan ( I + R ) sel(p) R 1 If sel(p) is unknown, assume 1/
26 Cost estimates for a single relation plan Query Q: SELECT FROM WHERE A R B = c Database profile: R = 500, R = , V (B, R) = 10 Q 1/V (B, R) R = 1/ = tuples retrieved }{{} sel(b=c) 1 Database maintains clustered index I B ( I B = 50) on attribute B: cost = ( I B + R ) 1/V (B, R) = ( ) 1/10 = 55 pages 300
27 Cost estimates for a single relation plan 2 Database maintains unclustered index I B ( I B = 50) on attribute B: cost = ( I B + R ) 1/V (B, R) = ( ) 1/10 = pages 3 No index support, use sequential file scan to access R: cost = R = 500 pages To evaluate query Q, use clustered index I B 301
28 Plans for multiple relation (join) queries We need to make sure not to miss the best left-deep join plan. Degrees of freedom left: 1 For each base relation in the query, consider all access methods. 2 For each join operation, select a join algorithm. How many possible query plans are left now? Back-of-envelope calculation (query with n relations) Assume j join algorithms available, i indexes per relation: #plans n! j n 1 (i + 1) n Example: with n = 3 relations and j = 3, i = 2: #plans 3! =
29 Plan enumeration 1 : example setup Example query (n = 3): SELECT FROM WHERE a.name, f.airline, c.name Airport a, Flight f, Crew c f.to = a.code AND f.flightno = c.flightno (Airport = A, Flight = F, Crew = C) Assumptions: Available join algorithms: hash join, block NL-, block INL- Available indexes: clustered B + tree index I on attribute Flight.to, I = 50 A = 500, 80 tuples/page F = 1000, 100 tuples/page C = F A tuples fit on a page 303
30 Plan enumeration 2 : candidate plans Enumerate n! left-deep join trees (3! = 6): C A F F A C C F A A C F F C A A F C Prune plans with (note: no join predicate between A, C) immediately! 4 candidate plans remain. 304
31 Plan enumeration 3 : join algorithm choices Candidate plan: C A F Possible join algorithm choices: NL- NL- NL- C H- C A F A F H- H- NL- C H- C A F A F Repeat for remaining 3 candidate plans. 305
32 Plan enumeration 4 : access method choices NL- Candidate plan: NL- C A F Possible access method choices: NL- NL- NL- C heap scan INL- C heap scan heap scan A F heap scan heap scan A F index scan on F.to Repeat for remaining candidate plans. 306
33 Plan enumeration 5 : cost estimation Estimate cost for candidate plan: NL- INL- C heap scan heap scan A F index scan Cost heap scan A: 500 (pages) Cost of A F : A sel(a.code = F.to) ( F + I ) = F.to is key / ( ) A F = A F /100 = F /100 = /100 = (pages) Cost of (A F ) C: A F C = = Total estimated cost: =
34 Plan enumeration 5 : cost estimation Current candidate plan: Remember: A = 500, F = 1 000, C = 10 A F = NL- NL- C heap scan heap scan A F heap scan NL-: scan left input + scan right input once for each page in left input H- (assume 2 passes): 2 (scan both inputs + hash both inputs into buckets) + read hash buckets with join partners 308
35 Plan enumeration 5 : cost estimation Current candidate plan: Remember: A = 500, F = 1 000, C = 10 A F = H- NL- C heap scan heap scan A F heap scan NL-: scan left input + scan right input once for each page in left input H- (assume 2 passes): 2 (scan both inputs + hash both inputs into buckets) + read hash buckets with join partners 309
36 Plan enumeration 5 : cost estimation Current candidate plan: Remember: A = 500, F = 1 000, C = 10 A F = NL- H- C heap scan heap scan A F heap scan NL-: scan left input + scan right input once for each page in left input H- (assume 2 passes): 2 (scan both inputs + hash both inputs into buckets) + read hash buckets with join partners 310
37 Plan enumeration 5 : cost estimation Current candidate plan: Remember: A = 500, F = 1 000, C = 10 A F = H- H- C heap scan heap scan A F heap scan NL-: scan left input + scan right input once for each page in left input H- (assume 2 passes): 2 (scan both inputs + hash both inputs into buckets) + read hash buckets with join partners 311
38 Repeated enumeration of identical sub-plans The plan enumeration reconsiders the same sub-plans over and over again. Cost and result size of sub-plan indepedent of larger embedding plan: NL- H- NL- NL- C scan NL- C scan H- C scan scan A F scan scan A F scan scan A F scan H- NL- H- H- C scan INL- scan A C scan INL- C scan F scan scan A F index scan A F index! Idea: Remember already considered sub-plans in memoization data structure. Resulting approach known as dynamic programming. 312
39 9.6 Dynamic programming strategy (System R) Divide plan enumeration into n passes (for a query with n joined relations): 1 Pass 1 (all 1-relation plans): Find best 1-relation plans for each relation (i.e., select access method) 2 Pass 2 (all 2-relation plans): Find best way to join plans of Pass 1 to another relation (generate left-deep trees: sub-plans of Pass 1 appear as outer in join). 3 Pass n (all n-relation plans): Find best way to join plans of Pass n 1 to the nth relation (sub-plans of Pass n 1 appear as outer in join) A k 1 relation sub-plan P is not combined with a kth relation R unless there is a join condition between the relations in P and R or all join conditions already present in P (avoid if possible). 313
40 Plan enumeration: pruning, interesting orders For each sub-plan obtained this way, remember cost and result size estimates! Pruning: For each subset of relations joined, keep only cheapest sub-plan overall + cheapest sub-plans that generate an intermediate result with an interesting order of tuples. Interesting order determined by presence of SQL ORDER BY clause in the query presence of SQL GROUP BY clause in the query join attributes of subsequent equi-joins (prepare for merge-). 314
41 System R style plan enumeration Example query: SELECT FROM WHERE a.name, f.airline, c.name Airport a, Flight f, Crew c f.to = a.code AND f.flightno = c.flightno Now assume: Available join algorithms: merge-, block NL-, block INL- Available indexes: clustered B + tree index I on A.code, height(i) = 3, I leaf = 500 A = , 5 tuples/page F = 10, 10 tuples/page C = 10, 20 tuples/page 10 F A tuples fit on a page, 10 F C tuples fit on a page 315
42 System R: Pass 1 (1-relation plans) Access methods for A: 1 heap scan cost = A = index scan on A.code, index I cost = I + A = = Keep 1 and 2 since 2 has interesting order on attribute to which is a join attribute. Access method for F : 1 heap scan cost = F = 10 Access method for C: 1 heap scan cost = C =
43 System R: Pass 2 (2-relation plans) Start with 1-relation plan to access A as outer:? Heap scan of A as outer: 1? = NL- A F cost = F = = ? = M- (assume 2-way sort/merge): cost = F + F = Index scan of A as outer: 3? = NL- cost = F = = ? = M- (assume 2-way sort/merge): cost = F + F = Keep 4 only (N.B. uses interesting order in non-optimal sub-plan!) 317
44 System R: Pass 2 (cont d) Start with F as outer:? A as inner: F A/C? 1? = NL-, heap scan A cost = F + F A = ? = INL-, index scan A cost = F + F (height(i) + 1) = (3 + 1) = 410 3? = M-, heap scan A cost = F + A + 2 ( F + A ) = ? = M-, index scan A cost = F + 2 F = Keep! C as inner: 5? = NL- cost = F + F C = = 110 6? = M- Keep! cost = F + C + 2 ( F + C ) = ( ) =
45 System R: Pass 2 (cont d) Start with C as outer:? C F 1? = NL- cost = C + C F = = 110 2? = M- cost = C + F + 2 ( C + F ) = ( ) = 60 Keep! N.B. C A not enumerated because of cross product ( ) avoidance. 319
46 System R: further pruning of 2-relation plans A F : M- INL- 1 index A F scan 2 scan F A index cost = , order on to cost = 410, no order C F : M- M- 3 scan C F scan 4 scan F C scan cost = 60, order on flightno cost = 60, order on flightno Keep 2 and 3 or 4 (order in 1 not interesting for subsequent join(s)). 320
47 System R: Pass 3 (3-relation plans) Best (A F ) sub-plan: cost = 410, no order, A F = 10 NL- INL- C 1 scan F A index cost = A F C = = M- INL- C scan F A index cost = C + 2 ( A F + C ) = ( ) =
48 System R: Pass 3 (cont d) Best (C F ) sub-plan: cost = 60, order on flightno, C F = 10, C F = 100 NL- M- A scan scan F C scan cost = M- M- A scan scan F C scan cost = INL- M- A index scan F C scan cost = = 460 M- M- A index scan F C scan cost =
49 System R: And the winner is... INL- M- A index cost = 460 Observations: scan F C scan Best plan mixes join algorithms and exploits indexes. Worst plan had cost > (exact cost unknown due to pruning). Optimization yielded 1000-fold improvement over worst plan! 323
50 Bibliography Astrahan, M. M., Schkolnick, M., and Kim, W. (1980). Performance of the System R access path selection mechanism. In IFIP Congress, pages Chamberlin, D., Astrahan, M., Blasgen, M., Gray, J., King, W., Lindsay, B., Lorie, R., Mehl, J., Price, T., Putzolu, F., Selinger, P., Schkolnick, M., Shultz, D., Traiger, I., Wade, B., and Yost, R. (1981). History and evaluation of System/R. Communications of the ACM, 24(10): Jarke, M. and Koch, J. (1984). Query optimization in database systems. ACM Computing Surveys, 16(2): Ramakrishnan, R. and Gehrke, J. (2003). Database Management Systems. McGraw-Hill, New York, 3 edition. W. Kim, D.S. Reiner, D. B., editor (1985). Query Processing in Database Systems. Springer-Verlag. 324
Query Processing & Optimization
Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction
More informationPrinciples of Data Management. Lecture #9 (Query Processing Overview)
Principles of Data Management Lecture #9 (Query Processing Overview) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v Midterm
More informationChapter 12: Query Processing
Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation
More informationQUERY OPTIMIZATION E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 QUERY OPTIMIZATION
E0 261 Jayant Haritsa Computer Science and Automation Indian Institute of Science JAN 2014 Slide 1 Database Engines Main Components Query Processing Transaction Processing Access Methods JAN 2014 Slide
More informationRelational Query Optimization. Highlights of System R Optimizer
Relational Query Optimization Chapter 15 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Highlights of System R Optimizer v Impact: Most widely used currently; works well for < 10 joins.
More informationReview. Relational Query Optimization. Query Optimization Overview (cont) Query Optimization Overview. Cost-based Query Sub-System
Review Relational Query Optimization R & G Chapter 12/15 Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory
More informationChapter 13: Query Optimization. Chapter 13: Query Optimization
Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Equivalent Relational Algebra Expressions Statistical
More informationR & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:
Relational Query Optimization R & G Chapter 13 Review Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory
More informationAdvanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch
Advanced Databases Lecture 4 - Query Optimization Masood Niazi Torshiz Islamic Azad university- Mashhad Branch www.mniazi.ir Query Optimization Introduction Transformation of Relational Expressions Catalog
More informationQuery Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors:
Query Optimization Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Schema for Examples (sid: integer, sname: string, rating: integer, age: real) (sid: integer, bid: integer, day: dates,
More informationChapter 11: Query Optimization
Chapter 11: Query Optimization Chapter 11: Query Optimization Introduction Transformation of Relational Expressions Statistical Information for Cost Estimation Cost-based optimization Dynamic Programming
More informationRelational Query Optimization
Relational Query Optimization Module 4, Lectures 3 and 4 Database Management Systems, R. Ramakrishnan 1 Overview of Query Optimization Plan: Tree of R.A. ops, with choice of alg for each op. Each operator
More informationPrinciples of Data Management. Lecture #12 (Query Optimization I)
Principles of Data Management Lecture #12 (Query Optimization I) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v B+ tree
More informationSchema for Examples. Query Optimization. Alternative Plans 1 (No Indexes) Motivating Example. Alternative Plans 2 With Indexes
Schema for Examples Query Optimization (sid: integer, : string, rating: integer, age: real) (sid: integer, bid: integer, day: dates, rname: string) Similar to old schema; rname added for variations. :
More informationModule 9: Selectivity Estimation
Module 9: Selectivity Estimation Module Outline 9.1 Query Cost and Selectivity Estimation 9.2 Database profiles 9.3 Sampling 9.4 Statistics maintained by commercial DBMS Web Forms Transaction Manager Lock
More informationOverview of Implementing Relational Operators and Query Evaluation
Overview of Implementing Relational Operators and Query Evaluation Chapter 12 Motivation: Evaluating Queries The same query can be evaluated in different ways. The evaluation strategy (plan) can make orders
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs
More informationRelational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007
Relational Query Optimization Yanlei Diao UMass Amherst October 23 & 25, 2007 Slide Content Courtesy of R. Ramakrishnan, J. Gehrke, and J. Hellerstein 1 Overview of Query Evaluation Query Evaluation Plan:
More informationQuery processing and optimization
Query processing and optimization These slides are a modified version of the slides of the book Database System Concepts (Chapter 13 and 14), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan.
More informationDatabase System Concepts
Chapter 14: Optimization Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2007/2008 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth and Sudarshan.
More informationQuery Optimization. Shuigeng Zhou. December 9, 2009 School of Computer Science Fudan University
Query Optimization Shuigeng Zhou December 9, 2009 School of Computer Science Fudan University Outline Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational
More informationCh 5 : Query Processing & Optimization
Ch 5 : Query Processing & Optimization Basic Steps in Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation Basic Steps in Query Processing (Cont.) Parsing and translation translate
More informationFaloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline
Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415 Faloutsos 1 introduction selection projection
More informationHash-Based Indexing 165
Hash-Based Indexing 165 h 1 h 0 h 1 h 0 Next = 0 000 00 64 32 8 16 000 00 64 32 8 16 A 001 01 9 25 41 73 001 01 9 25 41 73 B 010 10 10 18 34 66 010 10 10 18 34 66 C Next = 3 011 11 11 19 D 011 11 11 19
More informationQuery Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors:
Query Optimization atabase Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Schema for Examples (sid: integer, sname: string, rating: integer, age: real) (sid: integer, bid: integer, day: dates,
More informationAdministrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments
Administrivia Midterm on Thursday 10/18 CS 133: Databases Fall 2018 Lec 12 10/16 Prof. Beth Trushkowsky Assignments Lab 3 starts after fall break No problem set out this week Goals for Today Cost-based
More informationIntroduction Alternative ways of evaluating a given query using
Query Optimization Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational Expressions Dynamic Programming for Choosing Evaluation Plans Introduction
More informationCS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing
CS 4604: Introduction to Database Management Systems B. Aditya Prakash Lecture #10: Query Processing Outline introduction selection projection join set & aggregate operations Prakash 2018 VT CS 4604 2
More informationDBMS Query evaluation
Data Management for Data Science DBMS Maurizio Lenzerini, Riccardo Rosati Corso di laurea magistrale in Data Science Sapienza Università di Roma Academic Year 2016/2017 http://www.dis.uniroma1.it/~rosati/dmds/
More informationContents Contents Introduction Basic Steps in Query Processing Introduction Transformation of Relational Expressions...
Contents Contents...283 Introduction...283 Basic Steps in Query Processing...284 Introduction...285 Transformation of Relational Expressions...287 Equivalence Rules...289 Transformation Example: Pushing
More information15-415/615 Faloutsos 1
Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415/615 Faloutsos 1 Outline introduction selection
More informationCMSC424: Database Design. Instructor: Amol Deshpande
CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu Databases Data Models Conceptual representa1on of the data Data Retrieval How to ask ques1ons of the database How to answer those ques1ons
More informationIntroduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe
Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms
More informationOverview of Query Evaluation. Chapter 12
Overview of Query Evaluation Chapter 12 1 Outline Query Optimization Overview Algorithm for Relational Operations 2 Overview of Query Evaluation DBMS keeps descriptive data in system catalogs. SQL queries
More informationOutline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection
Outline Query Processing Overview Algorithms for basic operations Sorting Selection Join Projection Query optimization Heuristics Cost-based optimization 19 Estimate I/O Cost for Implementations Count
More informationChapter 14: Query Optimization
Chapter 14: Query Optimization Database System Concepts 5 th Ed. See www.db-book.com for conditions on re-use Chapter 14: Query Optimization Introduction Transformation of Relational Expressions Catalog
More informationOverview of Query Evaluation. Overview of Query Evaluation
Overview of Query Evaluation Chapter 12 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Overview of Query Evaluation v Plan: Tree of R.A. ops, with choice of alg for each op. Each operator
More informationQUERY OPTIMIZATION [CH 15]
Spring 2017 QUERY OPTIMIZATION [CH 15] 4/12/17 CS 564: Database Management Systems; (c) Jignesh M. Patel, 2013 1 Example SELECT distinct ename FROM Emp E, Dept D WHERE E.did = D.did and D.dname = Toy EMP
More informationRelational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst March 8 and 13, 2007
Relational Query Optimization Yanlei Diao UMass Amherst March 8 and 13, 2007 Slide Content Courtesy of R. Ramakrishnan, J. Gehrke, and J. Hellerstein 1 Overview of Query Evaluation Query Evaluation Plan:
More informationQuery Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement.
COS 597: Principles of Database and Information Systems Query Optimization Query Optimization Query as expression over relational algebraic operations Get evaluation (parse) tree Leaves: base relations
More informationAdvances in Data Management Query Processing and Query Optimisation A.Poulovassilis
1 Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis 1 General approach to the implementation of Query Processing and Query Optimisation functionalities in DBMSs 1. Parse
More informationDatabase System Concepts
Chapter 13: Query Processing s Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth
More informationEvaluation of Relational Operations: Other Techniques. Chapter 14 Sayyed Nezhadi
Evaluation of Relational Operations: Other Techniques Chapter 14 Sayyed Nezhadi Schema for Examples Sailors (sid: integer, sname: string, rating: integer, age: real) Reserves (sid: integer, bid: integer,
More informationAlgorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)
Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two
More informationChapter 12: Query Processing. Chapter 12: Query Processing
Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join
More informationAn SQL query is parsed into a collection of query blocks optimize one block at a time. Nested blocks are usually treated as calls to a subroutine
QUERY OPTIMIZATION 1 QUERY OPTIMIZATION QUERY SUB-SYSTEM 2 ROADMAP 3. 12 QUERY BLOCKS: UNITS OF OPTIMIZATION An SQL query is parsed into a collection of query blocks optimize one block at a time. Nested
More informationChapter 3. Algorithms for Query Processing and Optimization
Chapter 3 Algorithms for Query Processing and Optimization Chapter Outline 1. Introduction to Query Processing 2. Translating SQL Queries into Relational Algebra 3. Algorithms for External Sorting 4. Algorithms
More informationImplementing Relational Operators: Selection, Projection, Join. Database Management Systems, R. Ramakrishnan and J. Gehrke 1
Implementing Relational Operators: Selection, Projection, Join Database Management Systems, R. Ramakrishnan and J. Gehrke 1 Readings [RG] Sec. 14.1-14.4 Database Management Systems, R. Ramakrishnan and
More informationQuery Optimization. Chapter 13
Query Optimization Chapter 13 What we want to cover today Overview of query optimization Generating equivalent expressions Cost estimation 432/832 2 Chapter 13 Query Optimization OVERVIEW 432/832 3 Query
More informationChapter 13: Query Optimization
Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Transformation of Relational Expressions Catalog
More informationExamples of Physical Query Plan Alternatives. Selected Material from Chapters 12, 14 and 15
Examples of Physical Query Plan Alternatives Selected Material from Chapters 12, 14 and 15 1 Query Optimization NOTE: SQL provides many ways to express a query. HENCE: System has many options for evaluating
More informationOverview of Query Evaluation
Overview of Query Evaluation Chapter 12 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Overview of Query Evaluation Plan: Tree of R.A. ops, with choice of alg for each op. Each operator
More informationFinal Exam Review 2. Kathleen Durant CS 3200 Northeastern University Lecture 23
Final Exam Review 2 Kathleen Durant CS 3200 Northeastern University Lecture 23 QUERY EVALUATION PLAN Representation of a SQL Command SELECT {DISTINCT} FROM {WHERE
More informationDBMS Y3/S5. 1. OVERVIEW The steps involved in processing a query are: 1. Parsing and translation. 2. Optimization. 3. Evaluation.
Query Processing QUERY PROCESSING refers to the range of activities involved in extracting data from a database. The activities include translation of queries in high-level database languages into expressions
More informationRelational Query Optimization
Relational Query Optimization Chapter 15 Ramakrishnan & Gehrke (Sections 15.1-15.6) CPSC404, Laks V.S. Lakshmanan 1 What you will learn from this lecture Cost-based query optimization (System R) Plan space
More informationChapter 14 Query Optimization
Chapter 14 Query Optimization Chapter 14: Query Optimization! Introduction! Catalog Information for Cost Estimation! Estimation of Statistics! Transformation of Relational Expressions! Dynamic Programming
More informationChapter 14 Query Optimization
Chapter 14 Query Optimization Chapter 14: Query Optimization! Introduction! Catalog Information for Cost Estimation! Estimation of Statistics! Transformation of Relational Expressions! Dynamic Programming
More informationChapter 14 Query Optimization
Chapter 14: Query Optimization Chapter 14 Query Optimization! Introduction! Catalog Information for Cost Estimation! Estimation of Statistics! Transformation of Relational Expressions! Dynamic Programming
More informationChapter 12: Query Processing
Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join
More informationQuery optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.
Database Management Systems DBMS Architecture SQL INSTRUCTION OPTIMIZER MANAGEMENT OF ACCESS METHODS CONCURRENCY CONTROL BUFFER MANAGER RELIABILITY MANAGEMENT Index Files Data Files System Catalog DATABASE
More informationOverview of Query Processing. Evaluation of Relational Operations. Why Sort? Outline. Two-Way External Merge Sort. 2-Way Sort: Requires 3 Buffer Pages
Overview of Query Processing Query Parser Query Processor Evaluation of Relational Operations Query Rewriter Query Optimizer Query Executor Yanlei Diao UMass Amherst Lock Manager Access Methods (Buffer
More informationCSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 9 - Query optimization
CSE 544 Principles of Database Management Systems Magdalena Balazinska Fall 2007 Lecture 9 - Query optimization References Access path selection in a relational database management system. Selinger. et.
More informationCompSci 516 Data Intensive Computing Systems. Lecture 11. Query Optimization. Instructor: Sudeepa Roy
CompSci 516 Data Intensive Computing Systems Lecture 11 Query Optimization Instructor: Sudeepa Roy Duke CS, Fall 2017 CompSci 516: Database Systems 1 Announcements HW2 has been posted on sakai Due on Oct
More informationQuery Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016
Query Processing Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Slides re-used with some modification from www.db-book.com Reference: Database System Concepts, 6 th Ed. By Silberschatz,
More informationCS330. Query Processing
CS330 Query Processing 1 Overview of Query Evaluation Plan: Tree of R.A. ops, with choice of alg for each op. Each operator typically implemented using a `pull interface: when an operator is `pulled for
More informationAdministrivia. Relational Query Optimization (this time we really mean it) Review: Query Optimization. Overview: Query Optimization
Relational Query Optimization (this time we really mean it) R&G hapter 15 Lecture 25 dministrivia Homework 5 mostly available It will be due after classes end, Monday 12/8 Only 3 more lectures left! Next
More informationChapter 13: Query Processing
Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing
More informationWhat happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques
376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 16 Query optimization What happens Database is given a query Query is scanned - scanner creates a list
More information! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for
Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and
More informationChapter 13: Query Processing Basic Steps in Query Processing
Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and
More informationCSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 7 - Query optimization
CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 7 - Query optimization Announcements HW1 due tonight at 11:45pm HW2 will be due in two weeks You get to implement your own
More informationSomething to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact:
Query Evaluation Techniques for large DB Part 1 Fact: While data base management systems are standard tools in business data processing they are slowly being introduced to all the other emerging data base
More informationAdvanced Database Systems
Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed
More informationImplementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12
Implementation of Relational Operations CS 186, Fall 2002, Lecture 19 R&G - Chapter 12 First comes thought; then organization of that thought, into ideas and plans; then transformation of those plans into
More informationQuery Processing and Optimization *
OpenStax-CNX module: m28213 1 Query Processing and Optimization * Nguyen Kim Anh This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Query processing is
More informationDatabase Management System
Database Management System Lecture Join * Some materials adapted from R. Ramakrishnan, J. Gehrke and Shawn Bowers Today s Agenda Join Algorithm Database Management System Join Algorithms Database Management
More informationParser: SQL parse tree
Jinze Liu Parser: SQL parse tree Good old lex & yacc Detect and reject syntax errors Validator: parse tree logical plan Detect and reject semantic errors Nonexistent tables/views/columns? Insufficient
More informationCSE 444: Database Internals. Lecture 11 Query Optimization (part 2)
CSE 444: Database Internals Lecture 11 Query Optimization (part 2) CSE 444 - Spring 2014 1 Reminders Homework 2 due tonight by 11pm in drop box Alternatively: turn it in in my office by 4pm, or in Priya
More informationDatabase Applications (15-415)
Database Applications (15-415) DMS Internals- Part X Lecture 21, April 7, 2015 Mohammad Hammoud Last Session: DMS Internals- Part IX Query Optimization Today Today s Session: DMS Internals- Part X Query
More informationEvaluation of Relational Operations: Other Techniques
Evaluation of Relational Operations: Other Techniques Chapter 12, Part B Database Management Systems 3ed, R. Ramakrishnan and Johannes Gehrke 1 Using an Index for Selections v Cost depends on #qualifying
More informationQuery Processing Strategies and Optimization
Query Processing Strategies and Optimization CPS352: Database Systems Simon Miner Gordon College Last Revised: 10/25/12 Agenda Check-in Design Project Presentations Query Processing Programming Project
More informationModule 4: Tree-Structured Indexing
Module 4: Tree-Structured Indexing Module Outline 4.1 B + trees 4.2 Structure of B + trees 4.3 Operations on B + trees 4.4 Extensions 4.5 Generalized Access Path 4.6 ORACLE Clusters Web Forms Transaction
More informationDatabase Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building
External Sorting and Query Optimization A.R. Hurson 323 CS Building External sorting When data to be sorted cannot fit into available main memory, external sorting algorithm must be applied. Naturally,
More informationECS 165B: Database System Implementa6on Lecture 7
ECS 165B: Database System Implementa6on Lecture 7 UC Davis April 12, 2010 Acknowledgements: por6ons based on slides by Raghu Ramakrishnan and Johannes Gehrke. Class Agenda Last 6me: Dynamic aspects of
More informationAnnouncement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17
Announcement CompSci 516 Database Systems Lecture 10 Query Evaluation and Join Algorithms Project proposal pdf due on sakai by 5 pm, tomorrow, Thursday 09/27 One per group by any member Instructor: Sudeepa
More informationEvaluation of Relational Operations
Evaluation of Relational Operations Yanlei Diao UMass Amherst March 13 and 15, 2006 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 Relational Operations We will consider how to implement: Selection
More informationOverview of Query Evaluation
Overview of Query Evaluation Chapter 12 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Overview of Query Evaluation Plan: Tree of R.A. operators, with choice of algorithm for each operator.
More informationMagda Balazinska - CSE 444, Spring
Query Optimization Algorithm CE 444: Database Internals Lectures 11-12 Query Optimization (part 2) Enumerate alternative plans (logical & physical) Compute estimated cost of each plan Compute number of
More informationQuery Processing & Optimization. CS 377: Database Systems
Query Processing & Optimization CS 377: Database Systems Recap: File Organization & Indexing Physical level support for data retrieval File organization: ordered or sequential file to find items using
More informationPrinciples of Data Management. Lecture #13 (Query Optimization II)
Principles of Data Management Lecture #13 (Query Optimization II) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v Reminder:
More informationCSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 7 - Query execution
CSE 544 Principles of Database Management Systems Magdalena Balazinska Fall 2007 Lecture 7 - Query execution References Generalized Search Trees for Database Systems. J. M. Hellerstein, J. F. Naughton
More informationQUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E)
QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E) 2 LECTURE OUTLINE Query Processing Methodology Basic Operations and Their Costs Generation of Execution Plans 3 QUERY PROCESSING IN A DDBMS
More informationReminders. Query Optimizer Overview. The Three Parts of an Optimizer. Dynamic Programming. Search Algorithm. CSE 444: Database Internals
Reminders CSE 444: Database Internals Lab 2 is due on Wednesday HW 5 is due on Friday Lecture 11 Query Optimization (part 2) CSE 444 - Winter 2017 1 CSE 444 - Winter 2017 2 Query Optimizer Overview Input:
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals- Part VIII Lecture 16, March 19, 2014 Mohammad Hammoud Today Last Session: DBMS Internals- Part VII Algorithms for Relational Operations (Cont d) Today s Session:
More informationATYPICAL RELATIONAL QUERY OPTIMIZER
14 ATYPICAL RELATIONAL QUERY OPTIMIZER Life is what happens while you re busy making other plans. John Lennon In this chapter, we present a typical relational query optimizer in detail. We begin by discussing
More informationQuery Processing and Query Optimization. Prof Monika Shah
Query Processing and Query Optimization Query Processing SQL Query Is in Library Cache? System catalog (Dict / Dict cache) Scan and verify relations Parse into parse tree (relational Calculus) View definitions
More informationCMSC 424 Database design Lecture 18 Query optimization. Mihai Pop
CMSC 424 Database design Lecture 18 Query optimization Mihai Pop More midterm solutions Projects do not be late! Admin Introduction Alternative ways of evaluating a given query Equivalent expressions Different
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals- Part VI Lecture 14, March 12, 2014 Mohammad Hammoud Today Last Session: DBMS Internals- Part V Hash-based indexes (Cont d) and External Sorting Today s Session:
More informationEECS 647: Introduction to Database Systems
EECS 647: Introduction to Database Systems Instructor: Luke Huan Spring 2009 External Sorting Today s Topic Implementing the join operation 4/8/2009 Luke Huan Univ. of Kansas 2 Review DBMS Architecture
More informationSystem R Optimization (contd.)
System R Optimization (contd.) Instructor: Sharma Chakravarthy sharma@cse.uta.edu The University of Texas @ Arlington Database Management Systems, S. Chakravarthy 1 Optimization Criteria number of page
More information