DATABASE DESIGN II - 1DL400
|
|
- Elwin Jones
- 5 years ago
- Views:
Transcription
1 DTSE DESIGN II - 1DL400 Fall 2016 course on modern database systems Kjell Orsborn Uppsala Database Laboratory Department of Information Technology, Uppsala University, Uppsala, Sweden 06/12/16 1
2 Query processing Elmasri/Navathe ch?? Padron-McCarthy/Risch ch?? Kjell Orsborn Uppsala Database Laboratory Department of Information Technology, Uppsala University, Uppsala, Sweden 06/12/16 2
3 What is query processing? given SQL query is translated by the query processor into a low level program called an execution plan n execution plan is a program in a functional language: The language physical relational algebra, specialized for internal storage representation in the DMS. The physical relational algebra extends the relational algebra with: Primitives to search through the internal storage structures of the DMS 06/12/16 3
4 What is query optimization? SQL is a very high level language: The users specify what to search for not how the search is actually done The algorithms are chosen automatically by the DMS For a given SQL query there may be very many possible execution plans The cost of these execution plans very widely The costs vary with orders of magnitude The task of the query optimizer is to choose the cheapest plan out of the possible ones The query optimizer is the most complex (and important) part of the query processor 06/12/16 4
5 Complexity of query optimization Very many possible execution plans for a given SQL query, e.g.: J joins can be permutated with different costs O(J!) In addition there are different join algorithms to choose from Query optimization combinatorial over # of operations Q in query. With dynamic programming O( Q 2 ) in best case 06/12/16 5
6 Why does query optimization pay off? Query optimization radically improves performance of executing query The complexity of a good plan may be O(log N), while a bad one is O(N 2 ), where N is size of the database Query optimization enables scalability of declarative queries Since N is large the payoff is huge The size of the results of the query Q << N Classical query optimization can handle up to ca 12 joins Good query optimizer critical for competitive DMS! Query optimization is the key to the success of SQL 06/12/16 6
7 Query Processing Steps SQL Query PRSER (parsing and semantic checking as in any compiler) Parse tree (~ tree structure representing relational calculus expression) OPTIMIZER (very advanced) Execution plan (physical relation algebra expression) EXECUTOR (execution plan interpreter) DMS kernel Data structures 06/12/16 7
8 Query Optimizer Steps Tuple calculus View expansion Tuple calculus Query transformations Tuple calculus Cost-based query optimization Physical relational algebra 06/12/16 8
9 View expansion view is a virtual table expressed through a single SQL statement, a named query Views are textually substituted (macro expanded) by view expansion View expansion makes query larger, and thus optimization becomes even slower. View expansion allows optimizer to look inside view definitions rather than regarding views as black boxes View expansion required in order to detect indexes hidden inside views 06/12/16 9
10 View expansion CRETE TLE SUPPLIES( STORE CHR(10), ITEM CHR(10), PRICE DECIML(10,2), PRIMRY KEY(STORE, ITEM)) View defined over supplies: CRETE VIEW ICSUPPLIES S SELECT * FROM SUPPLIES WHERE STORE = IC 06/12/16 10
11 View expansion SELECT PRICE FROM ICSUPPLIES S WHERE S.ITEM = Tomatoes Translated by the view expander into: SELECT PRICE FROM SUPPLIES S WHERE S.ITEM = Tomatoes ND S.STORE = IC Query optimizer will now discover index(es) on ITEM or STORE! These indexes would NOT have been discovered if view was black box. 06/12/16 11
12 lank 06/12/16 12
13 Cost-based query optimization Elmasri/Navathe ch?? Padron-McCarthy/Risch ch?? Kjell Orsborn Uppsala Database Laboratory Department of Information Technology, Uppsala University, Uppsala, Sweden 06/12/16 13
14 Cost-based query optimization Principle: 1. Generate all possible execution plans (heuristics to avoid some unlikely ones) 2. Estimate the cost of executing each of the generated plans using a cost model based on database statistics and properties of DMS algorithms 3. Choose the cheapest one Optimization criteria # of disk blocks read (dominates) only metric for classical disk DMS CPU usage, CP important when operators are computing much Communication costs for distributed data, CO important in distributed databases. Normally weighted average of different criteria: W 1 * D + W 2 * CP + W 3 * CO, where W 1, W 2, W 3 system configured weights The costs are computed based on data statistics and cost model for operators in physical relational algebra 06/12/16 14
15 Query execution plan (physical algebra) Query execution plan is functional program with evaluation primitives such as: Segment scan operator Primary index scan operators per index kind Secondary index scan operators per index kind One or several join algorithms Sort operator Duplicate elimination operator Stop after n tuples operator, stop_after(n,s) that emits the n first tuples in scan s Other DMS-specific operators Normally pipelined execution Streams of tuples produced as intermediate results 06/12/16 15
16 Degrees of freedom for optimizer Query plan must be correct and sufficiently optimized Optimizer chooses among physical operators over streams e.g.: rute force linear table scan (most general access path) Index scan of selected part of index structure (e.g. -tree, hash table) Reverse scans traversing indexes and tables backwards Materialize stream on disk or in memory if favorable Eliminate duplicates in stream Stop scanning stream s after n tuples read, stopafter(n,s) 06/12/16 16
17 Degrees of freedom for optimizer Optimizer chooses among join strategies, e.g.: Choose order of joining tables and intermediate result streams Choice of join algorithms, e.g. Nested Loop Join, Merge Join, Hash Join Sort intermediate results if necessary lgorithm choices may depend on available amount of main memory 06/12/16 17
18 Nested-loop join Nested-loop join traverses one operand stream and for each row seeks a matching row in the second stream Cost: C NLJ = b T + (b T * b U ) + b res 06/12/16 18
19 Sort-merge join Sort-merge join traverses the two operand streams simultaneously while matching a row from the first stream with another row in the second stream. (both T and U are assumed to be sorted on the join attributes.) Cost: C SMJ = b T + b U + b res 06/12/16 19
20 Hash join Hash join the operand stream T are read into the buffer and a hash table is created on its hash field (join attribute), then for each row i U any matching row in T is retrieved by applying the same hash function on T:s hash table. (assumes the complete hash table fits in memory - otherwise partitioned hash join) Cost: C HJ = 3 * (b T + b U ) + b res or Optimal cost: (b T + b U ) + b res 06/12/16 20
21 Query Cost Model Purpose: Estimate the cost of executing a given execution plan generated by the query optimizer asic cost parameters in cost models Cost of randomly accessing disk block i.e. time for disk arm positioning Data transfer rate i.e. time to move disk arm continously forwards or backwards while transferring data to main-memory Clustering order of data tuples on disk Non-clustered index have slow index scans Sort order of data tuples on disk No sorting necessary if query specifies order by and data ordered by index Cost models for different index access methods -tree structures - LH Cost models for different join methods Cost of sorting intermediate results 06/12/16 21
22 Total Query Cost The total cost estimate of an execution plan is a formula defined in terms of the basic cost parameters. The total cost depends on how often primitive operations are invoked if they are expensive. In classical DMSs only joins and scans were expensive while selections (<,>,=,!=) were cheap Now functions doing computations may have high costs e.g. statistics or user defined computations may be expensive The execution cost of an operator depends on data volume of intermediate results. The intermediate result sizes are estimated by statistical models over data stored in database. Central factor in these statistical models: Selectivity: The percentage tuples matched by a predicate 06/12/16 22
23 Selectivity Selectivity: Percentage tuples fulfilling a predicate Selectivity used for estimating size of (intermediate) query results. Example query joining of relations T1(SSN,NME) and T2(SSN,INCOME): SELECT t2.income FROM T1 t1, T2 t2 WHERE t1.nme = Kalle ND t2.income > ND t1.ssn = t2.ssn ssume indexes on T1.SSN, T2.SSN, T1.NME, and T2.INCOME. ssume we are using nested-loop-join (NLJ) as only join algorithm 06/12/16 23
24 Evaluated plans Plan : project(t2.income, select(nlj SSN (index_scan(t2.income>95000), T1), T1.NME= Kalle )) Plan : project(t2.income,select(nlj SSN (index_scan(t1.nme= Kalle ),T2), T2.INCOME>95000)) π T2.INCOME π T2.INCOME σ T1.NME= Kalle σ T2.INCOME>95000 NLJ NLJ T1.SSN=T2.SSN T2.SSN=T1.SSN IS T2.INCOME>95000 T1 IS T1.NME= Kalle T2 If T2.INCOME > selects fewer tuples than T1.NME = Kalle then plan seems best, otherwise plan seems best 06/12/16 24
25 Selectivity calculation selectivitity(p(t)) is defined as percentage of tuples t that are selected by predicate P(t) For example: selectivity(t2.income>95000) may be The selectivity depends on value distributions in selected table column For example, the distribution of values in T2.INCOME The DMS maintains statistics, e.g. Number of rows of each table T, cardinality(t) Number of different values of each table column T.C, distinct_count(t.c). max(t.c) and min(t.c) for each table column T.C 06/12/16 25
26 Estimating selectivity ssume: max(t2.income) =100000, min(t2.income) = distinct_count(t1.nme) = 1000 ssume uniform distributions of values in all columns Modern DMSs maintain histograms for better selectivity estimates Selectivity of T2.INCOME>95000: selectivity(t2.income > 95000) = (max(t2.income) ) / (max(t2.income ) - min(t2.income)) = ( ) / ( ) = Selectivity of T1.name= Kalle : selectivity(t1.nme = Kalle ) = 1/distinct_count(T1.NME) = 1/1000 = Notice that selectivity is independent of tables size 06/12/16 26
27 Computing intermediate result sizes Computing the size of the result of a physical operator requires the size of the table T, called table cardinality, denoted cardinality(t). For example: assume cardinality(t1) = and cardinality(t2) = 8000 Number of tuples matching predicate P, matches(p), is computed by multiplying selectivity(p) with size of table. For example matches(t1.nme = Kalle ) = selectivity(t1.nme = Kalle ) * cardinality(t1) = 0.001*10000 = 10 rows matches(t2.income > 95000) = selectivity(t2.income > 95000) * cardinality(t2) = * 8000 = 500 rows Optimal plan is Since: matches(t1.nme= Kalle ) < matches(t2.income>95000) 06/12/16 27
28 Execution plans with data flows 0.5 π T2.INCOME σ T1.NME= Kalle 0.5 π T2.INCOME 500 NLJ T1.SSN=T2.SSN IS T2.INCOME>95000 T1 σ T2.INCOME> NLJ T2.SSN=T1.SSN 10 8 (since card(t2)=0.8*card(t1)) IS T1.NME= Kalle T2 The final cost furthermore takes into account the costs of the operators (join, Index Scan) This costs depends on factors like join algorithm, sort order of stream, clustering of rows on disk, etc. (no more details here) 06/12/16 28
29 Data statistics DMS maintains statistics about data in tables in order to estimate size of intermediate results, for example: Rows in tables, cardinality(t) Row length in a table Number of different values in column, count_distinct(t.c) Histograms of distributions of column values DMS contains statistical models for estimating selectivities of composite predicates. Examples of statistical models assuming independent values: selectivity(p1 ND P2) = selectivity(p1) * selectivity(p2) selectivity(p1 OR P2) = selectivity(p1) + selectivity(p2) selectivity(p1) * selectivity(p2) selectivity(not P) = 1 selectivity(p) 06/12/16 29
30 Data statistics The models are often very rough For example, assuming statistical independence and flat distributions Work rather well since models used only for comparing different execution strategies - not for getting the exact execution costs. Modern DMS have more advanced models handling statistical dependencies and different distributions as well Cost of maintaining data statistics May be cheap: e.g size of relation, depth of -tree. More expensive: e.g. distribution on non-indexed columns, histograms Need to scan tables once in a while (single pass) The statistics can be refreshed on demand by D Statistics not always up-to-date Wrong statistics => sub-optimal but still correct plans 06/12/16 30
31 lank 06/12/16 31
32 Query optimization by Dynamic programming Elmasri/Navathe ch?? Padron-McCarthy/Risch ch?? Kjell Orsborn Uppsala Database Laboratory Department of Information Technology, Uppsala University, Uppsala, Sweden 06/12/16 32
33 Optimization algorithm Variants of dynamic programing (Dijsktra, ) normally used. Find optimal join order of: C Cost: 0 Rem:,,C 06/12/16 33
34 Optimization algorithm Variants of dynamic programing (Dijsktra, ) normally used. Find optimal join order of: C Cost: 0 Rem:,,C Cost: 1 Rem:,C Cost: 10 Rem:,C C Cost: 100 Rem:, 06/12/16 34
35 Optimization algorithm Variants of dynamic programing (Dijsktra, ) normally used. Find optimal join order of: C Cost: 0 Rem:,,C Cost: 1 Rem:,C Cost: 10 Rem:,C C Cost: 100 Rem:, Cost: 19 Rem: C Cost: 15 Rem: C 06/12/16 35
36 Optimization algorithm Variants of dynamic programing (Dijsktra, ) normally used. Find optimal join order of: C Cost: 0 Rem:,,C Cost: 1 Rem:,C Cost: 10 Rem:,C C Cost: 100 Rem:, Cost: 19 Rem: C Cost: 15 Rem: C Cost: 14 Rem: C Cost: 30 Rem: C 06/12/16 36
37 Optimization algorithm Variants of dynamic programing (Dijsktra, ) normally used. Find optimal join order of: C Cost: 0 Rem:,,C Cost: 1 Rem:,C Cost: 10 Rem:,C C Cost: 100 Rem:, Cost: 19 Rem: C Cost: 15 Rem: C Cost: 14 Rem: C Cost: 30 Rem: C C Cost: 20 Rem: 06/12/16 37
38 Optimization algorithm Variants of dynamic programing (Dijsktra, ) normally used. Find optimal join order of: C Cost: 0 Rem:,,C Cost: 1 Rem:,C Cost: 10 Rem:,C C Cost: 100 Rem:, Cost: 19 Rem: C Cost: 15 Rem: C Cost: 14 Rem: C Cost: 30 Rem: C C Cost: 35 Rem: C Cost: 20 Rem: 06/12/16 38
39 Optimization algorithm Variants of dynamic programing (Dijsktra, ) normally used. Find optimal join order of: C Cost: 0 Rem:,,C Cost: 1 Rem:,C Cost: 10 Rem:,C C Cost: 100 Rem:, Cost: 19 Rem: C Cost: 15 Rem: C Cost: 14 Rem: C Cost: 30 Rem: C C Cost: 21 Rem: C Cost: 35 Rem: C Cost: 20 Rem: 06/12/16 39
40 Optimization algorithm ll costs are total costs to execute each partial join The cost of a new node is computed by using the cost model to compute the cost of the new partial join, given the previously computed cost of the node above In practice several parameters are computed in each node, i.e. CPU cost, disk access cost, communication cost. Cost: 0 Rem:,,C Cost: 1 Rem:,C Cost: 10 Rem:,C C Cost: 100 Rem:, Cost: 19 Rem: C Cost: 15 Rem: C Cost: 14 Rem: C Cost: 30 Rem: C C Cost: 21 Rem: C Cost: 35 Rem: C Cost: 20 Rem: 06/12/16 40
41 Optimizing large queries Don t optimize at all or partly, i.e. order of predicates significant (old Oracle, old MySQL) Optimize partly, i.e. up to ca 10 joins, leave rest (rem field) unoptimized (new Oracle) Heuristic methods (e.g. greedy optimization, hill climbing) Randomized (Monte Carlo) methods To speed up data access the user may sometimes manually break down very large queries into smaller optimizable queries This is often necessary for translating relational representations to complex object structures in application programs 06/12/16 41
What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques
376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 16 Query optimization What happens Database is given a query Query is scanned - scanner creates a list
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs
More informationIntroduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe
Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms
More informationDATABASE DESIGN - 1DL400
DATABASE DESIGN - 1DL400 Fall 2015 A course on modern database systems http://www.it.uu.se/research/group/udbl/kurser/dbii_ht15 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,
More informationThe physical database. Contents - physical database design DATABASE DESIGN I - 1DL300. Introduction to Physical Database Design
DATABASE DESIGN I - 1DL300 Fall 2011 Introduction to Physical Database Design Elmasri/Navathe ch 16 and 17 Padron-McCarthy/Risch ch 21 and 22 An introductory course on database systems http://www.it.uu.se/edu/course/homepage/dbastekn/ht11
More informationAdvanced Database Systems
Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed
More informationDatabase System Concepts
Chapter 13: Query Processing s Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth
More informationDatabase Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building
External Sorting and Query Optimization A.R. Hurson 323 CS Building External sorting When data to be sorted cannot fit into available main memory, external sorting algorithm must be applied. Naturally,
More informationChapter 12: Query Processing
Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation
More informationChapter 12: Query Processing. Chapter 12: Query Processing
Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join
More informationChapter 13: Query Processing
Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing
More informationDATABASE TECHNOLOGY - 1MB025 (also 1DL029, 1DL300+1DL400)
1 DATABASE TECHNOLOGY - 1MB025 (also 1DL029, 1DL300+1DL400) Spring 2008 An introductury course on database systems http://user.it.uu.se/~udbl/dbt-vt2008/ alt. http://www.it.uu.se/edu/course/homepage/dbastekn/vt08/
More informationChapter 12: Query Processing
Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join
More informationAlgorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)
Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two
More information! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for
Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and
More informationChapter 13: Query Processing Basic Steps in Query Processing
Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and
More informationQuery Processing. Introduction to Databases CompSci 316 Fall 2017
Query Processing Introduction to Databases CompSci 316 Fall 2017 2 Announcements (Tue., Nov. 14) Homework #3 sample solution posted in Sakai Homework #4 assigned today; due on 12/05 Project milestone #2
More informationHash-Based Indexing 165
Hash-Based Indexing 165 h 1 h 0 h 1 h 0 Next = 0 000 00 64 32 8 16 000 00 64 32 8 16 A 001 01 9 25 41 73 001 01 9 25 41 73 B 010 10 10 18 34 66 010 10 10 18 34 66 C Next = 3 011 11 11 19 D 011 11 11 19
More informationAdvances in Data Management Query Processing and Query Optimisation A.Poulovassilis
1 Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis 1 General approach to the implementation of Query Processing and Query Optimisation functionalities in DBMSs 1. Parse
More informationQuery Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement.
COS 597: Principles of Database and Information Systems Query Optimization Query Optimization Query as expression over relational algebraic operations Get evaluation (parse) tree Leaves: base relations
More information7. Query Processing and Optimization
7. Query Processing and Optimization Processing a Query 103 Indexing for Performance Simple (individual) index B + -tree index Matching index scan vs nonmatching index scan Unique index one entry and one
More informationCMSC424: Database Design. Instructor: Amol Deshpande
CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu Databases Data Models Conceptual representa1on of the data Data Retrieval How to ask ques1ons of the database How to answer those ques1ons
More informationChapter 3. Algorithms for Query Processing and Optimization
Chapter 3 Algorithms for Query Processing and Optimization Chapter Outline 1. Introduction to Query Processing 2. Translating SQL Queries into Relational Algebra 3. Algorithms for External Sorting 4. Algorithms
More informationQuery Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016
Query Processing Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Slides re-used with some modification from www.db-book.com Reference: Database System Concepts, 6 th Ed. By Silberschatz,
More informationEECS 647: Introduction to Database Systems
EECS 647: Introduction to Database Systems Instructor: Luke Huan Spring 2009 External Sorting Today s Topic Implementing the join operation 4/8/2009 Luke Huan Univ. of Kansas 2 Review DBMS Architecture
More informationDBMS Query evaluation
Data Management for Data Science DBMS Maurizio Lenzerini, Riccardo Rosati Corso di laurea magistrale in Data Science Sapienza Università di Roma Academic Year 2016/2017 http://www.dis.uniroma1.it/~rosati/dmds/
More informationModule 9: Selectivity Estimation
Module 9: Selectivity Estimation Module Outline 9.1 Query Cost and Selectivity Estimation 9.2 Database profiles 9.3 Sampling 9.4 Statistics maintained by commercial DBMS Web Forms Transaction Manager Lock
More informationFundamentals of Database Systems
Fundamentals of Database Systems Assignment: 4 September 21, 2015 Instructions 1. This question paper contains 10 questions in 5 pages. Q1: Calculate branching factor in case for B- tree index structure,
More informationQuery Processing and Query Optimization. Prof Monika Shah
Query Processing and Query Optimization Query Processing SQL Query Is in Library Cache? System catalog (Dict / Dict cache) Scan and verify relations Parse into parse tree (relational Calculus) View definitions
More informationQuery Processing & Optimization
Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction
More informationAnnouncement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17
Announcement CompSci 516 Database Systems Lecture 10 Query Evaluation and Join Algorithms Project proposal pdf due on sakai by 5 pm, tomorrow, Thursday 09/27 One per group by any member Instructor: Sudeepa
More informationParser: SQL parse tree
Jinze Liu Parser: SQL parse tree Good old lex & yacc Detect and reject syntax errors Validator: parse tree logical plan Detect and reject semantic errors Nonexistent tables/views/columns? Insufficient
More informationRelational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007
Relational Query Optimization Yanlei Diao UMass Amherst October 23 & 25, 2007 Slide Content Courtesy of R. Ramakrishnan, J. Gehrke, and J. Hellerstein 1 Overview of Query Evaluation Query Evaluation Plan:
More informationQuery Evaluation! References:! q [RG-3ed] Chapter 12, 13, 14, 15! q [SKS-6ed] Chapter 12, 13!
Query Evaluation! References:! q [RG-3ed] Chapter 12, 13, 14, 15! q [SKS-6ed] Chapter 12, 13! q Overview! q Optimization! q Measures of Query Cost! Query Evaluation! q Sorting! q Join Operation! q Other
More informationData Modeling and Databases Ch 9: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich
Data Modeling and Databases Ch 9: Query Processing - Algorithms Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Transactions (Locking, Logging) Metadata Mgmt (Schema, Stats) Application
More informationOutline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection
Outline Query Processing Overview Algorithms for basic operations Sorting Selection Join Projection Query optimization Heuristics Cost-based optimization 19 Estimate I/O Cost for Implementations Count
More informationDATABASE TECHNOLOGY - 1MB025
1 DATABASE TECHNOLOGY - 1MB025 Fall 2005 An introductury course on database systems http://user.it.uu.se/~udbl/dbt-ht2005/ alt. http://www.it.uu.se/edu/course/homepage/dbastekn/ht05/ Kjell Orsborn Uppsala
More informationIntroduction. 1. Introduction. Overview Query Processing Overview Query Optimization Overview Query Execution 3 / 591
1. Introduction Overview Query Processing Overview Query Optimization Overview Query Execution 3 / 591 Query Processing Reason for Query Optimization query languages like SQL are declarative query specifies
More informationOverview of Implementing Relational Operators and Query Evaluation
Overview of Implementing Relational Operators and Query Evaluation Chapter 12 Motivation: Evaluating Queries The same query can be evaluated in different ways. The evaluation strategy (plan) can make orders
More informationSQL QUERY EVALUATION. CS121: Relational Databases Fall 2017 Lecture 12
SQL QUERY EVALUATION CS121: Relational Databases Fall 2017 Lecture 12 Query Evaluation 2 Last time: Began looking at database implementation details How data is stored and accessed by the database Using
More informationRelational Query Optimization
Relational Query Optimization Module 4, Lectures 3 and 4 Database Management Systems, R. Ramakrishnan 1 Overview of Query Optimization Plan: Tree of R.A. ops, with choice of alg for each op. Each operator
More informationChapter 19 Query Optimization
Chapter 19 Query Optimization It is an activity conducted by the query optimizer to select the best available strategy for executing the query. 1. Query Trees and Heuristics for Query Optimization - Apply
More informationModule 4. Implementation of XQuery. Part 0: Background on relational query processing
Module 4 Implementation of XQuery Part 0: Background on relational query processing The Data Management Universe Lecture Part I Lecture Part 2 2 What does a Database System do? Input: SQL statement Output:
More informationData Modeling and Databases Ch 10: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich
Data Modeling and Databases Ch 10: Query Processing - Algorithms Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Transactions (Locking, Logging) Metadata Mgmt (Schema, Stats) Application
More informationPrinciples of Data Management. Lecture #9 (Query Processing Overview)
Principles of Data Management Lecture #9 (Query Processing Overview) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v Midterm
More informationLassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems
Lassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems Last Name: First Name: Student ID: 1. Exam is 2 hours long 2. Closed books/notes Problem 1 (6 points) Consider
More informationCS 564 Final Exam Fall 2015 Answers
CS 564 Final Exam Fall 015 Answers A: STORAGE AND INDEXING [0pts] I. [10pts] For the following questions, clearly circle True or False. 1. The cost of a file scan is essentially the same for a heap file
More informationUniversity of Waterloo Midterm Examination Sample Solution
1. (4 total marks) University of Waterloo Midterm Examination Sample Solution Winter, 2012 Suppose that a relational database contains the following large relation: Track(ReleaseID, TrackNum, Title, Length,
More informationExamples of Physical Query Plan Alternatives. Selected Material from Chapters 12, 14 and 15
Examples of Physical Query Plan Alternatives Selected Material from Chapters 12, 14 and 15 1 Query Optimization NOTE: SQL provides many ways to express a query. HENCE: System has many options for evaluating
More informationQuery Processing & Optimization. CS 377: Database Systems
Query Processing & Optimization CS 377: Database Systems Recap: File Organization & Indexing Physical level support for data retrieval File organization: ordered or sequential file to find items using
More informationσ (R.B = 1 v R.C > 3) (S.D = 2) Conjunctive normal form Topics for the Day Distributed Databases Query Processing Steps Decomposition
Topics for the Day Distributed Databases Query processing in distributed databases Localization Distributed query operators Cost-based optimization C37 Lecture 1 May 30, 2001 1 2 Query Processing teps
More information! Parallel machines are becoming quite common and affordable. ! Databases are growing increasingly large
Chapter 20: Parallel Databases Introduction! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems!
More informationChapter 20: Parallel Databases
Chapter 20: Parallel Databases! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems 20.1 Introduction!
More informationChapter 20: Parallel Databases. Introduction
Chapter 20: Parallel Databases! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems 20.1 Introduction!
More informationSpring 2017 QUERY PROCESSING [JOINS, SET OPERATIONS, AND AGGREGATES] 2/19/17 CS 564: Database Management Systems; (c) Jignesh M.
Spring 2017 QUERY PROCESSING [JOINS, SET OPERATIONS, AND AGGREGATES] 2/19/17 CS 564: Database Management Systems; (c) Jignesh M. Patel, 2013 1 Joins The focus here is on equijoins These are very common,
More informationDATABASE DESIGN I - 1DL300
DATABASE DESIGN I - 1DL300 Fall 2009 An introductury course on database systems http://user.it.uu.se/~udbl/dbt1-ht2009/ alt. http://www.it.uu.se/edu/course/homepage/dbastekn/ht09/ Kjell Orsborn Uppsala
More informationDatabase Tuning and Physical Design: Basics of Query Execution
Database Tuning and Physical Design: Basics of Query Execution Spring 2018 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Query Execution 1 / 43 The Client/Server
More informationFaloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline
Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415 Faloutsos 1 introduction selection projection
More informationChapter 17: Parallel Databases
Chapter 17: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery Parallelism Intraoperation Parallelism Interoperation Parallelism Design of Parallel Systems Database Systems
More informationReview. Support for data retrieval at the physical level:
Query Processing Review Support for data retrieval at the physical level: Indices: data structures to help with some query evaluation: SELECTION queries (ssn = 123) RANGE queries (100
More informationCSIT5300: Advanced Database Systems
CSIT5300: Advanced Database Systems L10: Query Processing Other Operations, Pipelining and Materialization Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science
More informationCSE 190D Spring 2017 Final Exam Answers
CSE 190D Spring 2017 Final Exam Answers Q 1. [20pts] For the following questions, clearly circle True or False. 1. The hash join algorithm always has fewer page I/Os compared to the block nested loop join
More informationQUERY OPTIMIZATION FOR DATABASE MANAGEMENT SYSTEM BY APPLYING DYNAMIC PROGRAMMING ALGORITHM
QUERY OPTIMIZATION FOR DATABASE MANAGEMENT SYSTEM BY APPLYING DYNAMIC PROGRAMMING ALGORITHM Wisnu Adityo NIM 13506029 Information Technology Department Institut Teknologi Bandung Jalan Ganesha 10 e-mail:
More informationEvaluation of relational operations
Evaluation of relational operations Iztok Savnik, FAMNIT Slides & Textbook Textbook: Raghu Ramakrishnan, Johannes Gehrke, Database Management Systems, McGraw-Hill, 3 rd ed., 2007. Slides: From Cow Book
More informationChapter 18: Parallel Databases
Chapter 18: Parallel Databases Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 18: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery
More informationChapter 18: Parallel Databases. Chapter 18: Parallel Databases. Parallelism in Databases. Introduction
Chapter 18: Parallel Databases Chapter 18: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery Parallelism Intraoperation Parallelism Interoperation Parallelism Design of
More informationR & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:
Relational Query Optimization R & G Chapter 13 Review Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory
More informationEvaluation of Relational Operations
Evaluation of Relational Operations Yanlei Diao UMass Amherst March 13 and 15, 2006 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 Relational Operations We will consider how to implement: Selection
More informationCMSC424: Database Design. Instructor: Amol Deshpande
CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu Databases Data Models Conceptual representa1on of the data Data Retrieval How to ask ques1ons of the database How to answer those ques1ons
More informationModule 9: Query Optimization
Module 9: Query Optimization Module Outline Web Forms Applications SQL Interface 9.1 Outline of Query Optimization 9.2 Motivating Example 9.3 Equivalences in the relational algebra 9.4 Heuristic optimization
More informationDistributed Databases Systems
Distributed Databases Systems Lecture No. 05 Query Processing Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro Outline
More informationSchema for Examples. Query Optimization. Alternative Plans 1 (No Indexes) Motivating Example. Alternative Plans 2 With Indexes
Schema for Examples Query Optimization (sid: integer, : string, rating: integer, age: real) (sid: integer, bid: integer, day: dates, rname: string) Similar to old schema; rname added for variations. :
More informationCSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 9 - Query optimization
CSE 544 Principles of Database Management Systems Magdalena Balazinska Fall 2007 Lecture 9 - Query optimization References Access path selection in a relational database management system. Selinger. et.
More informationFinal Review. May 9, 2017
Final Review May 9, 2017 1 SQL 2 A Basic SQL Query (optional) keyword indicating that the answer should not contain duplicates SELECT [DISTINCT] target-list A list of attributes of relations in relation-list
More informationL22: The Relational Model (continued) CS3200 Database design (sp18 s2) 4/5/2018
L22: The Relational Model (continued) CS3200 Database design (sp18 s2) https://course.ccs.neu.edu/cs3200sp18s2/ 4/5/2018 256 Announcements! Please pick up your exam if you have not yet HW6 will include
More informationFinal Review. May 9, 2018 May 11, 2018
Final Review May 9, 2018 May 11, 2018 1 SQL 2 A Basic SQL Query (optional) keyword indicating that the answer should not contain duplicates SELECT [DISTINCT] target-list A list of attributes of relations
More informationQuery Execution [15]
CSC 661, Principles of Database Systems Query Execution [15] Dr. Kalpakis http://www.csee.umbc.edu/~kalpakis/courses/661 Query processing involves Query processing compilation parsing to construct parse
More informationCS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing
CS 4604: Introduction to Database Management Systems B. Aditya Prakash Lecture #10: Query Processing Outline introduction selection projection join set & aggregate operations Prakash 2018 VT CS 4604 2
More informationQuery Processing and Optimization *
OpenStax-CNX module: m28213 1 Query Processing and Optimization * Nguyen Kim Anh This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Query processing is
More informationQuery optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.
Database Management Systems DBMS Architecture SQL INSTRUCTION OPTIMIZER MANAGEMENT OF ACCESS METHODS CONCURRENCY CONTROL BUFFER MANAGER RELIABILITY MANAGEMENT Index Files Data Files System Catalog DATABASE
More informationLecture 15: The Details of Joins
Lecture 15 Lecture 15: The Details of Joins (and bonus!) Lecture 15 > Section 1 What you will learn about in this section 1. How to choose between BNLJ, SMJ 2. HJ versus SMJ 3. Buffer Manager Detail (PS#3!)
More informationQuestion 1 (a) 10 marks
Question 1 (a) Consider the tree in the next slide. Find any/ all violations of a B+tree structure. Identify each bad node and give a brief explanation of each error. Assume the order of the tree is 4
More informationQuery Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors:
Query Optimization Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Schema for Examples (sid: integer, sname: string, rating: integer, age: real) (sid: integer, bid: integer, day: dates,
More informationQUERY OPTIMIZATION. CS 564- Spring ACKs: Jeff Naughton, Jignesh Patel, AnHai Doan
QUERY OPTIMIZATION CS 564- Spring 2018 ACKs: Jeff Naughton, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? What is a query optimizer? Generating query plans Cost estimation of query plans 2 ARCHITECTURE
More informationDATABASE TECHNOLOGY - 1DL124
1 DATABASE TECHNOLOGY - 1DL124 Summer 2007 An introductury course on database systems http://user.it.uu.se/~udbl/dbt-sommar07/ alt. http://www.it.uu.se/edu/course/homepage/dbdesign/st07/ Kjell Orsborn
More informationDatabase Applications (15-415)
Database Applications (15-415) DMS Internals- Part X Lecture 21, April 7, 2015 Mohammad Hammoud Last Session: DMS Internals- Part IX Query Optimization Today Today s Session: DMS Internals- Part X Query
More informationDBMS Y3/S5. 1. OVERVIEW The steps involved in processing a query are: 1. Parsing and translation. 2. Optimization. 3. Evaluation.
Query Processing QUERY PROCESSING refers to the range of activities involved in extracting data from a database. The activities include translation of queries in high-level database languages into expressions
More informationCSE 344 FEBRUARY 14 TH INDEXING
CSE 344 FEBRUARY 14 TH INDEXING EXAM Grades posted to Canvas Exams handed back in section tomorrow Regrades: Friday office hours EXAM Overall, you did well Average: 79 Remember: lowest between midterm/final
More informationQUERY OPTIMIZATION E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 QUERY OPTIMIZATION
E0 261 Jayant Haritsa Computer Science and Automation Indian Institute of Science JAN 2014 Slide 1 Database Engines Main Components Query Processing Transaction Processing Access Methods JAN 2014 Slide
More informationOverview of Query Processing and Optimization
Overview of Query Processing and Optimization Source: Database System Concepts Korth and Silberschatz Lisa Ball, 2010 (spelling error corrections Dec 07, 2011) Purpose of DBMS Optimization Each relational
More informationDATABASE DESIGN I - 1DL300
DATABASE DESIGN I - 1DL300 Fall 2010 An introductory course on database systems http://www.it.uu.se/edu/course/homepage/dbastekn/ht10/ Manivasakan Sabesan Uppsala Database Laboratory Department of Information
More informationCS 222/122C Fall 2017, Final Exam. Sample solutions
CS 222/122C Fall 2017, Final Exam Principles of Data Management Department of Computer Science, UC Irvine Prof. Chen Li (Max. Points: 100 + 15) Sample solutions Question 1: Short questions (15 points)
More informationAdministrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments
Administrivia Midterm on Thursday 10/18 CS 133: Databases Fall 2018 Lec 12 10/16 Prof. Beth Trushkowsky Assignments Lab 3 starts after fall break No problem set out this week Goals for Today Cost-based
More informationOptimization Overview
Lecture 17 Optimization Overview Lecture 17 Lecture 17 Today s Lecture 1. Logical Optimization 2. Physical Optimization 3. Course Summary 2 Lecture 17 Logical vs. Physical Optimization Logical optimization:
More informationQuery processing and optimization
Query processing and optimization These slides are a modified version of the slides of the book Database System Concepts (Chapter 13 and 14), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan.
More informationQuery Processing Strategies and Optimization
Query Processing Strategies and Optimization CPS352: Database Systems Simon Miner Gordon College Last Revised: 10/25/12 Agenda Check-in Design Project Presentations Query Processing Programming Project
More informationTHE EFFECT OF JOIN SELECTIVITIES ON OPTIMAL NESTING ORDER
THE EFFECT OF JOIN SELECTIVITIES ON OPTIMAL NESTING ORDER Akhil Kumar and Michael Stonebraker EECS Department University of California Berkeley, Ca., 94720 Abstract A heuristic query optimizer must choose
More informationDATABASE DESIGN II - 1DL400
DATABASE DESIGN II - 1DL400 Fall 2016 A second course in database systems http://www.it.uu.se/research/group/udbl/kurser/dbii_ht16 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,
More informationProject. Building a Simple Query Optimizer with Performance Evaluation Experiment on Query Rewrite Optimization
Project CIS611 SS Chung Building a Simple Query Optimizer with Performance Evaluation Experiment on Query Rewrite Optimization This project is to simulate a simple Query Optimizer that: 1. Evaluates query
More informationQuery Processing and Advanced Queries. Query Optimization (4)
Query Processing and Advanced Queries Query Optimization (4) Two-Pass Algorithms Based on Hashing R./S If both input relations R and S are too large to be stored in the buffer, hash all the tuples of both
More information