Overview of Query Processing and Optimization


1 Overview of Query Processing and Optimization Source: Database System Concepts Korth and Silberschatz Lisa Ball, 2010 (spelling error corrections Dec 07, 2011)

2 Purpose of DBMS Optimization
Each relational algebra query can generate multiple execution plans that differ in:
- ordering of operations (σ, π, ⋈)
- use of access structures (index files)
Different strategies vary greatly in the # of disk accesses.
Goal: free the user from worrying about writing efficient SQL.

3 Basic Query Processor Operations
1. Parse query (input: SQL query, declarative): checks syntax and table existence.
2. Find equivalent but more efficient expressions (input: relational algebra, procedural).
3. Select execution plan (input: optimized relational algebra expressions): order of operations and index files to use. Goal: minimize the # of disk accesses.
4. Execute query.

4 Example Database
CUSTOMER(cust_name, street, city)
DEPOSIT(branch_name, acct_no, cust_name, balance)
BRANCH(branch_name, assets, branch_city)

5 Order of Operations
Heuristic #1: PERFORM SELECTIONS EARLY
Q1: List the name and assets of all banks with depositors living in Port Chester.
SQL query:
    select B.branch_name, B.assets
    from BRANCH B, DEPOSIT D, CUSTOMER C
    where B.branch_name = D.branch_name
      and D.cust_name = C.cust_name
      and C.city = 'Port Chester'

6 Heuristic #1: Options for relational algebra translation
Option 1 (joins 1st): π_{branch_name,assets}(σ_{city='Port Chester'}(customer * deposit * branch))
Option 2 (select 1st): π_{branch_name,assets}((σ_{city='Port Chester'}(customer)) * deposit * branch)
Both options are equivalent, in that they return the same data, but option 1 first joins all 3 tables, generating huge intermediate results on disk. We only need a fraction of the rows!
KEY HERE: the select applies to only 1 relation (CUSTOMER).
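
The difference between the two options can be tried out on a tiny example. The following is only a sketch: the rows, the `natural_join` helper and the `select` helper are made up for illustration and are not how a real DBMS evaluates a join.

```python
# Toy relations as lists of dicts; all rows are invented illustration data.
customer = [
    {"cust_name": "Ann", "city": "Port Chester"},
    {"cust_name": "Bob", "city": "Rye"},
    {"cust_name": "Cy", "city": "Rye"},
]
deposit = [
    {"branch_name": "Hurst", "acct_no": 1, "cust_name": "Ann", "balance": 500},
    {"branch_name": "Hurst", "acct_no": 2, "cust_name": "Bob", "balance": 2000},
    {"branch_name": "Main", "acct_no": 3, "cust_name": "Cy", "balance": 1500},
    {"branch_name": "Main", "acct_no": 4, "cust_name": "Ann", "balance": 3000},
]


def natural_join(r, s):
    """Naive natural join: pair rows that agree on every shared attribute."""
    out = []
    for a in r:
        for b in s:
            shared = set(a) & set(b)
            if all(a[k] == b[k] for k in shared):
                out.append({**a, **b})
    return out


def select(rows, pred):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if pred(row)]


def lives_in_port_chester(row):
    return row["city"] == "Port Chester"


# Option 1: join first, select afterwards (large intermediate result).
joined = natural_join(customer, deposit)
option1 = select(joined, lives_in_port_chester)

# Option 2: select first, then join (small intermediate result).
pc_customers = select(customer, lives_in_port_chester)
option2 = natural_join(pc_customers, deposit)

print(len(joined), len(pc_customers))  # intermediate sizes: 4 vs. 1
print(option1 == option2)              # True: same final answer
```

Both orders return the same rows; only the size of the intermediate table changes, which is the point of the heuristic.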

7 Order of Operations
Heuristic #1: PERFORM SELECTIONS EARLY
Q2: List the name and assets of all banks with depositors living in Port Chester with a balance > $1000.
SQL query:
    select B.branch_name, B.assets
    from BRANCH B, DEPOSIT D, CUSTOMER C
    where B.branch_name = D.branch_name
      and D.cust_name = C.cust_name
      and C.city = 'Port Chester'
      and D.balance > 1000

8 Heuristic #2: Apply selects to minimum required tables
Option 1 (join all 1st): π_{branch_name,assets}(σ_{city='Port Chester' and balance>1000}(customer * deposit * branch))
Option 2 (apply select to join of needed tables): π_{branch_name,assets}((σ_{city='Port Chester' and balance>1000}(customer * deposit)) * branch)
Option 3 (apply select to join of needed tables, using the rule at the end of this slide): π_{branch_name,assets}((σ_{city='Port Chester'}(σ_{balance>1000}(customer * deposit))) * branch)
Option 4 (apply select to minimal tables needed): π_{branch_name,assets}((σ_{city='Port Chester'}(customer)) * (σ_{balance>1000}(deposit)) * branch)
All options are equivalent, in that they return the same data, but option 4 reduces the size of the intermediate tables by performing the selects 1st.
KEY HERE: σ_{P1 AND P2}(e) = σ_{P1}(σ_{P2}(e))

9 Heuristic #3: Perform Projections Early
Option 1 (project 1st, select 2nd, join last): π_{branch_name,assets}((π_{branch_name}((σ_{city='Port Chester'}(customer)) * deposit)) * branch)
Looking back at Q1, the joins produce a relation with many unnecessary columns.
KEY HERE: reduce the size of intermediate tables by keeping only the columns needed (ones that appear in the result OR are needed in subsequent operations, e.g. joins).

10 Heuristic #4: Order join operations to reduce the size of temporary tables
Query 1: List the name and assets of all banks with depositors living in Port Chester.
Query 1 relational algebra (join of 3 tables): π_{branch_name,assets}((σ_{city='Port Chester'}(customer)) * deposit * branch)
Natural join is associative, so we can choose the order in which tables are joined.
Option 1 (NOT the best choice): join deposit and branch 1st
- Likely to be large: 1 row for each deposit account (it matches 1 row in branch)
- So if deposit has 10,000 rows, the result of this intermediate join will have 10,000 rows
Option 2 (WORST choice): do (σ_{city='Port Chester'}(customer)) * branch 1st
- No common join attribute, so this is a Cartesian product
- So if there are 200 Port Chester customers and 3000 branches, the result of this intermediate join has 200 × 3000 = 600,000 rows
Option 3 (BEST choice): do (σ_{city='Port Chester'}(customer)) * deposit 1st
- If the bank has many branches in different cities, this will be much smaller
- So if there are 200 Port Chester customers, each matching 2 deposit records on average, the result of this intermediate join has 200 × 2 = 400 rows
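
The three intermediate sizes can be recomputed in a few lines. This is a sketch that only uses the figures quoted on this slide; the variable names are made up for illustration.

```python
# Figures quoted on the slide (estimated row counts).
n_deposit = 10_000           # rows in deposit
n_pc_customers = 200         # customers living in Port Chester
n_branches = 3_000           # rows in branch
deposits_per_customer = 2    # average deposit rows per Port Chester customer

intermediate_rows = {
    # deposit * branch: each deposit row matches exactly 1 branch row
    "option 1: deposit * branch": n_deposit,
    # no common attribute, so the join degenerates to a Cartesian product
    "option 2: pc_customers * branch": n_pc_customers * n_branches,
    # each selected customer matches about 2 deposit rows
    "option 3: pc_customers * deposit": n_pc_customers * deposits_per_customer,
}

# Print the candidate first joins from smallest to largest intermediate table.
for name, rows in sorted(intermediate_rows.items(), key=lambda kv: kv[1]):
    print(f"{name}: {rows:,} rows")

best = min(intermediate_rows, key=intermediate_rows.get)
```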

11 Other Rewrite Rules to Aid Efficiency
- σ_p(r1 ∪ r2) = σ_p(r1) ∪ σ_p(r2) (select before taking the union)
- σ_p(r1 - r2) = σ_p(r1) - r2 = σ_p(r1) - σ_p(r2)
  Example: list CS students (r1) with GPA > 3.5 that are not also EE students (r2)
- (r1 ∪ r2) ∪ r3 = r1 ∪ (r2 ∪ r3)
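
The selection rules for union and difference can be spot-checked with Python sets. The student rows below are invented for illustration, and a check on one sample is of course not a proof of the rules.

```python
# Relations as sets of (name, gpa) tuples; the data is made up.
cs_students = {("Ann", 3.9), ("Bob", 3.2), ("Cy", 3.7)}   # r1
ee_students = {("Cy", 3.7), ("Dee", 3.8)}                 # r2


def select(rows, pred):
    """Relational selection over a set of tuples."""
    return {row for row in rows if pred(row)}


def high_gpa(row):
    return row[1] > 3.5


# Selection distributes over union.
union_lhs = select(cs_students | ee_students, high_gpa)
union_rhs = select(cs_students, high_gpa) | select(ee_students, high_gpa)

# Selection can be pushed into one or both sides of a difference.
diff_a = select(cs_students - ee_students, high_gpa)
diff_b = select(cs_students, high_gpa) - ee_students
diff_c = select(cs_students, high_gpa) - select(ee_students, high_gpa)

print(union_lhs == union_rhs)      # True
print(diff_a == diff_b == diff_c)  # True
print(diff_a)                      # {('Ann', 3.9)}
```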

12 Strategies for Complex Queries
There may be a LARGE # of possible expressions that APPEAR to be efficient.
Ways to cope:
- Use heuristics to choose an expression
- Generate all promising expressions and estimate the processing cost for every one (we'll see this next)
In choosing strategies, look at:
- the size of each relation
- the distribution of values within columns (selectivity), e.g.
  - Name is very selective (few duplicates)
  - Gender is not very selective (about ½ of the rows for each value)

13 Estimating Query Processing Cost
To estimate query processing costs, the DB needs to store some stats, e.g.:
- n_r = the # of records in relation r
- s_r = the size in bytes of a record in r
- V(A, r) = the # of distinct values that appear in relation r for attribute A (e.g., V(gender, r) = 2)

14 Estimating Query Processing Cost
So, an upper limit on the size of a binary join is the Cartesian product r × s, in which:
- there are n_r * n_s rows
- each record is s_r + s_s bytes long
V(A, r) lets us estimate the # of rows that satisfy a selection predicate of the form <attribute-name> = <value>. How? See the next slide.

15 Estimating Query Processing Cost
How we estimate the # of rows that satisfy a selection:
- assume a UNIFORM distribution of values
- then σ_{A=a}(r) has approximately n_r / V(A, r) rows
Example: if V(Gender, r) = 2, a table with 1000 rows will have 1000 / 2 = 500 rows for each value.
Note: this may not be very accurate; e.g., large bank branches may have many more depositors than small branches.
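
As a small sketch of this estimate, and of the caveat about skewed data, the function below divides the row count by the number of distinct values. The function name and the skewed sample column are assumptions made for illustration.

```python
from collections import Counter


def estimate_selection_rows(n_r, v_a_r):
    """Estimated rows matching A = a, assuming uniformly distributed values."""
    return n_r / v_a_r


# The slide's example: 1000 rows, 2 distinct Gender values.
print(estimate_selection_rows(1000, 2))  # 500.0

# A skewed column (made-up data): one large branch and one small branch.
branch_column = ["Big"] * 900 + ["Small"] * 100
estimate = estimate_selection_rows(len(branch_column), len(set(branch_column)))
actual = Counter(branch_column)

print(estimate)         # 500.0 for either branch
print(actual["Big"])    # 900: the uniform estimate is far too low
print(actual["Small"])  # 100: the uniform estimate is far too high
```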

16 Estimating Natural Join Size
r1 * r2, where the join is on the key of r1: the result will have n_{r2} rows.
Example (Company DB): Employee * Dependent (join attribute is SSN) will have the same # of rows as Dependent (n_{r2}), because each Dependent joins with exactly 1 row of Employee.

17 Estimating Natural Join Size
r1 * r2, where the join attribute A is not a key of r1 or r2.
Example (Company DB): Employee * Dept_Location (join attribute is DNO)
- each row in r1 will join with about n_{r2} / V(A, r2) rows in r2 (Equation 1)
- so the join will have a total estimate of n_{r1} * n_{r2} / V(A, r2) rows (Equation 2)

18 Estimating Natural Join Size
Example of Employee * Dept_Location:
- Employee (r1) has 50 rows
- Dept_Location (r2) has 15 rows
- V(DNO, Dept_Location) = 5
Then we have:
- an estimate of 15 / 5 = 3 rows from Dept_Location joining with each row of Employee (using eq. 1)
- giving us (eq. 2) 3 * 50 = 150 rows in this join

19 Estimating Natural Join Size
The previous result is generally the same as using Equation 3: n_{r2} * n_{r1} / V(A, r1)
Using the same # of rows as before:
- Employee (r1) has 50 rows
- Dept_Location (r2) has 15 rows
- V(DNO, Employee) = 5
Then we have:
- an estimate of 50 / 5 = 10 rows from Employee joining with each row of Dept_Location (using a variant of eq. 1)
- giving us (eq. 3) 10 * 15 = 150 rows, the same estimate for this join
Note: the 2 estimates could be different if there are dangling tuples (e.g., rows with nulls that won't participate in the join).
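
The two estimates can be written as one small function. This is a sketch: the formula n_r1 * n_r2 / V(A, r) is the one implied by the worked numbers on these slides (3 * 50 and 10 * 15), and the function name is made up.

```python
def estimate_join_rows(n_r1, n_r2, v_a):
    """Estimated size of r1 * r2 on a non-key attribute A.

    v_a is V(A, r2) for the Equation 2 form, or V(A, r1) for the
    Equation 3 form; uniform distribution of values is assumed.
    """
    return n_r1 * n_r2 / v_a


n_employee = 50       # r1
n_dept_location = 15  # r2
v_dno_dept_location = 5
v_dno_employee = 5

# Equation 2: (15 / 5) rows of Dept_Location per Employee row, times 50.
est_eq2 = estimate_join_rows(n_employee, n_dept_location, v_dno_dept_location)

# Equation 3: (50 / 5) rows of Employee per Dept_Location row, times 15.
est_eq3 = estimate_join_rows(n_employee, n_dept_location, v_dno_employee)

print(est_eq2, est_eq3)  # 150.0 150.0
```

The two forms agree here only because both relations have the same number of distinct DNO values.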

20 To keep in mind
All this estimation requires that the DBMS keeps the statistics needed for these calculations.
- Often these stats are only updated during periods of lighter system load
- These stats give us estimates of the # of rows, but what we really care about is the # of disk accesses, which is influenced by:
  - physical organization (we'll be using the worst case in our discussions)
  - access structures (indices) available

21 Access Cost Estimation Using Index Files and Hashing
An access plan for a query has:
- the relational operations to be performed
- the order in which the operations are to be performed
- the access structures to be used
- the order in which the rows are to be accessed
First we'll look at queries involving 1 table.

22 Cost Estimation Example
Query 3 (Q3):
    SELECT acct_no
    FROM deposit
    WHERE branch-name = 'Hurst' and cust-name = 'Fagin' and balance > 1000
We have the following statistics:
- bfr = 20 for deposit (20 records/block)
- V(branch-name, deposit) = 50
- V(cust-name, deposit) = 200
- V(balance, deposit) = 5000
- deposit has 10,000 rows (10,000 / 20 = 500 blocks)
Index structures: see the next slide.

23 Cost Estimation Example
We have the following index files on deposit:
- a clustering, 2-level index on branch-name
- a non-clustering (secondary), 2-level index on cust-name
[Figure: 1st level of index; 2nd (base) level; list of pointers (an extra level of indirection, which we'll ignore); actual file]

24 Cost Estimation: Plan #1
Use the branch-name index (clustering).
Analysis:
- Since V(branch-name, deposit) = 50, we expect 10,000 rows / 50 possible values = 200 rows for Hurst
- The 200 rows are clustered into about 200 / 20 = 10 blocks of the deposit file
- We need to read each of these blocks to check for Fagin and balance > 1000
- Using this index (2 levels) requires 10 blocks + 2 index block reads = 12 block accesses

25 Cost Estimation: Plan #2
Use the cust-name index (non-clustering, secondary, 2 levels).
Analysis:
- Since V(cust-name, deposit) = 200, we expect 10,000 rows / 200 possible values = 50 rows for Fagin
- We need to read each of the blocks holding these rows to check for Hurst and balance > 1000
- The 50 rows are not clustered, so in the worst case each record will be in a separate block, and we must read 50 blocks
- Using this index (2 levels) requires 50 blocks + 2 index block reads = 52 block accesses (vs. 12)
If the branch-name index were non-clustering, Plan #1 would need 200 + 2 = 202 block accesses (worst case).
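
The block counts for Plans #1 and #2 follow one pattern, sketched below. The function name and parameters are made up for illustration; the cost model (uniform distribution, worst case of 1 block per row for a non-clustering index, extra level of indirection ignored) is the one used on these slides.

```python
import math


def index_plan_cost(n_rows, distinct_values, bfr, index_levels, clustering):
    """Estimated block accesses for an equality lookup through an index."""
    matching_rows = n_rows / distinct_values
    if clustering:
        # Matching rows sit next to each other, bfr rows per block.
        data_blocks = math.ceil(matching_rows / bfr)
    else:
        # Worst case: every matching row lives in a different block.
        data_blocks = math.ceil(matching_rows)
    return index_levels + data_blocks


# Plan #1: clustering index on branch-name, V(branch-name, deposit) = 50.
plan1 = index_plan_cost(10_000, 50, bfr=20, index_levels=2, clustering=True)

# Plan #2: non-clustering index on cust-name, V(cust-name, deposit) = 200.
plan2 = index_plan_cost(10_000, 200, bfr=20, index_levels=2, clustering=False)

# Plan #1 again, but pretending the branch-name index were non-clustering.
plan1_unclustered = index_plan_cost(10_000, 50, bfr=20, index_levels=2,
                                    clustering=False)

print(plan1, plan2, plan1_unclustered)  # 12 52 202
```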

26 Cost Estimation: Plan #3
This time, both attributes (cust-name and branch-name) use a non-clustering (secondary) index.
Since we can use index files for non-key attributes, we have index pointers to the blocks of record pointers (the extra level of indirection that we aren't counting here).
Idea: take the intersection of the record pointers obtained from each index.

27 Cost Estimation: Plan #3
Let
- P1 = pointers to records with cust-name Fagin (50)
- P2 = pointers to records with branch-name Hurst (200)
This means P1 ∩ P2 = the set of pointers to records with cust-name = 'Fagin' AND branch-name = 'Hurst'.
To estimate block accesses we assume uniform distribution and attribute independence (Hurst doesn't imply we must have Fagin records, or vice versa).
|P1 ∩ P2| ≤ min(50, 200) = 50, so at most 50 records have both attributes.
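
The pointer-intersection idea can be sketched with Python sets. The record ids and the tiny `records` table below are hypothetical; a real system would hold block addresses rather than dictionary keys.

```python
# Hypothetical record pointers returned by each secondary index.
fagin_ptrs = {3, 17, 42, 88}             # P1: cust-name = 'Fagin'
hurst_ptrs = {5, 17, 23, 42, 61, 90}     # P2: branch-name = 'Hurst'

# Only pointers present in BOTH sets can satisfy both equality predicates.
both = fagin_ptrs & hurst_ptrs

# The intersection can never be larger than the smaller pointer set.
assert len(both) <= min(len(fagin_ptrs), len(hurst_ptrs))

# Made-up deposit records, keyed by record pointer.
records = {
    17: {"acct_no": 101, "balance": 2500},
    42: {"acct_no": 102, "balance": 400},
}

# Fetch only the surviving records, then apply the remaining predicate.
result = [records[p]["acct_no"]
          for p in sorted(both)
          if records[p]["balance"] > 1000]

print(sorted(both))  # [17, 42]
print(result)        # [101]
```

Only the records in the intersection are fetched from the data file, which is why this plan touches so few blocks.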

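28 Cost Estimation: Plan #3
The 50 records are only an upper bound. Assuming uniform distribution and attribute independence over the 10,000 records, we get:
- Probability(Fagin) = 50/10,000 = 1/200
- Probability(Hurst) = 200/10,000 = 1/50
- Probability(Fagin and Hurst) = 1/200 * 1/50 = 1/10,000
- 1/10,000 * 10,000 = 1 record expected
So, we get 2 + 2 + 1 = 5 block accesses (2 index block reads for each index plus 1 data block), vs. 12, 52, 202 in our previous examples.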

29 Cost Estimation: Plan #4
What about using balance in our plan?
- There is no index on the balance attribute
- A non-equality comparison (balance > 1000) is usually less selective than =

30 Single Table Optimization: Final Comments
- For any relational algebra expression, the optimizer may be able to formulate many different access plans
- During access plan selection, the query optimizer chooses the best strategy for a given expression
- Different plans can vary significantly in terms of real cost (# of block accesses)
- What appears to be a more efficient relational algebra expression may in fact have only mediocre plans, depending on file organization and available access structures

31 Cost Estimation: Joins
Factors in choosing the best strategy:
- physical order of rows
- indexes (primary, secondary)
- possible creation of temporary index files
- memory buffer size
  - worst case: 1 block per table
  - best case: 1 or both files can fit in memory
Join example: deposit * customer, with n_deposit = 10,000 and n_customer = 200

32 Cost Estimation: Joins
# of comparisons to be done
How many comparisons do we have to do? (This does not consider blocking factors.)
10,000 * 200 = 2,000,000 comparisons!

33 Cost Estimation: Joins
Algorithm 1: Simple Iteration
deposit = 10,000 rows and customer = 200 rows
(this is just for illustrative purposes, or for when bfr = 1)
    for each row d in deposit
        for each row c in customer
            test(d, c)   // cust-name match
Worst case (1 read for each row):
- for each iteration of the outer loop, the inner loop runs 200 times
- the outer loop runs 10,000 times
- total = (200 * 10,000) + 10,000 = 2,010,000 reads
Best case (inner table or both fit in memory):
- for each iteration of the outer loop, the inner loop runs 200 times, but customer fits in memory, so it is read only 1 time
- the outer loop runs 10,000 times
- total = 200 + 10,000 = 10,200 reads
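
The two totals for simple iteration can be reproduced with a short cost function. This is a sketch of the slide's counting model only (1 read per row); the function name is made up.

```python
def simple_iteration_reads(n_outer, n_inner, inner_in_memory):
    """Row reads for a row-at-a-time nested-loop join."""
    if inner_in_memory:
        # The inner table is read once and kept; each outer row is read once.
        return n_inner + n_outer
    # Each outer row is read once and triggers a full re-read of the inner table.
    return n_outer * n_inner + n_outer


n_deposit = 10_000   # outer table
n_customer = 200     # inner table

worst = simple_iteration_reads(n_deposit, n_customer, inner_in_memory=False)
best = simple_iteration_reads(n_deposit, n_customer, inner_in_memory=True)

print(worst)  # 2010000
print(best)   # 10200
```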

34 Cost Estimation: Joins
Algorithm 2: Block Oriented Iteration
deposit = 10,000 rows and customer = 200 rows
- bfr for deposit = 20, which requires 10,000 / 20 = 500 blocks
- bfr for customer = 20, which requires 200 / 20 = 10 blocks
    for each block Bd in deposit
        for each block Bc in customer
            for each row d in Bd        // in memory
                for each row c in Bc    // in memory
                    test(d, c)
Worst case (1 disk read for each block):
- 500 reads for deposit, 10 * 500 = 5,000 reads for customer
- total: 5,500 block reads
Best case (inner table or both fit in memory):
- 10 reads for customer + 500 reads for deposit
- total: 510 block reads
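
The block-oriented totals follow the same pattern, counted in blocks instead of rows. Again this is only a sketch of the slide's cost model, with a made-up function name.

```python
import math


def block_iteration_reads(n_outer, n_inner, bfr_outer, bfr_inner,
                          inner_in_memory):
    """Block reads for a block-oriented nested-loop join."""
    outer_blocks = math.ceil(n_outer / bfr_outer)
    inner_blocks = math.ceil(n_inner / bfr_inner)
    if inner_in_memory:
        # Every block of both tables is read exactly once.
        return outer_blocks + inner_blocks
    # Each outer block is read once and triggers a full scan of the inner table.
    return outer_blocks + outer_blocks * inner_blocks


worst = block_iteration_reads(10_000, 200, 20, 20, inner_in_memory=False)
best = block_iteration_reads(10_000, 200, 20, 20, inner_in_memory=True)

print(worst)  # 5500
print(best)   # 510
```

In this cost model, making the smaller table the outer one would lower the worst case to 10 + 10 * 500 = 5,010 block reads.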

35 Algorithm 3: Merge-Join
When to use:
- neither table fits in memory
- both tables are sorted by the join attribute
Method:
- associate a pointer with each table
- a group of rows with a given join attribute value will match a consecutive group of rows in the other table
- each block is read one time (same as when the inner or both tables fit into memory)
Example: next slide

36 Merge Join
P1 (starts at 1st block), Customer: DB, EJ, OO, RP
P2 (starts at 1st block), Deposit: DB, DB, DB, EJ, OO, OO, OO, OO, RP, RP
Once we've read the last matching block in P2, we know there will be no more matches.
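
A minimal in-memory sketch of the merge step, using the customer and deposit values from this slide. It works on Python lists rather than disk blocks, and the `merge_join` function is an illustration, not the exact procedure of any particular DBMS.

```python
def merge_join(left, right, key):
    """Join two lists that are both already sorted on the join key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1                      # advance the pointer on the left table
        elif lk > rk:
            j += 1                      # advance the pointer on the right table
        else:
            # Find the consecutive group of equal keys on each side.
            i_end = i
            while i_end < len(left) and key(left[i_end]) == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and key(right[j_end]) == lk:
                j_end += 1
            # Pair every row of one group with every row of the other.
            for a in left[i:i_end]:
                for b in right[j:j_end]:
                    out.append((a, b))
            i, j = i_end, j_end         # both pointers move past the group
    return out


customer = ["DB", "EJ", "OO", "RP"]
deposit = ["DB", "DB", "DB", "EJ", "OO", "OO", "OO", "OO", "RP", "RP"]

pairs = merge_join(customer, deposit, key=lambda name: name)
print(len(pairs))  # 10: every deposit row matches exactly one customer
```

Neither pointer ever moves backwards, which is why each input is scanned only once.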

37 Algorithm Comparison
Summary for Join Algorithms

Algorithm                                                        Number of disk reads
Algorithm 1: Simple iteration (single row reads)                 2,010,000
Algorithm 1: Simple iteration (inner table in memory)            10,200
Algorithm 2: Block oriented iteration (single block reads)       5,500
Algorithm 2: Block oriented iteration (inner table in memory)    510
Algorithm 3: Merge join (single block reads) *                   510
Algorithm 4: Index join **                                       40,000 (note: saves 1,970,000 vs. 1st row)

* Both tables are sorted by join attribute; neither file fits in memory
** Neither table is sorted by join attribute
Remember: the selection among these algorithms depends on file size, memory buffer size, file sort order, and index availability.


SQL QUERY EVALUATION. CS121: Relational Databases Fall 2017 Lecture 12 SQL QUERY EVALUATION CS121: Relational Databases Fall 2017 Lecture 12 Query Evaluation 2 Last time: Began looking at database implementation details How data is stored and accessed by the database Using

More information

CMP-3440 Database Systems

CMP-3440 Database Systems CMP-3440 Database Systems Relational DB Languages Relational Algebra, Calculus, SQL Lecture 05 zain 1 Introduction Relational algebra & relational calculus are formal languages associated with the relational

More information

Chapter 18 Strategies for Query Processing. We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS.

Chapter 18 Strategies for Query Processing. We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS. Chapter 18 Strategies for Query Processing We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS. 1 1. Translating SQL Queries into Relational Algebra and Other Operators - SQL is

More information

CMSC424: Database Design. Instructor: Amol Deshpande

CMSC424: Database Design. Instructor: Amol Deshpande CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu Databases Data Models Conceptual representa1on of the data Data Retrieval How to ask ques1ons of the database How to answer those ques1ons

More information

Plan for today. Query Processing/Optimization. Parsing. A query s trip through the DBMS. Validation. Logical plan

Plan for today. Query Processing/Optimization. Parsing. A query s trip through the DBMS. Validation. Logical plan Plan for today Query Processing/Optimization CPS 216 Advanced Database Systems Overview of query processing Query execution Query plan enumeration Query rewrite heuristics Query rewrite in DB2 2 A query

More information
