CMSC 424 Database design Lecture 18 Query optimization. Mihai Pop

Admin
More midterm solutions.
Projects: do not be late!

Introduction
There are alternative ways of evaluating a given query:
(a) Equivalent expressions.
(b) Different algorithms for each operation.

Introduction (Cont.)
An evaluation plan defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated.

Introduction (Cont.)
The cost difference between evaluation plans for a query can be enormous, e.g. seconds vs. days in some cases.
Steps in cost-based query optimization:
1. Generate logically equivalent expressions using equivalence rules.
2. Annotate the resulting expressions to get alternative query plans.
3. Choose the cheapest plan based on estimated cost.
Estimation of plan cost is based on:
(a) Statistical information about relations, e.g. number of tuples, number of distinct values for an attribute.
(b) Statistics estimation for intermediate results, to compute the cost of complex expressions.
(c) Cost formulae for algorithms, computed using statistics.
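The last step (choosing a plan from statistics-based estimates) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the relation names r1, r2, r3, their statistics, and the `Stats` and `estimate_join_size` helpers are hypothetical, and the size formula is the usual textbook estimate for an equi-join under a uniform-distribution assumption, not something stated on the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Stats:
    """Catalog statistics for one relation. All numbers used below are made up."""
    n_tuples: int                                  # number of tuples
    distinct: dict = field(default_factory=dict)   # attribute -> number of distinct values


def estimate_join_size(r: Stats, s: Stats, attr: str) -> int:
    """Estimate the size of an equi-join of r and s on attr.

    Assumes attribute values are uniformly distributed, so the estimate is
    n_r * n_s divided by the larger of the two distinct-value counts.
    """
    return r.n_tuples * s.n_tuples // max(r.distinct[attr], s.distinct[attr])


# Hypothetical statistics: r1 is small (e.g. the output of a selection).
r1 = Stats(5, {"a": 5})
r2 = Stats(10_000, {"a": 50, "b": 10_000})
r3 = Stats(12_000, {"b": 10_000})

# Estimated size of the first intermediate result under two alternative plans.
plans = {
    "(r2 JOIN r3) first": estimate_join_size(r2, r3, "b"),
    "(r1 JOIN r2) first": estimate_join_size(r1, r2, "a"),
}
cheapest = min(plans, key=plans.get)
print(plans)
print("cheapest:", cheapest)
```

A real optimizer would feed such size estimates into per-algorithm cost formulae rather than compare sizes directly; the sketch only shows how statistics drive the choice.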

Generating Equivalent Expressions

Transformation of Relational Expressions
Two relational algebra expressions are said to be equivalent if the two expressions generate the same set of tuples on every legal database instance. Note: the order of tuples is irrelevant.
In SQL, inputs and outputs are multisets of tuples. Two expressions in the multiset version of the relational algebra are said to be equivalent if the two expressions generate the same multiset of tuples on every legal database instance.
An equivalence rule says that expressions of two forms are equivalent: we can replace an expression of the first form by one of the second form, or vice versa.
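The difference between set and multiset equality can be seen on a small example. This is only a sketch with made-up result tuples; note that comparing results on one instance can refute an equivalence but never prove it, since the definition quantifies over every legal database instance.

```python
from collections import Counter


def same_set(rows_a, rows_b):
    """Set semantics: duplicates and order are ignored."""
    return set(rows_a) == set(rows_b)


def same_multiset(rows_a, rows_b):
    """Multiset semantics: order is ignored, multiplicities must match."""
    return Counter(rows_a) == Counter(rows_b)


# Hypothetical query results (one-attribute tuples).
result_1 = [("Hayes",), ("Jones",), ("Hayes",)]
result_2 = [("Jones",), ("Hayes",), ("Hayes",)]   # same tuples, different order
result_3 = [("Hayes",), ("Jones",)]               # duplicate removed

print(same_multiset(result_1, result_2))  # True: order is irrelevant
print(same_set(result_1, result_3))       # True: equal as sets
print(same_multiset(result_1, result_3))  # False: multiplicities differ
```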

Equivalence Rules
1. Conjunctive selection operations can be deconstructed into a sequence of individual selections.
\[ \sigma_{\theta_1 \wedge \theta_2}(E) = \sigma_{\theta_1}(\sigma_{\theta_2}(E)) \]
2. Selection operations are commutative.
\[ \sigma_{\theta_1}(\sigma_{\theta_2}(E)) = \sigma_{\theta_2}(\sigma_{\theta_1}(E)) \]
3. Only the last (outermost) in a sequence of projection operations is needed; the others can be omitted.
\[ \Pi_{L_1}(\Pi_{L_2}(\ldots(\Pi_{L_n}(E))\ldots)) = \Pi_{L_1}(E) \]
4. Selections can be combined with Cartesian products and theta joins.
(a) \[ \sigma_{\theta}(E_1 \times E_2) = E_1 \bowtie_{\theta} E_2 \]
(b) \[ \sigma_{\theta_1}(E_1 \bowtie_{\theta_2} E_2) = E_1 \bowtie_{\theta_1 \wedge \theta_2} E_2 \]
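Rules 1 to 3 can be checked on a toy instance. The sketch below is illustrative only: the account tuples, the predicates `theta1` and `theta2`, and the helper functions are hypothetical, and agreeing on one instance does not prove a rule.

```python
from collections import Counter

# Hypothetical toy instance of the account relation (list of dict tuples).
account = [
    {"account_number": "A-101", "branch_name": "Downtown", "balance": 500},
    {"account_number": "A-215", "branch_name": "Brighton", "balance": 1500},
    {"account_number": "A-217", "branch_name": "Brighton", "balance": 750},
    {"account_number": "A-305", "branch_name": "Downtown", "balance": 2000},
]


def select(pred, rel):
    """Selection: keep the tuples that satisfy pred."""
    return [t for t in rel if pred(t)]


def project(attrs, rel):
    """Projection with multiset semantics (duplicates are kept)."""
    return [{a: t[a] for a in attrs} for t in rel]


def bag(rel):
    """Order-independent multiset view of a relation, for comparisons."""
    return Counter(tuple(sorted(t.items())) for t in rel)


def theta1(t):
    return t["branch_name"] == "Brighton"


def theta2(t):
    return t["balance"] > 1000


# Rule 1: a conjunctive selection equals a cascade of selections.
rule1_lhs = select(lambda t: theta1(t) and theta2(t), account)
rule1_rhs = select(theta1, select(theta2, account))

# Rule 2: selections commute.
rule2_rhs = select(theta2, select(theta1, account))

# Rule 3: only the outermost projection matters.
rule3_lhs = project(["branch_name"], project(["branch_name", "balance"], account))
rule3_rhs = project(["branch_name"], account)

print(bag(rule1_lhs) == bag(rule1_rhs))
print(bag(rule1_rhs) == bag(rule2_rhs))
print(bag(rule3_lhs) == bag(rule3_rhs))
```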

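Equivalence Rules (Cont.)
5. Theta-join operations (and natural joins) are commutative.
\[ E_1 \bowtie_{\theta} E_2 = E_2 \bowtie_{\theta} E_1 \]
6. (a) Natural join operations are associative:
\[ (E_1 \bowtie E_2) \bowtie E_3 = E_1 \bowtie (E_2 \bowtie E_3) \]
(b) Theta joins are associative in the following manner:
\[ (E_1 \bowtie_{\theta_1} E_2) \bowtie_{\theta_2 \wedge \theta_3} E_3 = E_1 \bowtie_{\theta_1 \wedge \theta_3} (E_2 \bowtie_{\theta_2} E_3) \]
where \(\theta_2\) involves attributes from only \(E_2\) and \(E_3\).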
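Commutativity and associativity of the natural join (rules 5 and 6a) can likewise be checked on a small instance. This is a sketch under stated assumptions: the bank tuples are invented, and `natural_join` is a naive nested-loop implementation written for this example only.

```python
from collections import Counter

# Hypothetical toy instances of the bank relations.
branch = [
    {"branch_name": "Brighton", "branch_city": "Brooklyn"},
    {"branch_name": "Downtown", "branch_city": "Brooklyn"},
    {"branch_name": "Perryridge", "branch_city": "Horseneck"},
]
account = [
    {"account_number": "A-101", "branch_name": "Downtown", "balance": 500},
    {"account_number": "A-102", "branch_name": "Perryridge", "balance": 400},
    {"account_number": "A-217", "branch_name": "Brighton", "balance": 750},
]
depositor = [
    {"customer_name": "Hayes", "account_number": "A-102"},
    {"customer_name": "Johnson", "account_number": "A-101"},
    {"customer_name": "Jones", "account_number": "A-217"},
]


def natural_join(r, s):
    """Nested-loop natural join: pair tuples that agree on all shared attributes."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                out.append({**t, **u})
    return out


def bag(rel):
    """Order-independent multiset view; attribute order inside a tuple is ignored too."""
    return Counter(tuple(sorted(t.items())) for t in rel)


# Rule 5: commutativity.
commutes = bag(natural_join(branch, account)) == bag(natural_join(account, branch))

# Rule 6a: associativity.
left_first = natural_join(natural_join(branch, account), depositor)
right_first = natural_join(branch, natural_join(account, depositor))
associates = bag(left_first) == bag(right_first)

print(commutes, associates)
```

Both groupings produce the same tuples; what differs between them is the size of the intermediate result, which is what join ordering exploits.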

Pictorial Depiction of Equivalence Rules

Transformation rules
Many more... Read Chapter 14!

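Transformation Example: Pushing Selections
Query: Find the names of all customers who have an account at some branch located in Brooklyn.
\[ \Pi_{\mathit{customer\_name}}(\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch} \bowtie (\mathit{account} \bowtie \mathit{depositor}))) \]
Transformation using rule 7a (a selection distributes over a join when its condition involves only attributes of one of the joined expressions):
\[ \Pi_{\mathit{customer\_name}}((\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch})) \bowtie (\mathit{account} \bowtie \mathit{depositor})) \]
Performing the selection as early as possible reduces the size of the relation to be joined.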
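The effect of pushing the selection can be sketched on a toy instance. The tuples and helper functions below are hypothetical; the point is only that both forms return the same customer names while the pushed-down form joins fewer branch tuples.

```python
# Hypothetical toy instances of the bank relations.
branch = [
    {"branch_name": "Brighton", "branch_city": "Brooklyn"},
    {"branch_name": "Downtown", "branch_city": "Brooklyn"},
    {"branch_name": "Perryridge", "branch_city": "Horseneck"},
    {"branch_name": "Redwood", "branch_city": "Palo Alto"},
]
account = [
    {"account_number": "A-101", "branch_name": "Downtown", "balance": 500},
    {"account_number": "A-102", "branch_name": "Perryridge", "balance": 400},
    {"account_number": "A-201", "branch_name": "Brighton", "balance": 900},
    {"account_number": "A-215", "branch_name": "Redwood", "balance": 700},
    {"account_number": "A-217", "branch_name": "Brighton", "balance": 750},
]
depositor = [
    {"customer_name": "Hayes", "account_number": "A-102"},
    {"customer_name": "Johnson", "account_number": "A-101"},
    {"customer_name": "Johnson", "account_number": "A-201"},
    {"customer_name": "Jones", "account_number": "A-217"},
    {"customer_name": "Smith", "account_number": "A-215"},
]


def select(pred, rel):
    return [t for t in rel if pred(t)]


def natural_join(r, s):
    """Nested-loop natural join on all shared attributes."""
    return [{**t, **u} for t in r for u in s
            if all(t[a] == u[a] for a in t.keys() & u.keys())]


def in_brooklyn(t):
    return t["branch_city"] == "Brooklyn"


acc_dep = natural_join(account, depositor)

# Original form: join everything, then select.
joined_all = natural_join(branch, acc_dep)
plan1 = sorted(t["customer_name"] for t in select(in_brooklyn, joined_all))

# After pushing the selection down to branch.
brooklyn_branches = select(in_brooklyn, branch)
joined_small = natural_join(brooklyn_branches, acc_dep)
plan2 = sorted(t["customer_name"] for t in joined_small)

print(plan1 == plan2)
print(len(branch), "branch tuples joined vs", len(brooklyn_branches))
print(len(joined_all), "join result tuples vs", len(joined_small))
```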

Example with Multiple Transformations
Query: Find the names of all customers with an account at a Brooklyn branch whose account balance is over $1000.
\[ \Pi_{\mathit{customer\_name}}(\sigma_{\mathit{branch\_city} = \text{``Brooklyn''} \wedge \mathit{balance} > 1000}(\mathit{branch} \bowtie (\mathit{account} \bowtie \mathit{depositor}))) \]
Transformation using join associativity (rule 6a):
\[ \Pi_{\mathit{customer\_name}}((\sigma_{\mathit{branch\_city} = \text{``Brooklyn''} \wedge \mathit{balance} > 1000}(\mathit{branch} \bowtie \mathit{account})) \bowtie \mathit{depositor}) \]
The second form provides an opportunity to apply the "perform selections early" rule, resulting in the subexpression
\[ \sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch}) \bowtie \sigma_{\mathit{balance} > 1000}(\mathit{account}) \]
Thus a sequence of transformations can be useful.

Multiple Transformations (Cont.)

Transformation Example: Pushing Projections
\[ \Pi_{\mathit{customer\_name}}((\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch}) \bowtie \mathit{account}) \bowtie \mathit{depositor}) \]
When we compute
\[ \sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch}) \bowtie \mathit{account} \]
we obtain a relation whose schema is (branch_name, branch_city, assets, account_number, balance).
Push projections using equivalence rules 8a and 8b; eliminate unneeded attributes from intermediate results to get:
\[ \Pi_{\mathit{customer\_name}}((\Pi_{\mathit{account\_number}}(\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch}) \bowtie \mathit{account})) \bowtie \mathit{depositor}) \]
Performing the projection as early as possible reduces the size of the relation to be joined.

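Join Ordering Example
For all relations \(r_1\), \(r_2\), and \(r_3\),
\[ (r_1 \bowtie r_2) \bowtie r_3 = r_1 \bowtie (r_2 \bowtie r_3) \]
(join associativity). If \(r_2 \bowtie r_3\) is quite large and \(r_1 \bowtie r_2\) is small, we choose \((r_1 \bowtie r_2) \bowtie r_3\) so that we compute and store a smaller temporary relation.

Join Ordering Example (Cont.)
Consider the expression
\[ \Pi_{\mathit{customer\_name}}((\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch})) \bowtie (\mathit{account} \bowtie \mathit{depositor})) \]
We could compute \(\mathit{account} \bowtie \mathit{depositor}\) first and join the result with \(\sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch})\), but \(\mathit{account} \bowtie \mathit{depositor}\) is likely to be a large relation. Only a small fraction of the bank's customers are likely to have accounts in branches located in Brooklyn, so it is better to compute
\[ \sigma_{\mathit{branch\_city} = \text{``Brooklyn''}}(\mathit{branch}) \bowtie \mathit{account} \]
first.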

Enumeration of Equivalent Expressions
Query optimizers use equivalence rules to systematically generate expressions equivalent to the given expression.
All equivalent expressions can be generated as follows:
Repeat
   apply all applicable equivalence rules on every equivalent expression found so far;
   add newly generated expressions to the set of equivalent expressions
Until no new equivalent expressions are generated.
This approach is very expensive in space and time. Two approaches:
(a) Optimized plan generation based on transformation rules.
(b) Special-case approach for queries with only selections, projections and joins.
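The repeat-until loop above can be sketched for join expressions using only join commutativity and associativity (rules 5 and 6a). The tuple encoding of expressions and all function names are assumptions made for this example; the sketch also revisits only newly found expressions in each round, which reaches the same fixpoint as re-applying the rules to every expression found so far.

```python
def rewrites_at_root(e):
    """Expressions obtained by applying one rule at the root of e."""
    if isinstance(e, str):          # a base relation: no rule applies
        return
    _, left, right = e
    yield ("join", right, left)                     # rule 5: commutativity
    if not isinstance(left, str):                   # rule 6a, left to right
        _, a, b = left
        yield ("join", a, ("join", b, right))
    if not isinstance(right, str):                  # rule 6a, right to left
        _, b, c = right
        yield ("join", ("join", left, b), c)


def rewrites(e):
    """Expressions obtained by applying one rule at any node of e."""
    yield from rewrites_at_root(e)
    if not isinstance(e, str):
        _, left, right = e
        for new_left in rewrites(left):
            yield ("join", new_left, right)
        for new_right in rewrites(right):
            yield ("join", left, new_right)


def enumerate_equivalent(e):
    """Apply the rules until no new equivalent expression is generated."""
    found = {e}
    frontier = [e]
    while frontier:
        new = []
        for x in frontier:
            for y in rewrites(x):
                if y not in found:
                    found.add(y)
                    new.append(y)
        frontier = new
    return found


three = ("join", ("join", "r1", "r2"), "r3")
four = ("join", three, "r4")
print(len(enumerate_equivalent(three)))   # 12 join orders for three relations
print(len(enumerate_equivalent(four)))    # 120 join orders for four relations
```

The jump from 12 to 120 expressions when one relation is added illustrates why exhaustive enumeration is expensive in both space and time.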

Implementing Transformation Based Optimization
Space requirements are reduced by sharing common sub-expressions: when E1 is generated from E2 by an equivalence rule, usually only the top levels of the two differ; the subtrees below are the same and can be shared using pointers, e.g. when applying join commutativity to \(E_1 \bowtie E_2\).
The same sub-expression may get generated multiple times: detect duplicate sub-expressions and share one copy.
Time requirements are reduced by not generating all expressions: dynamic programming.
We will study only the special case of dynamic programming for join order optimization.
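One possible way to share sub-expressions and detect duplicates is to intern every node in a table keyed by its operator and the identities of its (already shared) children. The `Node` and `ExprPool` classes below are hypothetical names for this sketch; the slides do not prescribe a particular data structure.

```python
class Node:
    """One node of an expression DAG; args hold references to shared nodes."""
    __slots__ = ("op", "args")

    def __init__(self, op, args):
        self.op = op
        self.args = args


class ExprPool:
    """Stores each distinct (sub)expression once and hands out shared references."""

    def __init__(self):
        self._nodes = {}

    def _intern(self, op, args, key):
        # Create the node only if an identical one has not been built before.
        if key not in self._nodes:
            self._nodes[key] = Node(op, args)
        return self._nodes[key]

    def relation(self, name):
        return self._intern("rel", (name,), ("rel", name))

    def join(self, left, right):
        # Children are already unique, so their identities form the key.
        return self._intern("join", (left, right), ("join", id(left), id(right)))

    def __len__(self):
        return len(self._nodes)


pool = ExprPool()
e1, e2, e3 = (pool.relation(n) for n in ("E1", "E2", "E3"))
inner = pool.join(e1, e2)                    # E1 JOIN E2
original = pool.join(inner, e3)              # (E1 JOIN E2) JOIN E3
commuted = pool.join(e3, inner)              # commutativity applied at the top level
again = pool.join(pool.join(e1, e2), e3)     # the same expression generated a second time

print(commuted.args[1] is original.args[0])  # True: the subtree is shared, not copied
print(again is original)                     # True: the duplicate is detected
print(len(pool))                             # 6 nodes stored in total
```

Because the pool keeps every node alive, using `id()` of the children as part of the key is safe here; a rewrite then costs one new top-level node instead of a full copy of the tree.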