Algebraic laws, extensions to relational algebra

Today: Query optimization.
- Algebraic laws; extensions to relational algebra for select-distinct, grouping.
Soon:
- Estimating costs.
- Algorithms for computing joins, other operations.

Query Optimization
Query -> Plan generator -> Query plans -> Cost estimation -> Selected plan

Query Plan
- Choose operations, e.g., σ, ⋈.
- Order operations.
- Detailed strategy of operations, e.g.:
  - Join method.
  - Pipelining: consume the result of one operation by another, to avoid temporary storage on disk.
  - Use of indexes?
  - Sort intermediate results?

Example
MovieStar(name, addr, gender, birthdate)
StarsIn(title, year, starname)
SELECT title, birthdate
FROM MovieStar, StarsIn
WHERE year = 1996 AND gender = 'F' AND starname = name
Plan I (from definition):
π_{title, birthdate}(σ_{year=1996 AND gender='F' AND starname=name}(MovieStar × StarsIn))

Plan II
π_{title, birthdate}(σ_{gender='F'}(MovieStar) ⋈_{starname=name} σ_{year=1996}(StarsIn))
- Join method?
- Can we pipeline the result of one or both selections, and avoid storing the result on disk temporarily?
- Are there indexes on MovieStar.gender and/or StarsIn.year that will make the σ's efficient?
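To make the difference between the two plans concrete, here is a minimal Python sketch that evaluates both on a few invented tuples. The data, the function names `plan_one` and `plan_two`, and the use of "number of tuple pairs examined" as a stand-in for cost are all assumptions for illustration; a real optimizer would use size estimates and statistics instead.

```python
# Toy data; every tuple here is invented for illustration.
movie_star = [  # (name, addr, gender, birthdate)
    ("Alice", "1 Elm St", "F", "1970-01-01"),
    ("Bob", "2 Oak St", "M", "1965-05-05"),
    ("Carol", "3 Pine St", "F", "1980-09-09"),
]
stars_in = [  # (title, year, starname)
    ("Movie A", 1996, "Alice"),
    ("Movie B", 1996, "Bob"),
    ("Movie C", 1995, "Carol"),
    ("Movie D", 1996, "Carol"),
]


def plan_one():
    """Plan I: full product, then one big selection, then projection."""
    product = [(m, s) for m in movie_star for s in stars_in]
    selected = [(m, s) for m, s in product
                if s[1] == 1996 and m[2] == "F" and s[2] == m[0]]
    result = sorted((s[0], m[3]) for m, s in selected)
    return result, len(product)


def plan_two():
    """Plan II: select on each relation first, then join the smaller inputs."""
    females = [m for m in movie_star if m[2] == "F"]
    of_1996 = [s for s in stars_in if s[1] == 1996]
    joined = [(m, s) for m in females for s in of_1996 if s[2] == m[0]]
    result = sorted((s[0], m[3]) for m, s in joined)
    return result, len(females) * len(of_1996)


result_one, pairs_one = plan_one()
result_two, pairs_two = plan_two()
print(result_one == result_two, pairs_one, pairs_two)
```

On this toy data both plans return the same (title, birthdate) pairs, but Plan II compares 6 tuple pairs where Plan I builds all 12.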

Generating Plans
- Start with the query definition. It is a plan, but usually a terrible one.
- Apply algebraic transformations to find other plans. Usually, there is a preferred direction.
- Relational algebra is a good start, but we also need to consider GROUP BY, duplicate elimination, HAVING, ORDER BY.
- Evaluate the cost of each generated plan, using estimates of sizes for intermediate results, possibly using statistics about the stored relations.

Algebraic Transformations
- Laws give equivalent expressions, meaning that whatever relations are substituted for the variables, the results are the same.
- Commutative and associative laws. Example, for natural join:
  R ⋈ S = S ⋈ R
  (R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
- This leads to the join-ordering problem, which is important for complex queries.
- Same idea for ×, ∪, ∩.
- But beware: the associative law does not hold for theta-join. Example: relations R(a, b), S(b, c), T(c, d).
  (R ⋈_{R.b>S.b} S) ⋈_{a<d} T ≠ R ⋈_{R.b>S.b} (S ⋈_{a<d} T)
  The latter doesn't even make sense, because a is not an attribute of S or T.
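The associative law for natural join, and the reason join order still matters, can be sketched in Python. This is an illustrative check on invented relations R(a, b), S(b, c), T(c, d), not a proof; the helpers `natural_join` and `as_bag` are hypothetical names introduced here.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    out = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            if all(t[a] == u[a] for a in shared):
                out.append({**t, **u})
    return out


def as_bag(rel):
    """Order-insensitive view of a relation that still keeps duplicates."""
    return sorted(tuple(sorted(t.items())) for t in rel)


# Invented example data.
R = [{"a": 1, "b": 1}, {"a": 2, "b": 1}, {"a": 3, "b": 2}]
S = [{"b": 1, "c": 10}, {"b": 2, "c": 20}]
T = [{"c": 10, "d": 5}]

rs = natural_join(R, S)          # intermediate result of (R join S) join T
st = natural_join(S, T)          # intermediate result of R join (S join T)
left = natural_join(rs, T)
right = natural_join(R, st)

print(as_bag(left) == as_bag(right), len(rs), len(st))
```

Both orders give the same final result here, but the intermediate relation has 3 tuples in one order and 1 in the other, which is the kind of difference the join-ordering problem is about.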

Laws Involving Selection
- Splitting:
  σ_{C1 AND C2}(R) = σ_{C1}(σ_{C2}(R))
  σ_{C1 OR C2}(R) = σ_{C1}(R) ∪ σ_{C2}(R) (with ∪ as set union)
- "Pushing selections": σ_C(R ⋈ S) = σ_C(R) ⋈ S, as long as condition C makes sense on R.
  - Also possible to move C to S if C makes sense there.
  - We can even move C to both if it makes sense.
  - Same ideas for commuting with ×, ⋈_C.
- Selection and union, intersection, difference: σ_C(R ∪ S) = σ_C(R) ∪ σ_C(S). Similar for ∩, −.
- Selection and product combine to form a join: σ_C(R × S) = R ⋈_C S.
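The splitting and pushing laws can be checked on small examples. The sketch below assumes set semantics (relations are Python sets of tuples) and uses invented relations R(a, b) and S(b, c); `select`, `join`, `c1` and `c2` are hypothetical helpers, and passing on one data set illustrates the laws rather than proving them.

```python
R = {(1, 10), (1, 20), (2, 10), (3, 30)}   # R(a, b), invented tuples
S = {(10, "x"), (20, "y"), (40, "z")}      # S(b, c), invented tuples


def select(pred, rel):
    """Selection: keep the tuples that satisfy pred."""
    return {t for t in rel if pred(t)}


def join(r, s):
    """Natural join of R(a, b) and S(b, c) on b, giving (a, b, c)."""
    return {(a, b, c) for (a, b) in r for (b2, c) in s if b == b2}


def c1(t):
    return t[0] == 1       # condition a = 1 (a is the first component)


def c2(t):
    return t[1] >= 20      # condition b >= 20 (b is the second component)


# Splitting a conjunction into a cascade of selections.
split_and = select(lambda t: c1(t) and c2(t), R) == select(c1, select(c2, R))

# Splitting a disjunction into a set union of selections.
split_or = select(lambda t: c1(t) or c2(t), R) == select(c1, R) | select(c2, R)

# Pushing a selection that only mentions attributes of R below the join.
pushed = select(c1, join(R, S)) == join(select(c1, R), S)

print(split_and, split_or, pushed)
```

All three comparisons hold on this data. The OR law is the one that depends on set semantics: under bag union, a tuple satisfying both conditions would appear twice on the right-hand side.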

Directionality in Selection Pushing
- SKS says always push downward.
- Example: relations R(a, b), S(b, c). Replace σ_{a=1}(R ⋈ S) by σ_{a=1}(R) ⋈ S. Big win, because we probably reduce the size of the first join argument by a lot.
- Trivial counterexample: what if S is empty?
- Serious counterexample on next slide.

Selections Should Go Up Then Down
StarsIn(title, year, starname)
Movie(title, year, studioname)
CREATE VIEW MoviesOf1996 AS
SELECT * FROM Movie WHERE year = 1996
SELECT starname, studioname
FROM MoviesOf1996 NATURAL JOIN StarsIn
Initial query:
π_{starname, studioname}(σ_{year=1996}(Movie) ⋈ StarsIn)

Probably Better: Move σ up to root, then down both paths.
π_{starname, studioname}(σ_{year=1996}(Movie) ⋈ σ_{year=1996}(StarsIn))
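A small Python sketch of why the selection on year can be copied to the StarsIn side: the natural join equates both title and year, so any StarsIn tuple that joins with a 1996 movie must itself have year 1996. The tuples and the helper `join_on_title_year` are invented for illustration.

```python
movie = [  # (title, year, studioname), invented tuples
    ("Movie A", 1996, "Studio X"),
    ("Movie B", 1995, "Studio Y"),
]
stars_in = [  # (title, year, starname), invented tuples
    ("Movie A", 1996, "Alice"),
    ("Movie B", 1995, "Bob"),
    ("Movie A", 1990, "Carol"),
    ("Movie C", 1996, "Dave"),
]


def join_on_title_year(movies, stars):
    """Natural join on (title, year), projected onto (starname, studioname)."""
    return [(star, studio)
            for (m_title, m_year, studio) in movies
            for (s_title, s_year, star) in stars
            if (m_title, m_year) == (s_title, s_year)]


movies_of_1996 = [m for m in movie if m[1] == 1996]

# Initial plan: selection only on the Movie side of the join.
initial = join_on_title_year(movies_of_1996, stars_in)

# Improved plan: the same selection also applied to StarsIn.
stars_of_1996 = [s for s in stars_in if s[1] == 1996]
improved = join_on_title_year(movies_of_1996, stars_of_1996)

print(sorted(initial) == sorted(improved), len(stars_in), len(stars_of_1996))
```

The two plans agree, and the second join argument shrinks from 4 tuples to 2 before the join is computed.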

Pushing Projections
π_X(R ⋈ S) = π_X(π_Y(R) ⋈ π_Z(S)), where Y is those attributes of R that are either:
1. In X, or
2. A join attribute of R and S.
Z is defined similarly.
- Similar rules for commuting with ×, ⋈_C, ∪.
- Problem: Does π commute with ∩? With −?
Selection and Projection
π_X(σ_C(R)) = π_X(σ_C(π_Y(R))) if Y is X union the attributes mentioned in C.
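The role of the join attribute in Y and Z can be seen in a short Python sketch. It assumes bag semantics (projection keeps duplicates) and invented relations R(a, b, c) and S(c, d, e) with X = {a, d}; `project`, `natural_join` and `bag` are hypothetical helpers introduced here.

```python
from collections import Counter


def project(attrs, rel):
    """Bag projection: keep only attrs, and keep duplicates."""
    return [{a: t[a] for a in attrs} for t in rel]


def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    return [{**t, **u} for t in r for u in s
            if all(t[a] == u[a] for a in t.keys() & u.keys())]


def bag(rel):
    """Multiset view of a relation, for comparing results."""
    return Counter(tuple(sorted(t.items())) for t in rel)


# Invented example data.
R = [{"a": 1, "b": "p", "c": 7}, {"a": 1, "b": "q", "c": 7}, {"a": 2, "b": "r", "c": 8}]
S = [{"c": 7, "d": "u", "e": 0}, {"c": 7, "d": "v", "e": 1}, {"c": 9, "d": "w", "e": 2}]
X = ["a", "d"]

unpushed = project(X, natural_join(R, S))

# Y and Z keep the join attribute c in addition to the attributes in X.
pushed = project(X, natural_join(project(["a", "c"], R), project(["c", "d"], S)))

# Dropping c too early turns the join into a product and changes the answer.
naive = project(X, natural_join(project(["a"], R), project(["d"], S)))

print(bag(unpushed) == bag(pushed), bag(unpushed) == bag(naive))
```

On this data the correctly pushed plan matches the original, while the naive version, which projects away the join attribute, does not.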

Should We Push Projections?
- SKS says pushing projections down is good, but they are too optimistic.
- Example:
  SELECT starname FROM StarsIn WHERE year = 1996
  Suppose there is an index on year.
Efficient:
π_{starname}(σ_{year=1996}(StarsIn))

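Wastes Time
π_{starname}(σ_{year=1996}(π_{starname, year}(StarsIn)))
The inner projection has to read all of StarsIn, so an index on year no longer helps the selection.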

Operators Outside Relational Algebra
A real query optimizer must deal with:
- Duplicate elimination, and operators that require bag semantics, e.g., UNION ALL.
- Group-by and HAVING.
Duplicate Elimination
- A step in a query tree that involves the relation as a whole.
- We'll use δ as the duplicate-elimination operator, e.g., δ(R) = R with duplicates eliminated.

Algebraic Laws Involving δ
- δ commutes with σ, ×, ⋈, ⋈_C, ∪, ∩, −.
- Examples: δ(σ_{A=c}(R)) = σ_{A=c}(δ(R)); δ(R ⋈ S) = δ(R) ⋈ δ(S).
- Note that δ goes down both paths of a binary operator.
- Remember that ∪, etc., eliminate duplicates anyway. Thus, we have rules like: R ∪ S = δ(R ∪ S) = δ(R) ∪ δ(S).
- General goal of moving δ around: it is an expensive operation, and sometimes we can eliminate it altogether when it meets, e.g., a (set) union or a group-by (which always produces a set).

δ and π
- Duplicate elimination does not commute with projection.
- Example: R(A, B) = {(1, 2), (1, 3)}. δ(π_A(R)) ≠ π_A(δ(R)).
Bag Versions of ∪, Etc.
- Since SQL allows us to require bag union, etc., we need operators ∪_B, ∩_B, and −_B to denote these operations.
- Question: which of these are valid?
  δ(R ∪_B S) = δ(R) ∪_B δ(S)?
  δ(R ∩_B S) = δ(R) ∩_B δ(S)?
  δ(R −_B S) = δ(R) −_B δ(S)?
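One way to explore the question above is to model bags with Python's `collections.Counter`, whose `+`, `&` and `-` operators behave like bag union, bag intersection and bag difference. The sketch below is only a search for counterexamples on one invented pair of bags: a `False` refutes a law, while a `True` merely fails to refute it. `delta` and `project_a` are hypothetical helper names.

```python
from collections import Counter


def delta(bag):
    """Duplicate elimination: every tuple present ends up with count 1."""
    return Counter(dict.fromkeys(bag, 1))


def project_a(bag):
    """Bag projection of R(A, B) onto A, keeping duplicates."""
    out = Counter()
    for (a, _b), n in bag.items():
        out[(a,)] += n
    return out


# Invented bags: tuple "t" occurs twice in R and once in S.
R = Counter({"t": 2, "u": 1})
S = Counter({"t": 1})

union_ok = delta(R + S) == delta(R) + delta(S)       # bag union
intersect_ok = delta(R & S) == delta(R) & delta(S)   # bag intersection
diff_ok = delta(R - S) == delta(R) - delta(S)        # bag difference

# The projection example from the text: R(A, B) = {(1, 2), (1, 3)}.
rel = Counter({(1, 2): 1, (1, 3): 1})
projection_ok = delta(project_a(rel)) == project_a(delta(rel))

print(union_ok, intersect_ok, diff_ok, projection_ok)
```

On these bags the union and difference candidates fail, as does commuting δ with projection, while the intersection candidate survives this particular test.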

Grouping
- Introduce operator γ for grouping.
- Takes a list of attributes and aggregated attributes, plus possibly a HAVING condition.
Example
StarsIn(title, year, starname)
SELECT title, MIN(year)
FROM StarsIn
GROUP BY title
HAVING COUNT(starName) >= 3
γ_{title, MIN(year) | COUNT(starName) ≥ 3}(StarsIn)
Laws Involving γ
- Not much.
- γ absorbs δ: δ(γ_X(R)) = γ_X(R).
- Some special opportunities, e.g., if the only aggregation is MIN or MAX, then we can introduce a δ to apply to the operand relation. Might allow compacting of computation below.
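The grouping example and the MIN/MAX remark can be illustrated with a minimal Python sketch. The function `gamma_min_year` is a hypothetical, hard-coded stand-in for grouping StarsIn by title with MIN(year) and a HAVING condition on the group's tuple count; the tuples are invented, and starname is assumed never to be NULL so that COUNT(starName) equals the group size.

```python
from collections import defaultdict

stars_in = [  # (title, year, starname), invented tuples with one duplicate
    ("Movie A", 1996, "Alice"),
    ("Movie A", 1996, "Bob"),
    ("Movie A", 1996, "Carol"),
    ("Movie B", 1995, "Alice"),
    ("Movie B", 1995, "Alice"),
    ("Movie C", 1994, "Dave"),
]


def gamma_min_year(rel, min_count=1):
    """Group by title, compute MIN(year), keep groups with >= min_count tuples."""
    groups = defaultdict(list)
    for title, year, _starname in rel:
        groups[title].append(year)
    return sorted((title, min(years))
                  for title, years in groups.items()
                  if len(years) >= min_count)


def delta(rel):
    """Duplicate elimination that keeps first occurrences in order."""
    return list(dict.fromkeys(rel))


# The query from the text: HAVING COUNT(starName) >= 3.
query_result = gamma_min_year(stars_in, min_count=3)

# The grouping result is already a set, so a duplicate elimination above it is a no-op.
absorbs = delta(query_result) == query_result

# With MIN as the only aggregate, deduplicating the operand changes nothing.
min_safe = gamma_min_year(delta(stars_in)) == gamma_min_year(stars_in)

# With a COUNT condition involved, the same rewrite can change the answer.
count_safe = gamma_min_year(delta(stars_in), 2) == gamma_min_year(stars_in, 2)

print(query_result, absorbs, min_safe, count_safe)
```

On this data only Movie A passes the HAVING condition, deduplicating the input leaves the MIN-only result unchanged, and the same rewrite alters the result once a COUNT condition is present, which is why the opportunity is restricted to MIN and MAX.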