Chapter 2: The Relational Algebra

CSE 303: Database
Lecture 11

RDBMS Architecture: How does a SQL engine work?

1. SQL query: the declarative query (from the user).
2. Relational algebra (RA) plan: translate the query to a relational algebra expression.
3. Optimized RA plan: find a logically equivalent, but more efficient, RA expression.
4. Execution: execute each operator of the optimized plan.

Relational algebra allows us to translate declarative (SQL) queries into precise and optimizable expressions.

Relational Algebra (RA)

Five basic operators:
1. Selection: σ
2. Projection: Π
3. Cartesian product: ×
4. Union: ∪
5. Difference: -

Derived or auxiliary operators:
- Intersection (∩)
- Joins (natural, equi-join, theta join)
- Renaming: ρ

We will look at selection, projection, and Cartesian product first, together with one example of a derived operator (natural join) and a special operator (renaming). The set operations (union and difference) follow after that.

Keep in mind: RA operates on sets!

- RDBMSs use bags (multisets); in the relational algebra formalism, however, we consider sets.
- We also take the named perspective, where every attribute must have a unique name, so attribute order does not matter.

Now on to the basic RA operators.

1. Selection (σ)

Returns all tuples which satisfy a condition.

Notation: σ_c(R)

Examples:
  σ_{Salary > 40000}(Employee)
  σ_{Name = 'Smith'}(Employee)

The condition c can use =, <, ≤, >, ≥, <>.

In SQL:
  SELECT * FROM Students WHERE gpa > 3.5;

Another example:

Employee
  SSN      Name   Salary
  1234545  John   200000
  5423341  Smith  600000
  4352342  Fred   500000

σ_{Salary > 400000}(Employee)
  SSN      Name   Salary
  5423341  Smith  600000
  4352342  Fred   500000

2. Projection (Π)

Eliminates columns, then removes duplicates.

Notation: Π_{A1,…,An}(R)

Example: project social-security numbers and names:
  Π_{SSN, Name}(Employee)
  Output schema: Answer(SSN, Name)

In SQL:
  SELECT DISTINCT sname, gpa FROM Students;
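
The set semantics of selection and projection can be sketched in a few lines of Python. This is only an illustration, not how an RDBMS implements the operators: the helper names `row`, `select`, and `project` are made up for this sketch, each tuple is stored as a frozenset of (attribute, value) pairs so that a relation can be an ordinary Python set, and the Employee instance and the 400000 threshold are hypothetical values chosen to make the duplicate elimination visible.

```python
def row(**attrs):
    """One tuple, stored as a hashable set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def select(relation, condition):
    """sigma: keep the tuples for which condition(tuple_as_dict) is true."""
    return {t for t in relation if condition(dict(t))}


def project(relation, *attributes):
    """Pi: keep only the named attributes; the set drops duplicate tuples."""
    return {frozenset((a, v) for a, v in t if a in attributes) for t in relation}


# Hypothetical instance with repeated names and salaries.
employee = {
    row(ssn=1234545, name="John", salary=200000),
    row(ssn=5423341, name="John", salary=600000),
    row(ssn=4352342, name="John", salary=200000),
}

high_paid = select(employee, lambda t: t["salary"] > 400000)
name_salary = project(employee, "name", "salary")

print(len(high_paid))    # 1: only the 600000 row satisfies the condition
print(len(name_salary))  # 2: the two (John, 200000) rows collapse into one
```

Because the result of `project` is a set, duplicate elimination needs no extra step; under bag semantics (SQL without DISTINCT) the projection would keep all three rows.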

Another example:

Employee
  SSN      Name  Salary
  1234545  John  200000
  5423341  John  600000
  4352342  John  200000

Π_{Name, Salary}(Employee)
  Name  Salary
  John  200000
  John  600000

Note that RA operators are compositional!

  SELECT DISTINCT sname, gpa FROM Students WHERE gpa > 3.5;

How do we represent this query in RA? Two candidates:
  Π_{sname, gpa}(σ_{gpa > 3.5}(Students))
  σ_{gpa > 3.5}(Π_{sname, gpa}(Students))
Are these logically equivalent?

3. Cross product (×)

Pairs each tuple in R1 with each tuple in R2.

Notation: R1 × R2

Example: Employee × Dependents

Rare in practice; mainly used to express joins.

Students(sid, sname, gpa), People(ssn, pname, address)

In SQL:
  SELECT * FROM Students, People;

Another example:

People
  ssn      pname  address
  1234545  John   216 Rosse
  5423341  Bob    217 Rosse

Students
  sid  sname  gpa
  001  John   3.4
  002  Bob    1.3

People × Students
  ssn      pname  address    sid  sname  gpa
  1234545  John   216 Rosse  001  John   3.4
  5423341  Bob    217 Rosse  001  John   3.4
  1234545  John   216 Rosse  002  Bob    1.3
  5423341  Bob    217 Rosse  002  Bob    1.3
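
To make the cross product and the compositionality remark concrete, here is a small Python sketch. The helpers `row`, `select`, `project`, and `cross` are hypothetical names for this illustration, relations are Python sets of frozensets of (attribute, value) pairs, and the threshold 3.0 is an assumption chosen so that one row of the two-row Students instance qualifies.

```python
def row(**attrs):
    """One tuple, stored as a hashable set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def select(relation, condition):
    """sigma: keep the tuples that satisfy the condition."""
    return {t for t in relation if condition(dict(t))}


def project(relation, *attributes):
    """Pi: keep only the named attributes, dropping duplicates."""
    return {frozenset((a, v) for a, v in t if a in attributes) for t in relation}


def cross(r1, r2):
    """Cartesian product; assumes the two schemas share no attribute names."""
    return {a | b for a in r1 for b in r2}


students = {
    row(sid="001", sname="John", gpa=3.4),
    row(sid="002", sname="Bob", gpa=1.3),
}
people = {
    row(ssn=1234545, pname="John", address="216 Rosse"),
    row(ssn=5423341, pname="Bob", address="217 Rosse"),
}

# Every People tuple is paired with every Students tuple: 2 x 2 = 4 tuples.
pairs = cross(people, students)


def good_gpa(t):
    return t["gpa"] > 3.0  # hypothetical threshold


# Operators compose: the output of one is a relation the next can consume.
plan_a = project(select(students, good_gpa), "sname", "gpa")
plan_b = select(project(students, "sname", "gpa"), good_gpa)
print(plan_a == plan_b)  # True
```

The two plans agree here because the projection keeps gpa, the attribute the selection tests. If the projection dropped gpa, the selection could no longer be applied after it.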

Renaming (ρ)

Changes the schema, not the instance. A special operator: neither basic nor derived.

Notation: ρ_{B1,…,Bn}(R)

Note: this is shorthand for the proper form (since names, not order, matter!):
  ρ_{A1→B1,…,An→Bn}(R)

In SQL:
  SELECT sid AS studid, sname AS name, gpa AS gradeptavg FROM Students;

In RA:
  ρ_{studid, name, gradeptavg}(Students)

Another example:

Students
  sid  sname  gpa
  001  John   3.4
  002  Bob    1.3

ρ_{studid, name, gradeptavg}(Students)
  studid  name  gradeptavg
  001     John  3.4
  002     Bob   1.3

Natural join (⋈)

Students(sid, name, gpa), People(ssn, name, address)

In SQL:
  SELECT DISTINCT sid, S.name, gpa, ssn, address
  FROM Students S, People P
  WHERE S.name = P.name;

Another example: Students S ⋈ People P

Tuples matched on name, with both name columns shown:
  sid  S.name  gpa  ssn      P.name  address
  001  John    3.4  1234545  John    216 Rosse
  002  Bob     1.3  5423341  Bob     217 Rosse

Result, with the duplicate name column removed:
  sid  S.name  gpa  ssn      address
  001  John    3.4  1234545  216 Rosse
  002  Bob     1.3  5423341  217 Rosse
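
Renaming and natural join can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: `row`, `rename`, and `natural_join` are hypothetical helper names, tuples are frozensets of (attribute, value) pairs, and the join is a plain nested loop that keeps a pair of tuples when they agree on every shared attribute.

```python
def row(**attrs):
    """One tuple, stored as a hashable set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def rename(relation, mapping):
    """rho: change attribute names (old -> new); the values are untouched."""
    return {frozenset((mapping.get(a, a), v) for a, v in t) for t in relation}


def natural_join(r1, r2):
    """Join on equality of all attributes that the two schemas share."""
    joined = set()
    for a in r1:
        for b in r2:
            da, db = dict(a), dict(b)
            shared = da.keys() & db.keys()
            if all(da[k] == db[k] for k in shared):
                # The shared pairs are identical, so the union keeps one copy.
                joined.add(a | b)
    return joined


students = {
    row(sid="001", name="John", gpa=3.4),
    row(sid="002", name="Bob", gpa=1.3),
}
people = {
    row(ssn=1234545, name="John", address="216 Rosse"),
    row(ssn=5423341, name="Bob", address="217 Rosse"),
}

renamed = rename(students, {"sid": "studid", "gpa": "gradeptavg"})
joined = natural_join(students, people)

print(len(joined))  # 2: John matches John, Bob matches Bob
```

When the two schemas share no attribute, the `all(...)` test is vacuously true and the function degenerates to a cross product, which matches the usual definition of natural join.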

Natural Join

Example: converting an SFW query to RA

Students(sid, sname, gpa), People(ssn, sname, address)

  SELECT DISTINCT gpa, address
  FROM Students S, People P
  WHERE gpa > 3.5 AND S.sname = P.sname;

How do we represent this query in RA? One answer:
  Π_{gpa, address}(σ_{gpa > 3.5}(S ⋈ P))

CSE 303: Database
Lecture 16

Section 2: Advanced Relational Algebra

1. Set Operations in RA
2. Fancier RA
3. Extensions & Limitations

Relational Algebra (RA)

Five basic operators:
1. Selection: σ
2. Projection: Π
3. Cartesian product: ×
4. Union: ∪
5. Difference: -

Derived or auxiliary operators:
- Intersection (∩)
- Joins (natural, equi-join, theta join, semi-join)
- Renaming: ρ

We will now look at union and difference, and also at some of the derived operators.

1. Union (∪) and 2. Difference (-)

R1 ∪ R2
Example: ActiveEmployees ∪ RetiredEmployees

R1 - R2
Example: AllEmployees - RetiredEmployees

What about Intersection (∩)?

It is a derived operator:
  R1 ∩ R2 = R1 - (R1 - R2)

Example: UnionizedEmployees ∩ RetiredEmployees

Fancier RA

Theta join (⋈_θ)

Students(sid, sname, gpa), People(ssn, pname, address)

A theta join is a cross product followed by a selection: R1 ⋈_θ R2 = σ_θ(R1 × R2). Note that natural join is a theta join plus a projection.

In SQL:
  SELECT * FROM Students, People WHERE θ;
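
The identity for intersection and the reading of a theta join as "selection over a cross product" can both be checked on small instances. In this sketch the employee names are invented, `row`, `select`, `cross`, and `theta_join` are hypothetical helpers, and Python's built-in set operators `|`, `-`, and `&` stand in for ∪, -, and ∩.

```python
def row(**attrs):
    """One tuple, stored as a hashable set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def select(relation, condition):
    """sigma: keep the tuples that satisfy the condition."""
    return {t for t in relation if condition(dict(t))}


def cross(r1, r2):
    """Cartesian product; assumes the two schemas share no attribute names."""
    return {a | b for a in r1 for b in r2}


def theta_join(r1, r2, theta):
    """Theta join written literally as sigma_theta(r1 x r2)."""
    return select(cross(r1, r2), theta)


# Hypothetical instances with the same schema (union-compatible).
unionized = {row(name="Ann"), row(name="Bob")}
retired = {row(name="Bob"), row(name="Cy")}

union = unionized | retired
difference = unionized - retired
derived_intersection = unionized - (unionized - retired)
print(derived_intersection == (unionized & retired))  # True

students = {
    row(sid="001", sname="John", gpa=3.4),
    row(sid="002", sname="Bob", gpa=1.3),
}
people = {
    row(ssn=1234545, pname="John", address="216 Rosse"),
    row(ssn=5423341, pname="Bob", address="217 Rosse"),
}

# With an equality condition the theta join is an equi-join.
same_name = theta_join(students, people, lambda t: t["sname"] == t["pname"])
print(len(same_name))  # 2
```

Writing the join as a filtered cross product is the definition, not an efficient algorithm: it builds all |R1| × |R2| pairs before discarding most of them.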

Equi-join

Students(sid, sname, gpa), People(ssn, pname, address)

A theta join whose condition is an equality. Most common join in practice!

In SQL:
  SELECT * FROM Students S, People P WHERE sname = pname;

Optimization: Logical Equivalence of RA Plans

RDBMS Architecture: How does a SQL engine work?

  SQL query → relational algebra (RA) plan → optimized RA plan → execution

We will now get a flavor of how to optimize these plans.

Note: we can visualize the plan as a tree. Bottom-up tree traversal = order of operation execution!

A simple plan
[Figure: plan tree]
- What SQL query does this correspond to?
- Are there any logically equivalent RA expressions?

Pushing down projection
[Figure: plan tree with the projection pushed down]
- Why might we prefer this plan?

Takeaways
- This process is called logical optimization.
- Many equivalent plans are used to search for "good plans".
- Relational algebra is an important abstraction.
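
The idea that a bottom-up traversal of the plan tree gives the order of execution can be sketched with a small recursive evaluator. Everything here is an assumption made for illustration: the plan is encoded as nested tuples, the operator tags "scan", "select", "project", and "join" are invented, and the relations R(A, B) and S(B, C) hold made-up values, since the plan figures themselves are not part of the transcript.

```python
def row(**attrs):
    """One tuple, stored as a hashable set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def evaluate(plan, tables):
    """Evaluate a plan tree bottom-up: children first, then the operator."""
    op = plan[0]
    if op == "scan":  # leaf: read a base relation
        return tables[plan[1]]
    if op == "select":
        _, condition, child = plan
        return {t for t in evaluate(child, tables) if condition(dict(t))}
    if op == "project":
        _, attributes, child = plan
        return {
            frozenset((a, v) for a, v in t if a in attributes)
            for t in evaluate(child, tables)
        }
    if op == "join":  # natural join on the shared attributes
        _, left, right = plan
        left_rows = evaluate(left, tables)
        right_rows = evaluate(right, tables)
        out = set()
        for a in left_rows:
            for b in right_rows:
                da, db = dict(a), dict(b)
                if all(da[k] == db[k] for k in da.keys() & db.keys()):
                    out.add(a | b)
        return out
    raise ValueError(f"unknown operator: {op}")


tables = {
    "R": {row(A=1, B=10), row(A=2, B=20)},
    "S": {row(B=10, C="x"), row(B=20, C="y")},
}

# Pi_{A,C}(sigma_{A<2}(R join S)), written as a tree with the root outermost.
plan = (
    "project",
    ("A", "C"),
    ("select", lambda t: t["A"] < 2, ("join", ("scan", "R"), ("scan", "S"))),
)

result = evaluate(plan, tables)
print(result == {row(A=1, C="x")})  # True
```

The recursion reaches the two scans first, then the join, the selection, and finally the projection at the root, which is exactly the bottom-up order.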

RA commutators

The basic commutators:
- Push projection through (1) selection, (2) join
- Push selection through (3) selection, (4) projection, (5) join
- Also: joins can be re-ordered!

This simple set of tools allows us to greatly improve the execution time of queries by optimizing RA plans!

Optimizing the SFW RA Plan

Translating to RA

R(A,B), S(B,C), T(C,D)

  SELECT R.A, T.D
  FROM R, S, T
  WHERE R.B = S.B AND S.C = T.C AND R.A < 10;

One translation, with the selection σ_{A<10} applied after the joins:
  Π_{A,D}(σ_{A<10}(T ⋈ (R ⋈ S)))

Logical Optimization

Heuristically, we want selections and projections to occur as early as possible in the plan.

Terminology: "pushing down selections" and "pushing down projections".

Intuition: later operators in the plan will have fewer tuples to process.

Optimizing the RA Plan

R(A,B), S(B,C), T(C,D)

  SELECT R.A, T.D
  FROM R, S, T
  WHERE R.B = S.B AND S.C = T.C AND R.A < 10;

1. Push down the selection on A so it occurs earlier. The selection σ_{A<10} only mentions an attribute of R, so it can be applied to R before any join:
   Π_{A,D}(T ⋈ (σ_{A<10}(R) ⋈ S))

2. Push down the projection so it occurs earlier. Once R and S have been joined, only A and C are still needed:
   Π_{A,D}(T ⋈ Π_{A,C}(σ_{A<10}(R) ⋈ S))

We eliminate B earlier! In general, when is an attribute not needed?
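
The effect of pushing the selection and the projection down can be observed on a synthetic instance. The sketch below is illustrative only: the helper names and the data for R(A, B), S(B, C), and T(C, D) are invented, and "cost" is approximated simply by the size of the intermediate result of the first join.

```python
def row(**attrs):
    """One tuple, stored as a hashable set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def select(relation, condition):
    return {t for t in relation if condition(dict(t))}


def project(relation, *attributes):
    return {frozenset((a, v) for a, v in t if a in attributes) for t in relation}


def join(r1, r2):
    """Natural join on the attributes the two schemas share."""
    out = set()
    for a in r1:
        for b in r2:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(a | b)
    return out


# Synthetic data: every R tuple matches one S tuple, every S tuple one T tuple.
R = {row(A=a, B=a % 5) for a in range(20)}
S = {row(B=b, C=100 + b) for b in range(5)}
T = {row(C=100 + b, D=f"d{b}") for b in range(5)}


def a_is_small(t):
    return t["A"] < 10


# Plan 1: join everything, then select, then project.
rs_naive = join(R, S)
naive = project(select(join(T, rs_naive), a_is_small), "A", "D")

# Plan 2: select on R first, and drop B as soon as the R-S join is done.
rs_pushed = project(join(select(R, a_is_small), S), "A", "C")
pushed = project(join(T, rs_pushed), "A", "D")

print(naive == pushed)                # True: the plans are logically equivalent
print(len(rs_naive), len(rs_pushed))  # 20 10
```

On this instance the pushed-down plan carries half as many tuples into the second join, and each of them has two attributes instead of three. B can be dropped after the first join because neither the final output nor the remaining join condition (on C) refers to it.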

Examples (PPLP)

What PC models have a speed of at least 3.00?
  Answer(model) := Π_{model}(σ_{speed >= 3.00}(PC))

Find the model numbers of all color laser printers.
  Π_{model}(σ_{color = T AND type = 'laser'}(Printer))

Which manufacturers make laptops with a hard disk of at least 100 GB?
  Answer(maker) := Π_{maker}(σ_{hd >= 100}(Product ⋈ Laptop))

Find those manufacturers that sell laptops but not PCs.
  Π_{maker}(σ_{type = 'laptop'}(Product)) - Π_{maker}(σ_{type = 'pc'}(Product))

Find those hard disk sizes that occur in two or more PCs.
  Π_{pc1.hd}(σ_{pc1.model <> pc2.model AND pc1.hd = pc2.hd}(ρ_{pc1}(PC) × ρ_{pc2}(PC)))

Find those pairs of PC models that have both the same speed and RAM. A pair should be listed only once.
  Π_{pc1.model, pc2.model}(σ_{pc1.model < pc2.model AND pc1.speed = pc2.speed AND pc1.ram = pc2.ram}(ρ_{pc1}(PC) × ρ_{pc2}(PC)))

Find the manufacturer(s) of the PC or laptop with the highest available speed.
  R1 := Π_{model, speed}(σ_{speed >= 2.0}(PC)) ∪ Π_{model, speed}(σ_{speed >= 2.0}(Laptop))
  Π_{maker, model}(Product ⋈ R1) - Π_{pdpl1.maker, pdpl1.model}(σ_{pdpl1.speed < pdpl2.speed}(ρ_{pdpl1}(Product ⋈ R1) × ρ_{pdpl2}(Product ⋈ R1)))
  The subtracted term lists every machine that is slower than some other machine, so what remains has the highest speed.
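
The self-join pattern used in the hard-disk query (rename two copies of PC, take their product, then select) can be tried out on a toy instance. The PC rows below are invented for the illustration, and `qualify` is a hypothetical helper that plays the role of ρ by prefixing every attribute name with an alias.

```python
def row(**attrs):
    """One tuple, stored as a hashable set of (attribute, value) pairs."""
    return frozenset(attrs.items())


def select(relation, condition):
    return {t for t in relation if condition(dict(t))}


def project(relation, *attributes):
    return {frozenset((a, v) for a, v in t if a in attributes) for t in relation}


def cross(r1, r2):
    """Cartesian product; assumes the two schemas share no attribute names."""
    return {a | b for a in r1 for b in r2}


def qualify(relation, alias):
    """rho_alias: prefix each attribute with 'alias.' so a self-product works."""
    return {frozenset((f"{alias}.{a}", v) for a, v in t) for t in relation}


# Hypothetical PC instance (only the attributes the queries need).
pc = {
    row(model=1001, speed=2.66, hd=250),
    row(model=1002, speed=2.10, hd=250),
    row(model=1003, speed=3.20, hd=320),
}

# PC models with a speed of at least 3.00.
fast_models = project(select(pc, lambda t: t["speed"] >= 3.00), "model")

# Hard disk sizes that occur in two or more PCs.
pairs = cross(qualify(pc, "pc1"), qualify(pc, "pc2"))
matches = select(
    pairs,
    lambda t: t["pc1.model"] != t["pc2.model"] and t["pc1.hd"] == t["pc2.hd"],
)
repeated_hd = project(matches, "pc1.hd")

print(sorted(dict(t)["pc1.hd"] for t in repeated_hd))  # [250]
```

The condition pc1.model <> pc2.model is what prevents a PC from being paired with itself; without it every hard disk size would appear in the answer.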