Relational Algebra for sets Introduction to relational algebra for bags


Relational Algebra for sets. Introduction to relational algebra for bags. Thursday, September 27, 2012

Terminology for Relational Databases (slide repeated from Lecture 1)
Account (a relation)
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking
Each entry in the table is called a row or a tuple. Sometimes an entry in the table is called a record.
The instance is the current set of rows (or tuples).

Codd's Original Relational Algebra Operators
Eight operators defined for sets: project, select, cross product, join, union, intersection, difference, division.
Plus renaming (to provide names for the relation and attributes of the answer).

Project operator (π) in relational algebra
Operator invented by Codd (not part of set theory).
Account
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking
Consider the query: π Number, Owner Account
The subscript "Number, Owner" is the list of attributes to retain.

Project operator (π) in relational algebra
Always applied to a single relation: a unary operator.
For the query π Number, Owner Account, the query answer is:
  Number  Owner
  101     J. Smith
  102     W. Wei
  103     J. Smith
  104     M. Jones
  105     H. Martin

Project operator (π): another example
Account
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking
Consider the query: π Owner Account
The subscript "Owner" is the list of attributes to retain.

Project operator example (cont.)
Consider the query: π Owner Account
Query answer is:
  Owner
  W. Wei
  J. Smith
  M. Jones
  H. Martin
In relational algebra defined on sets, the query answer is a set. J. Smith appears just once in the query answer.
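The two projection queries above can be sketched in Python. This is a minimal sketch, not part of the original slides: the names ACCOUNT, ATTRS, and project are hypothetical, and a Python set stands in for the set-based query answer so that duplicates disappear automatically.

```python
# Account instance from the slides, as (Number, Owner, Balance, Type) tuples.
ATTRS = ("Number", "Owner", "Balance", "Type")
ACCOUNT = [
    (101, "J. Smith", 1000.00, "checking"),
    (102, "W. Wei", 2000.00, "checking"),
    (103, "J. Smith", 5000.00, "savings"),
    (104, "M. Jones", 1000.00, "checking"),
    (105, "H. Martin", 10000.00, "checking"),
]


def project(rows, attrs, keep):
    """Set-based projection: keep the named columns and drop duplicate tuples."""
    positions = [attrs.index(name) for name in keep]
    return {tuple(row[i] for i in positions) for row in rows}


number_owner = project(ACCOUNT, ATTRS, ["Number", "Owner"])
owners = project(ACCOUNT, ATTRS, ["Owner"])

print(len(number_owner))  # one tuple per account
print(sorted(owners))     # J. Smith is listed once
```

Returning a list instead of a set would keep both J. Smith rows, which is the bag behaviour the lecture title refers to.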

Select operator (σ) in relational algebra
Operator invented by Codd (not part of set theory).
Given the following relation (and instance):
Account
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking
Consider the query: σ Balance < 3000 Account

Select operator example (cont.)
σ Balance < 3000 Account
The select predicate is evaluated for each tuple.
Account
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking

Select operator example (cont.)
For this query: σ Balance < 3000 Account
The query answer is:
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  104     M. Jones   1000.00    checking

Select operator in relational algebra
Always applied to a single relation: a unary operator.
  σ Balance < 3000 Account
The parts of this query:
  σ: the select operator
  Balance < 3000: the predicate
  Account: a relation name (or a relational expression)
The predicate consists of:
  an attribute
  a comparator (<, >, =, ≠, ≥, ≤)
  an attribute or a constant

Examples using the select operator
  σ Balance < 3000 Account
  σ Number = 103 Account
  σ Balance = Number Account                  (attribute compared to attribute!)
  σ Type = 'checking' (σ Balance < 3000 Account)   (applied to a relational expression!)
Account
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking
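The four select examples can be tried out with a short Python sketch. The helper name select and the dictionary representation of tuples are assumptions made for illustration; the predicate is passed as an ordinary Python function and is evaluated once per tuple, as described above.

```python
# Account instance from the slides; each tuple is a dict keyed by attribute name.
ACCOUNT = [
    {"Number": 101, "Owner": "J. Smith", "Balance": 1000.00, "Type": "checking"},
    {"Number": 102, "Owner": "W. Wei", "Balance": 2000.00, "Type": "checking"},
    {"Number": 103, "Owner": "J. Smith", "Balance": 5000.00, "Type": "savings"},
    {"Number": 104, "Owner": "M. Jones", "Balance": 1000.00, "Type": "checking"},
    {"Number": 105, "Owner": "H. Martin", "Balance": 10000.00, "Type": "checking"},
]


def select(rows, predicate):
    """Return the tuples of rows for which the predicate is true."""
    return [t for t in rows if predicate(t)]


# Attribute compared to a constant.
low = select(ACCOUNT, lambda t: t["Balance"] < 3000)
one = select(ACCOUNT, lambda t: t["Number"] == 103)
# Attribute compared to attribute: legal, although no tuple qualifies here.
attr_vs_attr = select(ACCOUNT, lambda t: t["Balance"] == t["Number"])
# Select applied to a relational expression (the answer of another select).
nested = select(low, lambda t: t["Type"] == "checking")

print([t["Number"] for t in low])
print([t["Number"] for t in nested])
```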

Example (Useless) Query with Answer
Account
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking
Query: σ Type = 'checking' AND Type = 'savings' Account
Query answer (only the header; there are no rows):
  Number  Owner      Balance    Type
The query answer is empty. But that's allowed.
But why is this a useless query?

Select and Project can be combined
Account
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking
Queries:
  π Owner (σ Balance < 3000 Account)
  σ Balance < 3000 (π Owner, Balance Account)
  π Owner (σ Balance < 3000 (π Owner, Balance Account))
  σ Balance < 3000 (π Owner Account)     Is this one well-formed?
Note: two queries are equivalent if they are guaranteed to return the same query answer for every possible DB instance.
Which pairs of these queries are equivalent, if any?

Cross Product: an operator from set theory
Suppose
  A = {a, b, c}
  B = {1, 2}
Then in set theory, the cross product is defined as:
  A X B = {(a, 1), (b, 1), (c, 1), (a, 2), (b, 2), (c, 2)}
A X B is a set of pairs (2-tuples), where each pair consists of an element from A and an element from B.

Cross Product in Set Theory
Suppose
  A = {a, b, c}
  B = {1, 2}
  C = {x, y}
Then
  A X B = {(a, 1), (b, 1), (c, 1), (a, 2), (b, 2), (c, 2)}
and
  (A X B) X C = {((a,1),x), ((b,1),x), ((c,1),x), ((a,2),x), ((b,2),x), ((c,2),x),
                 ((a,1),y), ((b,1),y), ((c,1),y), ((a,2),y), ((b,2),y), ((c,2),y)}

Cross Product in Relational Algebra vs. Set Theory
Given
  A = {a, b, c}
  B = {1, 2}
  C = {x, y}
then (A X B) X C, in set theory, is
  {((a,1),x), ((b,1),x), ((c,1),x), ((a,2),x), ((b,2),x), ((c,2),x),
   ((a,1),y), ((b,1),y), ((c,1),y), ((a,2),y), ((b,2),y), ((c,2),y)}
Codd simplified it in relational algebra to
  {(a,1,x), (b,1,x), (c,1,x), (a,2,x), (b,2,x), (c,2,x),
   (a,1,y), (b,1,y), (c,1,y), (a,2,y), (b,2,y), (c,2,y)}
by eliminating the inner parentheses, that is, by flattening the tuples.
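The difference between the nested set-theory result and the flattened relational result can be seen in a small Python sketch. It uses itertools.product from the standard library; the variable names nested, flat, and flattened are chosen here for illustration.

```python
from itertools import product

A = ["a", "b", "c"]
B = [1, 2]
C = ["x", "y"]

# Set theory: (A X B) X C is a set of pairs whose first component is itself a pair.
nested = {(ab, c) for ab, c in product(product(A, B), C)}

# Relational algebra: the same combinations as flat 3-tuples.
flat = set(product(A, B, C))

# Flattening each nested pair yields exactly the relational result.
flattened = {ab + (c,) for ab, c in nested}

print((("a", 1), "x") in nested)
print(("a", 1, "x") in flat)
print(flattened == flat)
```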


Example Database to show how cross product can be used in a query
Imagine that we have these two relations in a university database:
  Teacher (t-num, t-name)
  Course (c-num, c-name)
In reality, the relations would probably be more detailed, with attribute names as follows:
  Teacher (Number, Name, Office, E-mail)
  Course (Number, Name, Description)
  Taught-By (Quarter, Course, Section, Teacher, TimeDays)
  etc.

X: the cross product operator produces every possible combination
Teacher
  t-num  t-name
  101    Smith
  105    Jones
  110    Fong
Course
  c-num  c-name
  586    Intro to DB
  533    Intro to OS
Cross product produces every possible combination of a teacher and a course.
Teacher X Course
  t-num  t-name  c-num  c-name
  101    Smith   586    Intro to DB
  105    Jones   586    Intro to DB
  110    Fong    586    Intro to DB
  101    Smith   533    Intro to OS
  105    Jones   533    Intro to OS
  110    Fong    533    Intro to OS

Cross product followed by select
Account
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking
Deposit
  Account  T-id  Date      Amount
  102      1     10/22/00  500.00
  102      2     10/29/00  200.00
  104      3     10/29/00  1000.00
  105      4     11/02/00  10,000.00
Query: σ Balance > 1000 AND Number = Account (Account X Deposit)
Notice the columns of Account X Deposit: Number, Owner, Balance, Type, Account, T-id, Date, Amount.
The predicate is evaluated for every combination of an Account tuple with a Deposit tuple:
  Account 101 with each of the four deposits: No! Throw it away.
  Account 102 with deposits 1 and 2: Yes! Place in query answer.
  Account 102 with deposits 3 and 4: No! Throw it away.
  Account 103 with each of the four deposits: All combinations fail!
  Account 104 with each of the four deposits: No! Throw it away. Why? Even with deposit 3, where Number = Account holds, the balance of 1000.00 is not greater than 1000.
  Account 105 with deposits 1, 2, and 3: The first three fail.
  Account 105 with deposit 4: Yes! Place in query answer.
Final answer:
  Number  Owner      Balance    Type      Account  T-id  Date      Amount
  102     W. Wei     2000.00    checking  102      1     10/22/00  500.00
  102     W. Wei     2000.00    checking  102      2     10/29/00  200.00
  105     H. Martin  10,000.00  checking  105      4     11/02/00  10,000.00
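The walk-through above, pairing every Account tuple with every Deposit tuple and keeping the pairs that satisfy the predicate, can be reproduced with a short Python sketch. The tuple layout and the names pairs and answer are assumptions made for this illustration.

```python
# Account: (Number, Owner, Balance, Type)
ACCOUNT = [
    (101, "J. Smith", 1000.00, "checking"),
    (102, "W. Wei", 2000.00, "checking"),
    (103, "J. Smith", 5000.00, "savings"),
    (104, "M. Jones", 1000.00, "checking"),
    (105, "H. Martin", 10000.00, "checking"),
]
# Deposit: (Account, T-id, Date, Amount)
DEPOSIT = [
    (102, 1, "10/22/00", 500.00),
    (102, 2, "10/29/00", 200.00),
    (104, 3, "10/29/00", 1000.00),
    (105, 4, "11/02/00", 10000.00),
]

# Cross product: every Account tuple concatenated with every Deposit tuple.
pairs = [a + d for a in ACCOUNT for d in DEPOSIT]

# Select: Balance > 1000 AND Number = Account.
# In a combined tuple, position 0 is Number, 2 is Balance, 4 is Deposit.Account.
answer = [t for t in pairs if t[2] > 1000 and t[0] == t[4]]

print(len(pairs))
for row in answer:
    print(row)
```

Account 104 drops out even though it has a matching deposit, because its balance is exactly 1000.00.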

Join operator (defined using σ and X)
  Account (Number, Owner, Balance, Type)
  Deposit (Account, Transaction-id, Date, Amount)
  Check (Account, Check-number, Date, Amount)
The query
  σ Account.Number = Deposit.Account (Account X Deposit)
is equivalent to
  Account ⋈ Account.Number = Deposit.Account Deposit
Notice: the select condition in the first query is used as the join condition in the second query.

A few details about join
Each simple Boolean predicate in the join condition must compare an attribute from one relation to an attribute in the other relation.
In this query:
  Account A ⋈ A.Number = D.Account AND D.type = 'checking' Deposit D
the predicate D.type = 'checking' isn't a JOIN condition.
If you have a join with NO condition, then it is a cross product by definition.
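The definition of join as a select applied to a cross product can be written down directly. The following Python sketch uses a shortened version of the Account and Deposit instances (fewer rows and columns, chosen here to keep the example small); the helper names cross, select, and theta_join are hypothetical.

```python
from itertools import product

ACCOUNT = [
    {"Number": 102, "Owner": "W. Wei", "Balance": 2000.00},
    {"Number": 103, "Owner": "J. Smith", "Balance": 5000.00},
    {"Number": 104, "Owner": "M. Jones", "Balance": 1000.00},
]
DEPOSIT = [
    {"Account": 102, "T-id": 1, "Amount": 500.00},
    {"Account": 102, "T-id": 2, "Amount": 200.00},
    {"Account": 104, "T-id": 3, "Amount": 1000.00},
]


def cross(r, s):
    """Cross product; assumes the two relations share no attribute names."""
    return [{**a, **b} for a, b in product(r, s)]


def select(rows, cond):
    return [t for t in rows if cond(t)]


def theta_join(r, s, cond=None):
    """Join defined as select over cross product; no condition means cross product."""
    pairs = cross(r, s)
    return pairs if cond is None else select(pairs, cond)


def on_number(t):
    return t["Number"] == t["Account"]


joined = theta_join(ACCOUNT, DEPOSIT, on_number)
same_as_select = joined == select(cross(ACCOUNT, DEPOSIT), on_number)
no_condition = theta_join(ACCOUNT, DEPOSIT)

print([t["T-id"] for t in joined], same_as_select, len(no_condition))
```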

Join with all six comparators
  Student ⋈ advisor = number Faculty
  Student S ⋈ S.age < F.age Faculty F
  Student S ⋈ S.salary ≥ F.salary Faculty F
  etc.
Join is sometimes called theta-join or θ-join, where the θ represents any of the 6 comparators (<, >, =, ≠, ≥, ≤).
(In PostgreSQL the 6 comparators are <, >, =, != or <>, >=, <=.)
The most common join (with equality) is called equi-join.

Exercise
  Class(course, term, room, teacher)
    teacher is a foreign key referencing Faculty.id
  Faculty(id, name, office)
  Student(id, name, major)
  Enrolled(id, course, term, grade)
    id is a foreign key referencing Student.id
    (course, term) together is a foreign key referencing Class.(course, term)

Exercise (cont.)
  Faculty(id, name, office)
  Class(course, term, room, teacher)
  Enrolled(id, course, term, grade)
  Student(id, name, major)
Provide sample data with several faculty, students, and classes in several terms.
Write a relational algebra query that lists faculty id and student id pairs where the student is enrolled in a class that is taught by the faculty member.
Write a relational algebra query that lists faculty ids for faculty who teach at least two classes in the same term.

Equi-join (reminder)
Equi-join: Account ⋈ Number = Account Deposit
When the join is based on equality, we always have two identical attributes (columns) in the answer.
  Number  Owner      Balance    Type      Account  Trans-id  Date      Amount
  102     W. Wei     2000.00    checking  102      1         10/22/00  500.00
  102     W. Wei     2000.00    checking  102      2         10/29/00  200.00
  104     M. Jones   1000.00    checking  104      3         10/29/00  1000.00
  105     H. Martin  10,000.00  checking  105      4         11/2/00   10000.00
If we use natural join, the duplicate column is eliminated.

Natural join
Joins two relations by checking for equality on all pairs of attributes with the same name.
Eliminates duplicate columns from the query answer.
This is risky: if you use natural join in SQL queries, your queries might change if you change your schema.
This is great for textbooks: queries are simpler.

NATURAL JOIN
NATURAL JOIN is like a macro that joins tables with an equality check for all attributes with the same name.
  Course (CNumber, CName, Description)
  Teacher (TNumber, TName, Phone)
  Offering (CNumber, TNumber, Time, Days, Room)
Query (⋈ without a condition denotes natural join here):
  Teacher ⋈ Offering ⋈ Course
This query (with natural join) does just what you want. But it requires the schema to be just right.
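A minimal Python sketch of natural join shows both points: the query works on the schema above, and it silently changes meaning when attribute names change. The sample rows (names, phone numbers, times, rooms) and the function natural_join are made up for illustration and do not come from the slides.

```python
def natural_join(r, s):
    """Join on equality of all same-named attributes; shared columns appear once."""
    out = []
    for a in r:
        for b in s:
            shared = a.keys() & b.keys()
            if all(a[k] == b[k] for k in shared):
                out.append({**a, **b})
    return out


# Hypothetical sample data for the three relations.
TEACHER = [
    {"TNumber": 101, "TName": "Smith", "Phone": "555-0101"},
    {"TNumber": 105, "TName": "Jones", "Phone": "555-0105"},
]
COURSE = [
    {"CNumber": 586, "CName": "Intro to DB", "Description": "databases"},
    {"CNumber": 533, "CName": "Intro to OS", "Description": "operating systems"},
]
OFFERING = [
    {"CNumber": 586, "TNumber": 101, "Time": "10:00", "Days": "TR", "Room": "R1"},
    {"CNumber": 533, "TNumber": 105, "Time": "14:00", "Days": "MW", "Room": "R2"},
]

# Teacher natural-join Offering natural-join Course.
joined = natural_join(natural_join(TEACHER, OFFERING), COURSE)

# Same data, but both name attributes are now called "Name": the second join
# also requires teacher name = course name, so nothing matches.
teacher2 = [{"TNumber": t["TNumber"], "Name": t["TName"], "Phone": t["Phone"]} for t in TEACHER]
course2 = [{"CNumber": c["CNumber"], "Name": c["CName"], "Description": c["Description"]} for c in COURSE]
broken = natural_join(natural_join(teacher2, OFFERING), course2)

print(len(joined), len(broken))
```

The second result is empty only because of the renaming, which is the schema sensitivity the slide warns about.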

A simple relational algebra query with zero operators
Relational algebra query: Student
A relation name, by itself, is a valid relational algebra query. It returns all of the tuples in the relation in the query answer.

Relational Algebra Operators
There are eight operators:
  project
  select
  union, intersection, difference (three operators from set theory)
  cross product
  join
  division
Plus renaming (to provide names for the relation and attributes of the answer).

Union in set theory vs. relational algebra
In set theory, the elements of a set can be of all different types (atomic values as well as tuples of different lengths):
  S = {'a', 7053, (1, 2, 'Smith'), (3, 4, 5, 6, 7, 8, 9)}
In set theory, you can take the union (or intersection or difference) of any two sets:
  A = {1, (3, 4, 'a'), 5.3}
  B = {7, 1, (2, 3)}
  A ∪ B = {1, (3, 4, 'a'), 5.3, 7, (2, 3)}
  A ∩ B = {1}
  A − B = {(3, 4, 'a'), 5.3}
But in relational algebra, relations must have the same shape (be union-compatible) before you can take ∪, ∩, or −.

Union Compatible
Two relations are union-compatible if they have the same number of attributes and the corresponding attributes have the same name and are defined on the same domains.
(This definition is imprecise because domains and datatypes may not be specified precisely; in practice, the domains should be compatible.)
Suppose we have these relations:
  Checking-Account (num, owner, balance)
  Savings-Account (num, owner, balance)
These are union-compatible relations.

Union in Relational Algebra
Consider this query: Checking-account ∪ Savings-account
Checking-account
  num  owner      balance
  101  J. Smith   1000.00
  102  W. Wei     2000.00
  104  M. Jones   1000.00
  105  H. Martin  10,000.00
Savings-account
  num  owner      balance
  103  J. Smith   5000.00
Query answer:
  num  owner      balance
  101  J. Smith   1000.00
  102  W. Wei     2000.00
  104  M. Jones   1000.00
  105  H. Martin  10,000.00
  103  J. Smith   5000.00
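As a rough illustration of union-compatibility, the Python sketch below refuses to take the union of relations whose attribute lists differ. It only compares attribute names; checking domains is left out, which is a simplification made here. The function name union and the variable rejected are hypothetical.

```python
ATTRS = ("num", "owner", "balance")
CHECKING = {
    (101, "J. Smith", 1000.00),
    (102, "W. Wei", 2000.00),
    (104, "M. Jones", 1000.00),
    (105, "H. Martin", 10000.00),
}
SAVINGS = {(103, "J. Smith", 5000.00)}


def union(r_attrs, r, s_attrs, s):
    """Relational union: allowed only when both attribute lists match."""
    if tuple(r_attrs) != tuple(s_attrs):
        raise ValueError("relations are not union-compatible")
    return r | s


all_accounts = union(ATTRS, CHECKING, ATTRS, SAVINGS)

# A relation with a different shape is rejected.
try:
    union(ATTRS, CHECKING, ("owner",), {("J. Smith",)})
    rejected = False
except ValueError:
    rejected = True

print(len(all_accounts), rejected)
```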

Intersection in Relational Algebra (example 1)
Consider this query: Checking-account ∩ Savings-account
Checking-account
  num  owner      balance
  101  J. Smith   1000.00
  102  W. Wei     2000.00
  104  M. Jones   1000.00
  105  H. Martin  10,000.00
Savings-account
  num  owner      balance
  103  J. Smith   5000.00
What's the answer to this query?
It's empty. There are no tuples that are in both relations.

Intersection in Relational Algebra (example 2)
What's the answer to this query?
  (π owner Checking-account) ∩ (π owner Savings-account)
Checking-account
  num  owner      balance
  101  J. Smith   1000.00
  102  W. Wei     2000.00
  104  M. Jones   1000.00
  105  H. Martin  10,000.00
Savings-account
  num  owner      balance
  103  J. Smith   5000.00
Intermediate query answers:
  π owner Checking-account: J. Smith, W. Wei, M. Jones, H. Martin
  π owner Savings-account: J. Smith
Query answer (using the attribute name from Checking-account):
  owner
  J. Smith

Set Difference: Relational Algebra (ex. 1)
Consider this query: Checking-account − Savings-account
Find all the tuples (rows) that are in the Checking-account relation that are not in the Savings-account relation.
Checking-account
  num  owner      balance
  101  J. Smith   1000.00
  102  W. Wei     2000.00
  104  M. Jones   1000.00
  105  H. Martin  10,000.00
Savings-account
  num  owner      balance
  103  J. Smith   5000.00
What is the answer? All of the rows in the Checking-account table.

Set Difference: Relational Algebra (ex. 2)
  (π owner Checking-account) − (π owner Savings-account)
Checking-account
  num  owner      balance
  101  J. Smith   1000.00
  102  W. Wei     2000.00
  104  M. Jones   1000.00
  105  H. Martin  10,000.00
Savings-account
  num  owner      balance
  103  J. Smith   5000.00
Compute the intermediate query answers. Then, what is the final query answer?
  owner
  W. Wei
  M. Jones
  H. Martin
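The four intersection and difference examples can be checked with Python's built-in set operators. This is only a sketch; project_owner is a hypothetical helper standing in for π owner.

```python
# Relations as sets of (num, owner, balance) tuples.
CHECKING = {
    (101, "J. Smith", 1000.00),
    (102, "W. Wei", 2000.00),
    (104, "M. Jones", 1000.00),
    (105, "H. Martin", 10000.00),
}
SAVINGS = {(103, "J. Smith", 5000.00)}


def project_owner(rows):
    """Projection onto owner, as a set of 1-tuples."""
    return {(owner,) for _, owner, _ in rows}


# Example 1 of each operator: whole tuples are compared.
both_tuples = CHECKING & SAVINGS            # intersection
only_checking_tuples = CHECKING - SAVINGS   # difference

# Example 2 of each operator: only the owner column is compared.
both_owners = project_owner(CHECKING) & project_owner(SAVINGS)
only_checking_owners = project_owner(CHECKING) - project_owner(SAVINGS)

print(both_tuples)
print(both_owners)
print(sorted(only_checking_owners))
```

J. Smith's two accounts differ in num and balance, so the whole tuples never match; the owner values do match once the other columns are projected away.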

Another example for set operators
  Graduate-student (id, name, GPA, phone)
  Undergrad-student (id, name, GPA, phone)
These tables are union-compatible; we can issue the following queries:
  1. Graduate-student ∪ Undergrad-student
  2. Graduate-student ∩ Undergrad-student
  3. Undergrad-student − Graduate-student
What do these queries compute, described in English?

Relational Algebra: Divide Operator
Suppose we have this extra table in the Bank database:
Account-types
  Type
  checking
  savings
Suppose we would like to know which customers have at least one account of each type of account. That is, we want to know who has accounts of ALL the types.

We can use the Divide operator in Rel. Alg.
  (π Owner, Type Account) ÷ Account-types
Find account owners who have ALL types of accounts.
Account
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking
Account-types
  Type
  checking
  savings
Query answer:
  Owner
  J. Smith

Divide Operator
For R ÷ S, where R(r1, r2, r3, r4) and S(s1, s2):
Since S has two attributes, there must be two attributes in R (say r3 and r4) that are defined on the same domains, respectively, as s1 and s2. We could say that (r3, r4) is union-compatible with (s1, s2).
The query answer has the remaining attributes (r1, r2).
A tuple (r1, r2) is in the answer if that (r1, r2) value appears in R together with every tuple of S.

How does divide work?
  (π Owner, Type Account) ÷ Account-types
π Owner, Type Account
  Owner      Type
  J. Smith   checking
  W. Wei     checking
  J. Smith   savings
  M. Jones   checking
  H. Martin  checking
Account-types
  Type
  checking
  savings
Can we find an owner where there are enough tuples in this table for that owner so that we can match EVERY tuple in Account-types? List all such owners.
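One way to see how divide works is to compute it twice in Python: once directly from the "appears with every S tuple" description, and once using only projection, cross product, and difference. The second formulation is a standard identity that is not stated on the slides; the function names divide and divide_basic are hypothetical.

```python
# R = projection of Account onto (Owner, Type); S = Account-types.
OWNER_TYPE = {
    ("J. Smith", "checking"),
    ("W. Wei", "checking"),
    ("J. Smith", "savings"),
    ("M. Jones", "checking"),
    ("H. Martin", "checking"),
}
ACCOUNT_TYPES = {"checking", "savings"}


def divide(r, s):
    """Owners that appear in r paired with every value of s."""
    owners = {owner for owner, _ in r}
    return {owner for owner in owners if all((owner, t) in r for t in s)}


def divide_basic(r, s):
    """Same result from basic operators: owners minus owners with a missing pair."""
    owners = {owner for owner, _ in r}
    all_pairs = {(owner, t) for owner in owners for t in s}  # owners X s
    missing = {owner for owner, _ in all_pairs - r}          # owners lacking some type
    return owners - missing


print(divide(OWNER_TYPE, ACCOUNT_TYPES))
print(divide_basic(OWNER_TYPE, ACCOUNT_TYPES))
```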

Write this query in relational algebra
  Customer (Number, Name, Address, CRating, CAmount, CBalance, Salesperson)
  Salesperson (Number, Name, Address, Office)
Find the name of salespersons (if there are any) who are assigned to ALL customers.
  π S.Number, S.Name (((π Salesperson, Number Customer) ÷ (π Number Customer)) ⋈ Salesperson = S.Number Salesperson S)
The operators are evaluated in this order:
  1. π Salesperson, Number Customer
  2. π Number Customer
  3. ÷
  4. ⋈ Salesperson = S.Number
  5. π S.Number, S.Name

Why do we use Relational Algebra?
Because it is mathematically defined, we can prove that two relational algebra expressions are equivalent. For example:
  σ cond1 (σ cond2 R) ≡ σ cond2 (σ cond1 R)
  σ cond1 AND cond2 R ≡ (σ cond1 R) ∩ (σ cond2 R)
  R1 ⋈ cond R2 ≡ σ cond (R1 X R2)

Equivalences for AND, OR, and NOT
  σ cond1 OR cond2 R ≡ (σ cond1 R) ∪ (σ cond2 R)
  σ cond1 AND cond2 R ≡ (σ cond1 R) ∩ (σ cond2 R)
  σ cond1 AND NOT cond2 R ≡ (σ cond1 R) − (σ cond2 R)
The WHERE clause (and the predicate for the σ operator) may contain AND, OR, as well as NOT.
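The equivalences can be spot-checked on the Account instance with Python sets. Agreement on one instance is not a proof, since equivalence means agreement on every possible instance; this sketch only illustrates what each law says. The conditions cond1 and cond2 and the dictionary checks are chosen here for illustration.

```python
# Account as a set of (Number, Owner, Balance, Type) tuples.
R = {
    (101, "J. Smith", 1000.00, "checking"),
    (102, "W. Wei", 2000.00, "checking"),
    (103, "J. Smith", 5000.00, "savings"),
    (104, "M. Jones", 1000.00, "checking"),
    (105, "H. Martin", 10000.00, "checking"),
}


def select(rows, cond):
    return {t for t in rows if cond(t)}


def cond1(t):
    return t[2] > 1000          # Balance > 1000


def cond2(t):
    return t[3] == "checking"   # Type = 'checking'


checks = {
    "cascade": select(select(R, cond2), cond1) == select(select(R, cond1), cond2),
    "and": select(R, lambda t: cond1(t) and cond2(t)) == select(R, cond1) & select(R, cond2),
    "or": select(R, lambda t: cond1(t) or cond2(t)) == select(R, cond1) | select(R, cond2),
    "and_not": select(R, lambda t: cond1(t) and not cond2(t)) == select(R, cond1) - select(R, cond2),
}
and_not_numbers = {t[0] for t in select(R, lambda t: cond1(t) and not cond2(t))}

print(checks)
print(and_not_numbers)
```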

Uses of Relational Algebra Equivalences
To help query writers: they can write queries in several different ways.
To help query optimizers: they can choose among different ways to execute the query.
In both cases we know for sure that the two queries (the original and the replacement) are equivalent: they will produce the same answer on all database instances.

Queries
Account
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking
Notice that a query is expressed against the schema, for example:
  σ Balance > 1000 AND Number = Account (Account X Deposit)
But the query runs, or executes, against the instance (the data), and it may give different answers on different instances.
Example of a query answer (for π Owner Account on this instance):
  Owner
  J. Smith
  W. Wei
  M. Jones
  H. Martin

Comments on Queries
Account
  Number  Owner      Balance    Type
  101     J. Smith   1000.00    checking
  102     W. Wei     2000.00    checking
  103     J. Smith   5000.00    savings
  104     M. Jones   1000.00    checking
  105     H. Martin  10,000.00  checking
Example query answer:
  Owner
  J. Smith
  W. Wei
  M. Jones
  H. Martin
Notice that the answer to a query is always a relation!
It doesn't have a name.
The attribute names are taken from the input tables.
It might or might not have any rows.

Comments on Queries
Because the answer to a relational query is always a table, we can use the answer from one query as input to another query.
This means that we can create arbitrarily complex queries!
A relational query language is closed if it has this property.

Example of Codd's Definition of a Relation
Suppose we have a relation defined as:
  Person(name, salary, num, status)
with domains defined as:
  Name-values = {all possible strings of 30 characters}
  Sal-values = {real numbers between 0 and 100,000}
  Num-values = {integers between 0 and 9999}
  Status-values = {'f', 'p'}
Any instance of the relation is always a subset (⊆) of:
  Name-values X Sal-values X Num-values X Status-values
Note: a domain is a set of simple, atomic values.

Mathematical Definition of a Relational DB (cont.)
Each (instance of a) relation is a subset of the cross product of its domains.
One element of a relation is called a tuple.
A relation is ALWAYS a set by definition.
If you add the element 2 to the set {1, 2, 3, 4}, the resulting set is {1, 2, 3, 4}.
If you add the tuple (101, 'J. Smith', 1000.00, 'checking') to the Account relation, which already contains it, you still have only five tuples.
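Python's built-in set type behaves the same way, which makes for a quick illustration of the set semantics described above. The variable names are chosen for this sketch.

```python
numbers = {1, 2, 3, 4}
numbers.add(2)  # 2 is already an element, so the set is unchanged

account = {
    (101, "J. Smith", 1000.00, "checking"),
    (102, "W. Wei", 2000.00, "checking"),
    (103, "J. Smith", 5000.00, "savings"),
    (104, "M. Jones", 1000.00, "checking"),
    (105, "H. Martin", 10000.00, "checking"),
}
size_before = len(account)
account.add((101, "J. Smith", 1000.00, "checking"))  # duplicate tuple
size_after = len(account)

print(numbers, size_before, size_after)
```

A bag (a Python list, for instance) would grow to six rows after the same insertion.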

Can we define tables (to use in relational algebra)?
If you need to define a relation, you could say something like:
  Let R = {('John', 5, 'male'), ('Sue', 6, 'female')}
or let R be the table
  Name  Age  Gender
  John  5    male
  Sue   6    female
and then use R in expressions like R X Student, or whatever.