CS317 File and Database Systems Lecture 3 Relational Calculus and Algebra Part-2 September 10, 2017 Sam Siewert
RDBMS Fundamental Theory
http://dilbert.com/strips/comic/2008-05-07/
Relational Algebra and Calculus Part-2 (Basis for SQL, Transform-oriented)
For Discussion
Read E.F. Codd Paper in More Detail, Now Uploaded on Canvas
Read Connolly-Begg Chapters 4 & 5, then 6 for SQL Practice
Assignment #2 Q&A?
Focus on Relational Data, Algebra and Calculus Problems
Read and Summarize E.F. Codd Paper
Will Tie into SQL More in Assignment #3
I Will Grade Assignment #1 this Week with Goal to Finish and Return Monday
I Will Never Hold More than One Assignment in any Class [Violation of my Protocol!]
Recall, Goals
Relational Algebra or Views: Language with Operations that Work on Relations (Tables) to Define a New Relation (No Change to Base Relations)
Relational Base (Schema): Define Data Model
Relational Calculus: What Should be Retrieved (Predicate Calculus True/False Assertions) rather than How
Today, Focus is to Compare Relational Algebra to Calculus and Understand
Connolly-Begg Chapter 5: Relational Algebra and Calculus
Introduction
Relational algebra and relational calculus are formal languages associated with the relational model.
Informally, relational algebra is a (high-level) procedural language and relational calculus is a non-procedural language.
Formally, however, the two are equivalent to one another.
A language that can be used to produce any relation that can be derived using relational calculus is relationally complete.
Relational Algebra
Relational algebra operations work on one or more relations to define another relation without changing the original relations.
Both operands and results are relations, so output from one operation can become input to another operation.
This allows expressions to be nested, just as in arithmetic. This property is called closure. [Important Concept in CS332, Organization of Programming Languages]
Relational Algebra Operations [Look for these in E.F. Codd]
Intersection can be composed as R − (R − S)
Summary of Relational Algebra (Page 132)
Recall 5 Necessary Operations
Recall 3 that Require Union-compatible Relations
Recall Operations Derived from 5 Fundamental and Extensions
Join Cheat Sheet
http://www.codeproject.com/kb/database/visual_sql_joins/visual_sql_joins_orig.jpg
Examples
Load up DreamHome v1.0 (updated) siewertsdvh1
Browse with Adminer
Create INNER JOIN QUERY
DreamHome Schema
Page 112, Figure 4.3
Chapter 7, Problems 7.21, 7.22: Create table(s) for schema, loaded on Bb with tab-delimited data
Can be Used to Verify Examples in Chapters 5 & 6
Operations on Relations
Necessary [Connolly-Begg]
1. Selection
2. Projection
3. Cartesian Product (note that Join can be composed of Cartesian Product with Selection)
4. Union
5. Set Difference (note that Intersection can be composed as R − (R − S))
Codd's Operations
1. Permutation
2. Restriction
3. Projection
4. Join
5. Composition
No Set Difference
Codd's Redundancy Concerns
1. Strong redundant attributes in a relation
2. Weak redundant attributes in a database
Book Examples: Connolly-Begg Chapter 5 [Summary Table on Page 132]
Selection (or Restriction): σ_{predicate}(R)
Works on a single relation R and defines a relation that contains only those tuples (rows) of R that satisfy the specified condition (predicate).
Example - Selection (or Restriction)
List all staff with a salary greater than 10,000.
σ_{salary > 10000}(Staff)
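The selection above can be sketched in Python as a filter over rows. This is only an illustration: the `select` helper and the three sample rows are hypothetical and are not the actual DreamHome data.

```python
# Hypothetical sample rows; NOT the actual DreamHome Staff table.
staff = [
    {"staffNo": "S1", "lName": "Able", "salary": 30000},
    {"staffNo": "S2", "lName": "Baker", "salary": 9000},
    {"staffNo": "S3", "lName": "Cole", "salary": 12000},
]

def select(relation, predicate):
    """Selection: return a new relation holding only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

# sigma_{salary > 10000}(Staff)
high_paid = select(staff, lambda row: row["salary"] > 10000)
print([row["staffNo"] for row in high_paid])  # prints ['S1', 'S3']
```

Note that `staff` itself is unchanged: the operation defines a new relation, which is what allows operations to be nested.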
Projection: Π_{col1, ..., coln}(R)
Works on a single relation R and defines a relation that contains a vertical subset of R, extracting the values of specified attributes and eliminating duplicates.
Example - Projection
Produce a list of salaries for all staff, showing only staffNo, fName, lName, and salary details.
Π_{staffNo, fName, lName, salary}(Staff)
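A minimal sketch of projection follows, assuming rows are Python dictionaries. The `project` helper and the sample rows are hypothetical. The second call shows the duplicate elimination that the definition requires, which SQL's plain SELECT does not do unless DISTINCT is used.

```python
# Hypothetical sample rows; NOT the actual DreamHome Staff table.
staff = [
    {"staffNo": "S1", "fName": "Ann", "lName": "Able", "position": "Manager", "salary": 30000},
    {"staffNo": "S2", "fName": "Bob", "lName": "Baker", "position": "Assistant", "salary": 9000},
    {"staffNo": "S3", "fName": "Cy", "lName": "Cole", "position": "Assistant", "salary": 9000},
]

def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[a] for a in attributes)
        if values not in seen:  # a relation is a set, so duplicates are eliminated
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result

# Pi_{staffNo, fName, lName, salary}(Staff): three rows, since staffNo is unique
salary_list = project(staff, ["staffNo", "fName", "lName", "salary"])

# Pi_{position}(Staff): the two 'Assistant' rows collapse into one
positions = project(staff, ["position"])
print(positions)
```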
Union: R ∪ S
Union of two relations R and S defines a relation that contains all the tuples of R, or S, or both R and S, duplicate tuples being eliminated.
R and S must be union-compatible.
If R and S have I and J tuples, respectively, union is obtained by concatenating them into one relation with a maximum of (I + J) tuples.
Example - Union
List all cities where there is either a branch office or a property for rent.
Π_{city}(Branch) ∪ Π_{city}(PropertyForRent)
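As a rough sketch, the union example maps directly onto Python sets of tuples. The rows below are made up for illustration. Both projections yield one-attribute relations over city, so they are union-compatible.

```python
# Hypothetical sample rows; NOT the actual DreamHome data.
branch = [
    {"branchNo": "B1", "city": "London"},
    {"branchNo": "B2", "city": "Glasgow"},
    {"branchNo": "B3", "city": "Bristol"},
]
property_for_rent = [
    {"propertyNo": "P1", "city": "London"},
    {"propertyNo": "P2", "city": "Aberdeen"},
    {"propertyNo": "P3", "city": "London"},
]

# Pi_{city}(Branch) and Pi_{city}(PropertyForRent) as sets of 1-tuples
branch_cities = {(row["city"],) for row in branch}
property_cities = {(row["city"],) for row in property_for_rent}

# Union: duplicates such as ('London',) appear only once in the result
either = branch_cities | property_cities
print(sorted(either))
```

Here the result has 4 tuples, below the maximum of I + J = 3 + 2 = 5, because London occurs in both operands.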
Set Difference: R − S
Defines a relation consisting of the tuples that are in relation R, but not in S.
R and S must be union-compatible.
Example - Set Difference
List all cities where there is a branch office but no properties for rent.
Π_{city}(Branch) − Π_{city}(PropertyForRent)
Intersection: R ∩ S
Defines a relation consisting of the set of all tuples that are in both R and S.
R and S must be union-compatible.
Expressed using basic operations: R ∩ S = R − (R − S)
Example - Intersection
List all cities where there is both a branch office and at least one property for rent.
Π_{city}(Branch) ∩ Π_{city}(PropertyForRent)
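The claim that Intersection is not a fundamental operation can be checked with a small sketch using set difference only. The city values are hypothetical.

```python
# Hypothetical values of Pi_{city}(Branch) and Pi_{city}(PropertyForRent).
branch_cities = {"London", "Glasgow", "Bristol"}
property_cities = {"London", "Aberdeen", "Glasgow"}

# Set difference R - S: cities with a branch office but no property for rent
no_property = branch_cities - property_cities

# Intersection built from difference alone: R - (R - S)
both = branch_cities - (branch_cities - property_cities)

# Same answer as Python's built-in intersection operator
assert both == branch_cities & property_cities
print(sorted(no_property), sorted(both))
```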
Cartesian product: R × S
Defines a relation that is the concatenation of every tuple of relation R with every tuple of relation S.
Example - Cartesian product
List the names and comments of all clients who have viewed a property for rent.
(Π_{clientNo, fName, lName}(Client)) × (Π_{clientNo, propertyNo, comment}(Viewing))
Example - Cartesian product and Selection
Use selection operation to extract those tuples where Client.clientNo = Viewing.clientNo.
σ_{Client.clientNo = Viewing.clientNo}((Π_{clientNo, fName, lName}(Client)) × (Π_{clientNo, propertyNo, comment}(Viewing)))
Cartesian product and Selection can be reduced to a single operation called a Join.
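The two-step construction above (Cartesian product, then Selection) can be sketched as follows. The `cartesian_product` helper and the sample rows are hypothetical. Attribute names are qualified with the relation name because clientNo appears in both operands.

```python
from itertools import product

# Hypothetical sample rows; NOT the actual DreamHome data.
client = [
    {"clientNo": "C1", "fName": "Ann", "lName": "Able"},
    {"clientNo": "C2", "fName": "Bob", "lName": "Baker"},
    {"clientNo": "C3", "fName": "Cy", "lName": "Cole"},
]
viewing = [
    {"clientNo": "C1", "propertyNo": "P1", "comment": "too small"},
    {"clientNo": "C2", "propertyNo": "P1", "comment": "no dining room"},
    {"clientNo": "C1", "propertyNo": "P2", "comment": ""},
]

def cartesian_product(r, r_name, s, s_name):
    """Pair every row of r with every row of s, qualifying each attribute name."""
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a, b in product(r, s)
    ]

# Client x Viewing: 3 * 3 = 9 rows, most of them meaningless pairings
pairs = cartesian_product(client, "Client", viewing, "Viewing")

# sigma_{Client.clientNo = Viewing.clientNo}(Client x Viewing)
matched = [t for t in pairs if t["Client.clientNo"] == t["Viewing.clientNo"]]
print(len(pairs), len(matched))  # prints 9 3
```

Materializing all 9 rows before filtering is what makes this formulation expensive, which motivates treating Join as a single operation.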
Join Operations
Join is a derivative of Cartesian product.
Equivalent to performing a Selection, using join predicate as selection formula, over Cartesian product of the two operand relations.
One of the most difficult operations to implement efficiently in an RDBMS and one reason why RDBMSs have intrinsic performance problems.
Join Operations
Various forms of join operation:
Theta join
Equijoin (a particular type of Theta join)
Natural join
Outer join
Semijoin
Theta join (θ-join): R ⋈_F S
Defines a relation that contains tuples satisfying the predicate F from the Cartesian product of R and S.
The predicate F is of the form R.a_i θ S.b_i, where θ may be one of the comparison operators (<, ≤, >, ≥, =, ≠).
Theta join (θ-join)
Can rewrite Theta join using basic Selection and Cartesian product operations: R ⋈_F S = σ_F(R × S)
Degree of a Theta join is sum of degrees of the operand relations R and S.
If predicate F contains only equality (=), the term Equijoin is used.
Example - Equijoin
List the names and comments of all clients who have viewed a property for rent.
(Π_{clientNo, fName, lName}(Client)) ⋈_{Client.clientNo = Viewing.clientNo} (Π_{clientNo, propertyNo, comment}(Viewing))
Here F is the predicate Client.clientNo = Viewing.clientNo, and its comparison operator θ is =.
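A theta join can be sketched as a nested loop that applies the comparison θ to one attribute from each side. The `theta_join` helper, the sample rows, and the second query (clients paired with properties they can afford) are hypothetical illustrations, not examples from the book. Passing `operator.eq` as θ gives an Equijoin.

```python
import operator

def theta_join(r, r_attr, theta, s, s_attr):
    """Return (r_row, s_row) pairs for which r_row[r_attr] theta s_row[s_attr] holds."""
    return [(a, b) for a in r for b in s if theta(a[r_attr], b[s_attr])]

# Hypothetical sample rows; NOT the actual DreamHome data.
client = [
    {"clientNo": "C1", "maxRent": 400},
    {"clientNo": "C2", "maxRent": 700},
]
viewing = [
    {"clientNo": "C1", "propertyNo": "P1"},
    {"clientNo": "C2", "propertyNo": "P1"},
    {"clientNo": "C2", "propertyNo": "P2"},
]
property_for_rent = [
    {"propertyNo": "P1", "rent": 350},
    {"propertyNo": "P2", "rent": 600},
]

# Equijoin: theta is '='
equijoin = theta_join(client, "clientNo", operator.eq, viewing, "clientNo")

# General theta join: theta is '>=' (Client.maxRent >= PropertyForRent.rent)
affordable = theta_join(client, "maxRent", operator.ge, property_for_rent, "rent")

print(len(equijoin), len(affordable))  # prints 3 3
```

Each result pair carries all attributes of both rows, so the degree of the result is the sum of the degrees of the operands.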
Natural join: R ⋈ S
An Equijoin of the two relations R and S over all common attributes x.
One occurrence of each common attribute is eliminated from the result.
Example - Natural join
List the names and comments of all clients who have viewed a property for rent.
(Π_{clientNo, fName, lName}(Client)) ⋈ (Π_{clientNo, propertyNo, comment}(Viewing))
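A minimal natural join sketch, assuming each relation is a non-empty list of dictionaries that all share the same keys. The `natural_join` helper and the sample rows are hypothetical. Merging the two dictionaries keeps a single copy of each common attribute.

```python
def natural_join(r, s):
    """Equijoin over all common attributes, keeping one copy of each."""
    if not r or not s:
        return []
    common = [k for k in r[0] if k in s[0]]  # attributes shared by both relations
    return [
        {**a, **b}  # the merge keeps only one occurrence of each common attribute
        for a in r
        for b in s
        if all(a[k] == b[k] for k in common)
    ]

# Hypothetical sample rows; NOT the actual DreamHome data.
client = [
    {"clientNo": "C1", "fName": "Ann", "lName": "Able"},
    {"clientNo": "C2", "fName": "Bob", "lName": "Baker"},
]
viewing = [
    {"clientNo": "C1", "propertyNo": "P1", "comment": "too small"},
    {"clientNo": "C1", "propertyNo": "P2", "comment": ""},
]

result = natural_join(client, viewing)
print(result)
```

The result has degree 5 (3 + 3 − 1), one less than the Equijoin of the same operands, because clientNo appears only once.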
(Left) Outer join: R ⟕ S
To display rows in the result that do not have matching values in the join column, use Outer join.
(Left) outer join is join in which tuples from R that do not have matching values in common columns of S are also included in result relation.
Example - Left Outer join
Produce a status report on property viewings.
(Π_{propertyNo, street, city}(PropertyForRent)) ⟕ Viewing
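One way to sketch the left outer join is to pad unmatched rows of R with nulls (`None` in Python) for the attributes that would have come from S. The `left_outer_join` helper and the sample rows are hypothetical, and the helper assumes both relations are non-empty.

```python
def left_outer_join(r, s):
    """Natural join that also keeps unmatched rows of r, padded with None."""
    common = [k for k in r[0] if k in s[0]]
    s_only = [k for k in s[0] if k not in common]
    result = []
    for a in r:
        matches = [b for b in s if all(a[k] == b[k] for k in common)]
        if matches:
            result.extend({**a, **b} for b in matches)
        else:
            # No matching row in s: keep the r row and fill s's attributes with nulls
            result.append({**a, **{k: None for k in s_only}})
    return result

# Hypothetical sample rows; NOT the actual DreamHome data.
property_for_rent = [
    {"propertyNo": "P1", "street": "1 Main St", "city": "Glasgow"},
    {"propertyNo": "P2", "street": "2 High St", "city": "London"},
]
viewing = [
    {"clientNo": "C1", "propertyNo": "P1", "comment": "too small"},
]

report = left_outer_join(property_for_rent, viewing)
for row in report:
    print(row)
```

Property P2 has never been viewed, yet it still appears in the status report with null clientNo and comment values.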
Semijoin: R ⋉_F S
Defines a relation that contains the tuples of R that participate in the join of R with S.
Can rewrite Semijoin using Projection and Join: R ⋉_F S = Π_A(R ⋈_F S), where A is the set of all attributes of R.
Example - Semijoin
List complete details of all staff who work at the branch in Glasgow.
Staff ⋉_{Staff.branchNo = Branch.branchNo} (σ_{city = 'Glasgow'}(Branch))
Here F is the predicate Staff.branchNo = Branch.branchNo, and its comparison operator θ is =.
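A semijoin can be sketched as a filter on R that only asks whether some matching row exists in S. The `semijoin` helper and the sample rows are hypothetical.

```python
def semijoin(r, s, predicate):
    """Rows of r that match at least one row of s; only r's attributes are kept."""
    return [a for a in r if any(predicate(a, b) for b in s)]

# Hypothetical sample rows; NOT the actual DreamHome data.
staff = [
    {"staffNo": "S1", "lName": "Able", "branchNo": "B1"},
    {"staffNo": "S2", "lName": "Baker", "branchNo": "B2"},
    {"staffNo": "S3", "lName": "Cole", "branchNo": "B2"},
]
branch = [
    {"branchNo": "B1", "city": "London"},
    {"branchNo": "B2", "city": "Glasgow"},
]

# sigma_{city = 'Glasgow'}(Branch)
glasgow_branches = [b for b in branch if b["city"] == "Glasgow"]

# Staff semijoin (Glasgow branches) on Staff.branchNo = Branch.branchNo
glasgow_staff = semijoin(staff, glasgow_branches,
                         lambda s, b: s["branchNo"] == b["branchNo"])
print([row["staffNo"] for row in glasgow_staff])  # prints ['S2', 'S3']
```

No Branch attributes appear in the result, which matches the rewrite as a projection of the join onto the attributes of R.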
Division: R ÷ S
Defines a relation over the attributes C that consists of the set of tuples from R that match the combination of every tuple in S, where C is the set of attributes of R that are not attributes of S.
Expressed using basic operations:
T1 ← Π_C(R)
T2 ← Π_C((S × T1) − R)
T ← T1 − T2
Example - Division
Identify all clients who have viewed all properties with three rooms.
(Π_{clientNo, propertyNo}(Viewing)) ÷ (Π_{propertyNo}(σ_{rooms = 3}(PropertyForRent)))
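The three-step rewrite of Division (T1, T2, T) can be traced with Python sets. The data is hypothetical: `viewing` stands for Π_{clientNo, propertyNo}(Viewing) and `three_room` for the propertyNo values of the three-room properties.

```python
# Hypothetical data; NOT the actual DreamHome tables.
viewing = {("C1", "P1"), ("C1", "P2"), ("C2", "P1"), ("C3", "P2")}  # R(clientNo, propertyNo)
three_room = {"P1", "P2"}                                           # S(propertyNo)

# T1 <- Pi_C(R): every client who has viewed anything (C = {clientNo})
t1 = {client_no for client_no, _ in viewing}

# (S x T1) - R: the (client, property) pairings that are missing from R
missing = {(client_no, property_no)
           for property_no in three_room
           for client_no in t1} - viewing

# T2 <- Pi_C(...): clients who have NOT viewed at least one three-room property
t2 = {client_no for client_no, _ in missing}

# T <- T1 - T2: clients who have viewed all three-room properties
result = t1 - t2
print(result)  # prints {'C1'}
```

Division answers a "for all" question by removing the disqualified clients (T2) from the candidates (T1).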
Aggregate Operations: ℑ_{AL}(R)
Applies aggregate function list, AL, to R to define a relation over the aggregate list.
AL contains one or more (<aggregate_function>, <attribute>) pairs.
Main aggregate functions are: COUNT, SUM, AVG, MIN, and MAX.
Example - Aggregate Operations
How many properties cost more than 350 per month to rent?
ρ_R(myCount) ℑ_{COUNT propertyNo}(σ_{rent > 350}(PropertyForRent))
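A short sketch of the aggregate example, with hypothetical rent values. The result is itself a relation (one row, one attribute named myCount), not a bare number, which keeps the closure property intact.

```python
# Hypothetical sample rows; NOT the actual DreamHome data.
property_for_rent = [
    {"propertyNo": "P1", "rent": 300},
    {"propertyNo": "P2", "rent": 400},
    {"propertyNo": "P3", "rent": 650},
    {"propertyNo": "P4", "rent": 350},
]

# sigma_{rent > 350}(PropertyForRent)
expensive = [p for p in property_for_rent if p["rent"] > 350]

# COUNT propertyNo, renamed to myCount: a one-row, one-column relation
result = [{"myCount": len(expensive)}]
print(result)  # prints [{'myCount': 2}]
```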
Grouping Operation: _{GA}ℑ_{AL}(R)
Groups tuples of R by grouping attributes, GA, and then applies aggregate function list, AL, to define a new relation.
AL contains one or more (<aggregate_function>, <attribute>) pairs.
Resulting relation contains the grouping attributes, GA, along with results of each of the aggregate functions.
Example - Grouping Operation
Find the number of staff working in each branch and the sum of their salaries.
ρ_R(branchNo, myCount, mySum) _{branchNo}ℑ_{COUNT staffNo, SUM salary}(Staff)
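The grouping example can be sketched with a dictionary keyed by the grouping attribute. The `group_count_sum` helper and the sample rows are hypothetical, and the helper supports only COUNT and SUM over one grouping attribute.

```python
def group_count_sum(relation, group_attr, sum_attr):
    """Group rows by group_attr, then compute COUNT and SUM(sum_attr) per group."""
    groups = {}
    for row in relation:
        count, total = groups.get(row[group_attr], (0, 0))
        groups[row[group_attr]] = (count + 1, total + row[sum_attr])
    # One result row per group: the grouping attribute plus each aggregate
    return [
        {group_attr: key, "myCount": count, "mySum": total}
        for key, (count, total) in groups.items()
    ]

# Hypothetical sample rows; NOT the actual DreamHome data.
staff = [
    {"staffNo": "S1", "branchNo": "B1", "salary": 30000},
    {"staffNo": "S2", "branchNo": "B2", "salary": 9000},
    {"staffNo": "S3", "branchNo": "B2", "salary": 12000},
]

result = group_count_sum(staff, "branchNo", "salary")
for row in result:
    print(row)
```

This corresponds to GROUP BY with COUNT and SUM in SQL.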
Summary: Value of Theory
Relationally Complete
Necessary Expressive Power for DBMS
Avoids Redundancy in Relational Base (Except for Foreign Keys)
Results in Minimal Physical Resource Requirements
Results in Maximal Views Possible for Applications Built on Shared Data Model
Provides Common Standard Core SQL
Extensions
DMLs Most Often Add Operators for Calculations, Summary, Ordering (Sort)
See Full Set of Relational Algebra Operators (Page 132)
Algebra Has More than Necessary
Relational Calculus Has Just What's Necessary
DMLs Have Much More than Necessary for Convenience (SQL:2011), with Trend to Add Procedural Features (PL/SQL)
Relational Calculus - Complete
Existential Quantifier ∃: There Exists an X such that
Universal Quantifier ∀: For all X
Same Operators as Boolean Algebra, but Applied to Predicates rather than Binary Values
A Predicate P Defines the Set of all X such that P is True for X: {X | P(X)}
Conjunction ∧ is AND
Disjunction ∨ is OR
Negation ~ is NOT
In Tuple Expressions, Free Variables are Left of the |
Note De Morgan's Laws (Page 134)
Tuple Relational Calculus - Example
To find details of all staff earning more than 10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Tuple Relational Calculus
Existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
Means "There exists a Branch tuple with the same branchNo as the branchNo of the current Staff tuple, S, and is located in London."
Tuple Relational Calculus
Universal quantifier is used in statements about every instance, such as:
(∀B)(B.city ≠ 'Paris')
Means "For all Branch tuples, the address is not in Paris."
Can also use ~(∃B)(B.city = 'Paris'), which means "There are no branches with an address in Paris."
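The equivalence of the two forms above can be sketched with Python's `all` and `any`, which play the roles of ∀ and ∃ over a finite relation. The function names and sample rows are hypothetical.

```python
def no_paris_forall(branch):
    """(forall B)(B.city != 'Paris')"""
    return all(b["city"] != "Paris" for b in branch)

def no_paris_not_exists(branch):
    """~(exists B)(B.city = 'Paris')"""
    return not any(b["city"] == "Paris" for b in branch)

# Hypothetical sample relations; NOT the actual DreamHome data.
uk_only = [{"branchNo": "B1", "city": "London"}, {"branchNo": "B2", "city": "Glasgow"}]
with_paris = uk_only + [{"branchNo": "B9", "city": "Paris"}]

# Both forms agree on every input, as De Morgan's laws for quantifiers predict
for relation in (uk_only, with_paris, []):
    assert no_paris_forall(relation) == no_paris_not_exists(relation)

print(no_paris_forall(uk_only), no_paris_forall(with_paris))  # prints True False
```

On the empty relation both forms are true, since a universal statement over no tuples holds vacuously.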
Example - Tuple Relational Calculus
List the names of all managers who earn more than 25,000.
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
List the staff who manage properties for rent in Glasgow.
{S | Staff(S) ∧ (∃P)(PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
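Python comprehensions read much like tuple relational calculus: the expression before `for` is the target list, and the conditions after `if` are the predicate. The sketch below restates both queries over hypothetical rows, with `any` standing in for ∃.

```python
# Hypothetical sample rows; NOT the actual DreamHome data.
staff = [
    {"staffNo": "S1", "fName": "Ann", "lName": "Able", "position": "Manager", "salary": 30000},
    {"staffNo": "S2", "fName": "Bob", "lName": "Baker", "position": "Manager", "salary": 24000},
    {"staffNo": "S3", "fName": "Cy", "lName": "Cole", "position": "Assistant", "salary": 12000},
]
property_for_rent = [
    {"propertyNo": "P1", "city": "Glasgow", "staffNo": "S3"},
    {"propertyNo": "P2", "city": "London", "staffNo": "S1"},
]

# {S.fName, S.lName | Staff(S) AND S.position = 'Manager' AND S.salary > 25000}
well_paid_managers = {
    (s["fName"], s["lName"])
    for s in staff
    if s["position"] == "Manager" and s["salary"] > 25000
}

# {S | Staff(S) AND (exists P)(PropertyForRent(P) AND P.staffNo = S.staffNo AND P.city = 'Glasgow')}
glasgow_managers = [
    s for s in staff
    if any(p["staffNo"] == s["staffNo"] and p["city"] == "Glasgow"
           for p in property_for_rent)
]

print(well_paid_managers, [s["staffNo"] for s in glasgow_managers])
```

Both queries state what should be retrieved and leave the order of evaluation unspecified, which is the non-procedural character of the calculus.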
Tuple Relational Calculus
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, add restriction that all values in result must be values in the domain of the expression.
Domain Relational Calculus
IGNORE THIS FOR OUR CLASS
READ CHAPTER 6, SQL DML
Back to Hands-on with MySQL in Assignment #3
Relational Algebra and Tuple Calculus Form the Basis for Core SQL
More Background on Predicate Calculus
Propositional Logic Calculator - http://synsem.appspot.com/al
Lacks the for-all (Universal) and there-exists (Existential) Quantifiers
Leslie Lamport Logic Calculator Project