Relational Algebra. 7/4/2017 Md. Golam Moazzam, Dept. of CSE, JU



Structure of Relational Databases
A relational database consists of a collection of tables, each of which is assigned a unique name. A row in a table represents a relationship among a set of values. Since a table is a collection of such relationships, there is a close correspondence between the concept of table and the mathematical concept of relation, from which the relational data model takes its name.
[Figure: example of a relation]

Basic Structure
Consider the account table. It has three column headers: account-number, branch-name, and balance. For each attribute, there is a set of permitted values, called the domain of that attribute. For the attribute branch-name, for example, the domain is the set of all branch names.
Let D1 denote the set of all account numbers, D2 the set of all branch names, and D3 the set of all balances. Any row of account must consist of a 3-tuple (v1, v2, v3), where v1 is an account number (that is, v1 is in domain D1), v2 is a branch name (that is, v2 is in domain D2), and v3 is a balance (that is, v3 is in domain D3).

Tuple Variable
A tuple variable is a variable that stands for a tuple. In other words, a tuple variable is a variable whose domain is the set of all tuples.
In the account relation, there are seven tuples. Let the tuple variable t refer to the first tuple of the relation. Thus, t[account-number] = "A-101" and t[branch-name] = "Downtown". Alternatively, we may write t[1] to denote the value of tuple t on the first attribute (account-number), t[2] to denote its value on the second attribute (branch-name), and so on.
Since a relation is a set of tuples, we use the mathematical notation t ∈ r to denote that tuple t is in relation r.

Attribute Types
Each attribute of a relation has a name.
The set of allowed values for each attribute is called the domain of the attribute.
Attribute values are (normally) required to be atomic, that is, indivisible. For example, multivalued attribute values are not atomic, and composite attribute values are not atomic.
The special value null is a member of every domain. The null value causes complications in the definition of many operations.

Query Languages
A query language is a language in which a user requests information from the database. These languages are usually on a level higher than that of a standard programming language.
Query languages can be categorized as either procedural or nonprocedural. In a procedural language, the user instructs the system to perform a sequence of operations on the database to compute the desired result. In a nonprocedural language, the user describes the desired information without giving a specific procedure for obtaining that information.
SQL is a very widely used query language. Other query languages include QBE and Datalog.

Relational Algebra
Relational algebra is a procedural query language. It consists of a set of operations that take one or two relations as input and produce a new relation as their result.
The fundamental operations in the relational algebra are select, project, union, set difference, Cartesian product, and rename. The select, project, and rename operations are called unary operations, because they operate on one relation. The other three operations operate on pairs of relations and are, therefore, called binary operations.
There are several other operations, namely set intersection, natural join, division, and assignment.

Consider the following database:
branch(branch-name, branch-city, assets)
customer(customer-name, customer-street, customer-city)
loan(loan-number, branch-name, amount)
borrower(customer-name, loan-number)
account(account-number, branch-name, balance)
depositor(customer-name, account-number)

Select Operation
The select operation selects tuples that satisfy a given predicate. The lowercase Greek letter sigma (σ) is used to denote selection. The predicate appears as a subscript to σ. The argument relation is in parentheses after the σ.
Notation: σ p (r), where p is called the selection predicate.

Select Operation
Example: Select those tuples of the loan relation where the branch is Perryridge.
σ branch-name = "Perryridge" (loan)
Find all tuples in which the amount lent is more than $1200.
σ amount > 1200 (loan)
Comparisons using =, ≠, <, ≤, >, ≥ are allowed in the selection predicate. We can combine several predicates by using the connectives and (∧), or (∨), and not (¬).

Select Operation
Find those tuples pertaining to loans of more than $1200 made by the Perryridge branch.
σ branch-name = "Perryridge" ∧ amount > 1200 (loan)
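The selection above can be sketched in Python, assuming a relation is held as a list of dictionaries keyed by attribute name. The helper name select and the sample loan rows are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical sample of the loan relation; the rows are illustrative only.
loan = [
    {"loan-number": "L-15", "branch-name": "Perryridge", "amount": 1500},
    {"loan-number": "L-16", "branch-name": "Perryridge", "amount": 1300},
    {"loan-number": "L-11", "branch-name": "Round Hill", "amount": 900},
    {"loan-number": "L-14", "branch-name": "Downtown", "amount": 1500},
]


def select(predicate, relation):
    """Sketch of sigma: keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]


# branch-name = "Perryridge" AND amount > 1200
perryridge_big = select(
    lambda t: t["branch-name"] == "Perryridge" and t["amount"] > 1200,
    loan,
)
```

The predicate is an ordinary Boolean function of one tuple, so the connectives ∧, ∨, and ¬ map onto Python's and, or, and not.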

Project Operation
The project operation is a unary operation that returns its argument relation with certain attributes left out. Since a relation is a set, any duplicate rows are eliminated.
Projection is denoted by the uppercase Greek letter pi (Π). We list those attributes that we wish to appear in the result as a subscript to Π. The argument relation follows in parentheses.
Notation: Π A1, A2, ..., Ak (r), where A1, A2, ..., Ak are attribute names and r is a relation name.

Project Operation
Example: List all loan numbers and the amount of the loan.
Π loan-number, amount (loan)
Find those customers who live in Harrison.
Π customer-name (σ customer-city = "Harrison" (customer))
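A minimal sketch of projection with duplicate elimination, again assuming relations are lists of dictionaries. The helper name project and the customer rows are hypothetical.

```python
# Hypothetical sample of the customer relation.
customer = [
    {"customer-name": "Hayes", "customer-street": "Main", "customer-city": "Harrison"},
    {"customer-name": "Jones", "customer-street": "Main", "customer-city": "Harrison"},
    {"customer-name": "Smith", "customer-street": "North", "customer-city": "Rye"},
]


def project(attributes, relation):
    """Sketch of Pi: keep only the listed attributes and drop duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # a relation is a set, so duplicates are removed
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


# Pi customer-name (sigma customer-city = "Harrison" (customer))
harrison = project(
    ["customer-name"],
    [t for t in customer if t["customer-city"] == "Harrison"],
)

# Projecting on customer-city alone collapses the two Harrison rows into one.
cities = project(["customer-city"], customer)
```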

Union Operation
Notation: r ∪ s
Defined as: r ∪ s = {t | t ∈ r or t ∈ s}
For r ∪ s to be valid:
r and s must have the same number of attributes.
The attribute domains must be compatible.
Example: Find the names of all bank customers who have either an account or a loan or both.
Π customer-name (borrower) ∪ Π customer-name (depositor)
Since relations are sets, duplicate values are eliminated.

Set Difference Operation
The set-difference operation, denoted by −, allows us to find tuples that are in one relation but are not in another.
Notation: r − s
Defined as: r − s = {t | t ∈ r and t ∉ s}
Set differences must be taken between compatible relations:
r and s must have the same number of attributes.
The attribute domains of r and s must be compatible.
Example: Find all customers of the bank who have an account but not a loan.
Π customer-name (depositor) − Π customer-name (borrower)
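Because relations are sets, union and set difference correspond directly to Python's built-in set operators. The sketch below assumes small, made-up borrower and depositor relations stored as sets of tuples.

```python
# Hypothetical sample data: (customer-name, loan-number) and
# (customer-name, account-number).
borrower = {("Jones", "L-170"), ("Smith", "L-230"), ("Hayes", "L-155")}
depositor = {("Hayes", "A-102"), ("Johnson", "A-101"), ("Smith", "A-215")}

# Pi customer-name of each relation; building a set removes duplicates.
borrower_names = {name for (name, _loan) in borrower}
depositor_names = {name for (name, _account) in depositor}

# Union: customers with an account, a loan, or both.
either = borrower_names | depositor_names

# Set difference: customers with an account but no loan.
account_only = depositor_names - borrower_names
```

Both operands have the same single attribute (customer-name), which is the compatibility condition stated above.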

Cartesian-Product Operation
The Cartesian-product operation, denoted by a cross (×), allows us to combine information from any two relations.
Notation: r × s
Defined as: r × s = {t q | t ∈ r and q ∈ s}
Assume that the attributes of r(R) and s(S) are disjoint (that is, R ∩ S = ∅). If the attributes of r(R) and s(S) are not disjoint, then renaming must be used.

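Cartesian-Product Operation
Example: relations r(A, B) and s(C, D, E). The values of attributes A and C are not preserved here; the remaining columns are:

r:
B
1
2

s:
D   E
10  a
10  a
20  b
10  b

Thus, r × s contains 2 × 4 = 8 tuples:
B  D   E
1  10  a
1  10  a
1  20  b
1  10  b
2  10  a
2  10  a
2  20  b
2  10  b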

Cartesian-Product Operation
Example: σ A=C (r × s) (the values of A and C are not preserved here):
B  D   E
1  10  a
2  10  a
2  20  b

The relation schema for r = borrower × loan is (customer-name, borrower.loan-number, loan.loan-number, branch-name, amount).
Find the names of all customers who have a loan at the Perryridge branch.
Π customer-name (σ borrower.loan-number = loan.loan-number (σ branch-name = "Perryridge" (borrower × loan)))
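The Perryridge query can be traced step by step with itertools.product standing in for ×. This is a sketch under the assumption that tuples are plain Python tuples in schema order; the sample rows are made up for illustration.

```python
from itertools import product

# Hypothetical sample data.
# borrower(customer-name, loan-number)
borrower = [("Jones", "L-170"), ("Smith", "L-230"), ("Hayes", "L-155")]
# loan(loan-number, branch-name, amount)
loan = [
    ("L-170", "Downtown", 3000),
    ("L-230", "Redwood", 4000),
    ("L-155", "Perryridge", 1700),
]

# borrower x loan: every borrower tuple concatenated with every loan tuple.
# Resulting positions: 0 customer-name, 1 borrower.loan-number,
# 2 loan.loan-number, 3 branch-name, 4 amount.
pairs = [b + l for b, l in product(borrower, loan)]

# sigma branch-name = "Perryridge"
at_perryridge = [p for p in pairs if p[3] == "Perryridge"]

# sigma borrower.loan-number = loan.loan-number
matched = [p for p in at_perryridge if p[1] == p[2]]

# Pi customer-name
names = {p[0] for p in matched}
```

The product alone pairs every borrower with every loan, including unrelated ones; the equality selection on the two loan-number columns is what keeps only the meaningful pairs.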

Rename Operation
The rename operator is denoted by the lowercase Greek letter rho (ρ).
Notation: ρ x (E) returns the result of expression E under the name x.
A second form of the rename operation is ρ x(A1, A2, ..., An) (E), which returns the result of expression E under the name x, with the attributes renamed to A1, A2, ..., An.

Rename Operation
Example: Find the largest account balance in the bank.
Π balance (account) − Π account.balance (σ account.balance < d.balance (account × ρ d (account)))
Find the names of all customers who live on the same street and in the same city as Smith.
Π customer.customer-name (σ customer.customer-street = smith-addr.street ∧ customer.customer-city = smith-addr.city (customer × ρ smith-addr(street, city) (Π customer-street, customer-city (σ customer-name = "Smith" (customer)))))
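The largest-balance expression works by first collecting every balance that is smaller than some other balance, then subtracting that set from all balances. A minimal sketch, assuming a made-up set of balances in place of Π balance (account); the second loop variable d plays the role of the renamed copy ρ d (account).

```python
# Hypothetical result of Pi balance (account).
balances = {500, 400, 900, 700, 750}

# Pi account.balance (sigma account.balance < d.balance (account x rho d (account))):
# every balance that is strictly smaller than some balance in the renamed copy d.
not_largest = {a for a in balances for d in balances if a < d}

# What remains after the set difference is the largest balance.
largest = balances - not_largest
```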

Set-Intersection Operation
Set intersection is denoted by ∩.
Notation: r ∩ s
Defined as: r ∩ s = {t | t ∈ r and t ∈ s}
Assume:
r and s have the same number of attributes.
The attribute domains of r and s are compatible.

Set-Intersection Operation
Example: relations r(A, B) and s(A, B). The values of attribute A are not preserved here; the B column is:

r:
B
1
2
1

s:
B
2
3

Thus, r ∩ s:
B
2

Example: Find all customers who have both a loan and an account.
Π customer-name (borrower) ∩ Π customer-name (depositor)
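Set intersection is not one of the fundamental operations because it can be written with set difference alone: r ∩ s = r − (r − s). The sketch below checks this identity on made-up customer-name sets.

```python
# Hypothetical projections Pi customer-name (borrower) and
# Pi customer-name (depositor).
borrower_names = {"Jones", "Smith", "Hayes"}
depositor_names = {"Hayes", "Johnson", "Smith"}

# Direct intersection: customers with both a loan and an account.
both = borrower_names & depositor_names

# The same result using only set difference: r - (r - s).
via_difference = borrower_names - (borrower_names - depositor_names)
```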

Natural-Join Operation
The natural join is a binary operation that allows us to combine certain selections and a Cartesian product into one operation. It is denoted by the join symbol ⋈.
The natural-join operation forms a Cartesian product of its two arguments, performs a selection forcing equality on those attributes that appear in both relation schemas, and finally removes duplicate attributes.
Notation: r ⋈ s
Let R = (A, B, C, D) and S = (E, B, D).
Result schema = (A, B, C, D, E)
r ⋈ s is defined as: Π r.A, r.B, r.C, r.D, s.E (σ r.B = s.B ∧ r.D = s.D (r × s))

Natural-Join Operation
Example: Find the names of all customers who have a loan at the bank, along with the loan number and the loan amount.
Using the Cartesian product:
Π customer-name, loan.loan-number, amount (σ borrower.loan-number = loan.loan-number (borrower × loan))
Using the natural-join operation:
Π customer-name, loan-number, amount (borrower ⋈ loan)

Natural-Join Operation
Find the names of all branches with customers who have an account in the bank and who live in Harrison.
Π branch-name (σ customer-city = "Harrison" (customer ⋈ account ⋈ depositor))
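A minimal sketch of the natural join, assuming relations are lists of dictionaries whose rows all share the same keys. The helper name natural_join is hypothetical; the borrower and loan rows follow the sample data used in the outer-join slides of this deck.

```python
borrower = [
    {"customer-name": "Jones", "loan-number": "L-170"},
    {"customer-name": "Smith", "loan-number": "L-230"},
    {"customer-name": "Hayes", "loan-number": "L-155"},
]
loan = [
    {"loan-number": "L-170", "branch-name": "Downtown", "amount": 3000},
    {"loan-number": "L-230", "branch-name": "Redwood", "amount": 4000},
    {"loan-number": "L-260", "branch-name": "Perryridge", "amount": 1700},
]


def natural_join(r, s):
    """Sketch of r ⋈ s: pair tuples that agree on all shared attributes."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    # Merging the two dicts keeps a single copy of each shared attribute.
    return [
        {**t, **q}
        for t in r
        for q in s
        if all(t[a] == q[a] for a in common)
    ]


joined = natural_join(borrower, loan)

# Pi customer-name, loan-number, amount (borrower ⋈ loan)
result = [(t["customer-name"], t["loan-number"], t["amount"]) for t in joined]
```

With this data, Hayes (loan L-155) and loan L-260 have no partner and drop out of the result.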

Division Operation
The division operation, denoted by ÷, is suited to queries that include the phrase "for all".
Example: Find all customers who have an account at all the branches located in Brooklyn.
r1 = Π branch-name (σ branch-city = "Brooklyn" (branch))
r2 = Π customer-name, branch-name (depositor ⋈ account)
Now, find the customers who appear in r2 with every branch name in r1:
Π customer-name, branch-name (depositor ⋈ account) ÷ Π branch-name (σ branch-city = "Brooklyn" (branch))
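Division can be sketched as a "for all" check: a customer is kept only if the customer is paired in r2 with every branch in r1. The helper name divide and the sample rows are hypothetical.

```python
# Hypothetical r1: branch names of all branches located in Brooklyn.
r1 = {"Brighton", "Downtown"}

# Hypothetical r2: (customer-name, branch-name) pairs from depositor ⋈ account.
r2 = {
    ("Johnson", "Brighton"),
    ("Johnson", "Downtown"),
    ("Hayes", "Downtown"),
    ("Smith", "Brighton"),
    ("Smith", "Mianus"),
}


def divide(pairs, divisor):
    """Sketch of pairs ÷ divisor for (customer, branch) pairs and a set of branches."""
    customers = {c for (c, _branch) in pairs}
    return {c for c in customers if all((c, b) in pairs for b in divisor)}


all_brooklyn = divide(r2, r1)
```

Only Johnson has an account at both Brooklyn branches in this sample, so only Johnson is kept.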

Assignment Operation
The assignment operation (←) provides a convenient way to express complex queries.

Aggregate Functions
Aggregate functions take a collection of values as input and return a single value as a result.
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values

Aggregate Functions
The aggregate function avg returns the average of the values.
The aggregate function min returns the minimum value in a collection.
The aggregate function max returns the maximum value in a collection.
The aggregate function sum returns the sum of the values.
The aggregate function count returns the number of elements in the collection.

Aggregate Functions
Aggregate operation in relational algebra:
G1, G2, ..., Gn 𝒢 F1(A1), F2(A2), ..., Fn(An) (E)
E is any relational-algebra expression.
G1, G2, ..., Gn is a list of attributes on which to group (group by).
Each Fi is an aggregate function.
Each Ai is an attribute name.

Aggregate Functions
Find the total sum of salaries of all the part-time employees in the bank, given the relation pt-works(employee-name, branch-name, salary).
𝒢 sum(salary) (pt-works)
The symbol 𝒢 is the letter G in calligraphic font; read it as "calligraphic G". Its subscript specifies the aggregate operation to be applied.

Aggregate Functions
Find the number of branches appearing in the pt-works relation.
𝒢 count-distinct(branch-name) (pt-works)
Find the total salary sum of all part-time employees at each branch of the bank.
branch-name 𝒢 sum(salary) (pt-works)
The attribute branch-name in the left-hand subscript of 𝒢 indicates that the input relation pt-works must be divided into groups based on the value of branch-name.
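The three aggregate queries can be sketched in Python as follows. The pt_works rows are made up for illustration, and grouping is done with a dictionary keyed on branch-name.

```python
from collections import defaultdict

# Hypothetical pt-works(employee-name, branch-name, salary) rows.
pt_works = [
    ("Adams", "Perryridge", 1500),
    ("Brown", "Perryridge", 1300),
    ("Gopal", "Perryridge", 5300),
    ("Johnson", "Downtown", 1500),
    ("Loreena", "Downtown", 1300),
    ("Peterson", "Downtown", 2500),
    ("Rao", "Austin", 1500),
    ("Sato", "Austin", 1600),
]

# sum(salary) over the whole relation (no grouping attributes).
total_salary = sum(salary for (_name, _branch, salary) in pt_works)

# count-distinct(branch-name): duplicates are removed before counting.
branch_count = len({branch for (_name, branch, _salary) in pt_works})

# branch-name grouped sum(salary): one running total per group.
salary_by_branch = defaultdict(int)
for _name, branch, salary in pt_works:
    salary_by_branch[branch] += salary
```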

Outer Join
The outer-join operation is an extension of the join operation to deal with missing information. It avoids loss of information.
Consider the loan and borrower relations:

loan:
loan-number  branch-name  amount
L-170        Downtown     3000
L-230        Redwood      4000
L-260        Perryridge   1700

borrower:
customer-name  loan-number
Jones          L-170
Smith          L-230
Hayes          L-155

Outer Join
Inner join operation (natural join): loan ⋈ borrower

loan-number  branch-name  amount  customer-name
L-170        Downtown     3000    Jones
L-230        Redwood      4000    Smith

We have lost some information: loan L-260 and customer Hayes do not appear in the result. We can use the outer-join operation to avoid this loss of information.
There are three forms of the outer-join operation: left outer join, denoted by ⟕; right outer join, denoted by ⟖; and full outer join, denoted by ⟗.

Outer Join
Left outer join: loan ⟕ borrower

loan-number  branch-name  amount  customer-name
L-170        Downtown     3000    Jones
L-230        Redwood      4000    Smith
L-260        Perryridge   1700    null

The left outer join takes all tuples in the left relation that did not match any tuple in the right relation, pads them with null values for all other attributes from the right relation, and adds them to the result of the natural join.

Outer Join
Right outer join: loan ⟖ borrower

loan-number  branch-name  amount  customer-name
L-170        Downtown     3000    Jones
L-230        Redwood      4000    Smith
L-155        null         null    Hayes

The right outer join is symmetric with the left outer join: it pads tuples from the right relation that did not match any from the left relation with nulls and adds them to the result of the natural join.

Outer Join
Full outer join: loan ⟗ borrower

loan-number  branch-name  amount  customer-name
L-170        Downtown     3000    Jones
L-230        Redwood      4000    Smith
L-260        Perryridge   1700    null
L-155        null         null    Hayes

The full outer join does both of those operations, padding tuples from the left relation that did not match any from the right relation, as well as tuples from the right relation that did not match any from the left relation, and adding them to the result of the join.
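The three outer joins can be sketched on the loan and borrower data from these slides, using None in place of null. The helper left_outer_join is hypothetical and joins on a single named attribute, which is enough for this example; the right outer join reuses it with the arguments swapped.

```python
loan = [
    {"loan-number": "L-170", "branch-name": "Downtown", "amount": 3000},
    {"loan-number": "L-230", "branch-name": "Redwood", "amount": 4000},
    {"loan-number": "L-260", "branch-name": "Perryridge", "amount": 1700},
]
borrower = [
    {"customer-name": "Jones", "loan-number": "L-170"},
    {"customer-name": "Smith", "loan-number": "L-230"},
    {"customer-name": "Hayes", "loan-number": "L-155"},
]


def left_outer_join(r, s, key, s_attrs):
    """Join r and s on key; unmatched r tuples are padded with None for s_attrs."""
    result = []
    for t in r:
        matches = [q for q in s if q[key] == t[key]]
        if matches:
            result.extend({**t, **q} for q in matches)
        else:
            result.append({**t, **{a: None for a in s_attrs}})
    return result


left = left_outer_join(loan, borrower, "loan-number", ["customer-name"])

# Right outer join of loan and borrower: swap the roles of the two relations.
right = left_outer_join(borrower, loan, "loan-number", ["branch-name", "amount"])

# Full outer join: the left result plus the right-side tuples that found no match.
loan_numbers = {t["loan-number"] for t in loan}
full = left + [t for t in right if t["loan-number"] not in loan_numbers]
```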

Null Values
It is possible for tuples to have a null value, denoted by null, for some of their attributes. null signifies an unknown value or that a value does not exist.
The result of any arithmetic expression involving null is null.
Aggregate functions simply ignore null values.
For duplicate elimination and grouping, null is treated like any other value, and two nulls are assumed to be the same.

Modification of the Database
The content of the database may be modified using the following operations:
Deletion
Insertion
Updating
All these operations are expressed using the assignment operator.

Deletion
A delete request is expressed similarly to a query, except that instead of displaying tuples to the user, the selected tuples are removed from the database. We can delete only whole tuples; we cannot delete values on only particular attributes.
In relational algebra a deletion is expressed by:
r ← r − E
where r is a relation and E is a relational-algebra query.

Deletion
Example: Delete all of Smith's account records.
depositor ← depositor − σ customer-name = "Smith" (depositor)
Delete all loans with amount in the range 0 to 50.
loan ← loan − σ amount ≥ 0 and amount ≤ 50 (loan)
Delete all accounts at branches located in Needham.
r1 ← σ branch-city = "Needham" (account ⋈ branch)
r2 ← Π branch-name, account-number, balance (r1)
account ← account − r2

Insertion
To insert data into a relation, we either specify a tuple to be inserted or write a query whose result is a set of tuples to be inserted.
The relational algebra expresses an insertion by:
r ← r ∪ E
where r is a relation and E is a relational-algebra expression.
Example: Insert information in the database specifying that Smith has $1200 in account A-973 at the Perryridge branch.
account ← account ∪ {("A-973", "Perryridge", 1200)}
depositor ← depositor ∪ {("Smith", "A-973")}
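Deletion and insertion are assignments of a set difference or a set union back to the relation name, which maps directly onto Python set operators. The starting contents of account and depositor below are hypothetical.

```python
# Hypothetical starting state.
# account(account-number, branch-name, balance)
account = {("A-101", "Downtown", 500), ("A-215", "Mianus", 700)}
# depositor(customer-name, account-number)
depositor = {("Johnson", "A-101"), ("Smith", "A-215")}

# Deletion: depositor <- depositor - sigma customer-name = "Smith" (depositor)
depositor = depositor - {t for t in depositor if t[0] == "Smith"}

# Insertion: account <- account ∪ {("A-973", "Perryridge", 1200)}
account = account | {("A-973", "Perryridge", 1200)}

# Insertion: depositor <- depositor ∪ {("Smith", "A-973")}
depositor = depositor | {("Smith", "A-973")}
```

Whole tuples are removed or added in each step; no individual attribute value is changed.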

Updating
Updating is a mechanism to change a value in a tuple without changing all values in the tuple. Use the generalized-projection operator to do this task:
r ← Π F1, F2, ..., Fn (r)
where each Fi is either the ith attribute of r, if the ith attribute is not updated, or, if the attribute is to be updated, an expression involving only constants and the attributes of r that gives the new value for the attribute.

Updating
Example: Make interest payments by increasing all balances by 5 percent.
account ← Π account-number, branch-name, balance * 1.05 (account)
Pay all accounts with balances over $10,000 6 percent interest and pay all others 5 percent.
account ← Π AN, BN, balance * 1.06 (σ balance > 10000 (account)) ∪ Π AN, BN, balance * 1.05 (σ balance ≤ 10000 (account))
where the abbreviations AN and BN stand for account-number and branch-name, respectively.
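The two-rate interest update can be sketched as the union of two generalized projections over disjoint selections. The account rows are hypothetical, and Decimal is used here (an implementation choice, not part of the slides) so that the computed balances are exact.

```python
from decimal import Decimal

# Hypothetical account(account-number, branch-name, balance) rows.
account = {
    ("A-101", "Downtown", Decimal("500")),
    ("A-215", "Mianus", Decimal("20000")),
}

# Pi AN, BN, balance * 1.06 (sigma balance > 10000 (account))
high = {
    (an, bn, balance * Decimal("1.06"))
    for (an, bn, balance) in account
    if balance > 10000
}

# Pi AN, BN, balance * 1.05 (sigma balance <= 10000 (account))
low = {
    (an, bn, balance * Decimal("1.05"))
    for (an, bn, balance) in account
    if balance <= 10000
}

# account <- high ∪ low; both parts are computed from the old relation.
account = high | low

new_balance = {an: balance for (an, _bn, balance) in account}
```

Both selections read the original relation before the assignment takes effect, so no account receives interest twice.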

Views
In some cases, it is not desirable for all users to see the entire logical model (that is, all the actual relations stored in the database).
Consider a person who needs to know a customer's loan number but has no need to see the loan amount. This person should see a relation described, in the relational algebra, by
Π customer-name, loan-number (borrower ⋈ loan)
Any relation that is not part of the conceptual model but is made visible to a user as a virtual relation is called a view.

View definition
A view is defined using the create view statement, which has the form:
create view v as <query expression>
where <query expression> is any legal relational-algebra query expression and v is the view name.
Once a view is defined, the view name can be used to refer to the virtual relation that the view generates.
View definition is not the same as creating a new relation by evaluating the query expression.

View definition
Example: Create a view (named all-customer) consisting of branches and their customers.
create view all-customer as
Π branch-name, customer-name (depositor ⋈ account) ∪ Π branch-name, customer-name (borrower ⋈ loan)
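One way to picture why a view differs from a stored relation is to model the view as a function that re-evaluates its query each time it is used. This is only an illustrative sketch with made-up data; it says nothing about how a real database system implements views.

```python
# Hypothetical base relations, stored as sets of tuples.
depositor = {("Johnson", "A-101")}          # (customer-name, account-number)
account = {("A-101", "Downtown", 500)}      # (account-number, branch-name, balance)
borrower = {("Jones", "L-170")}             # (customer-name, loan-number)
loan = {("L-170", "Downtown", 3000)}        # (loan-number, branch-name, amount)


def all_customer():
    """The all-customer view: evaluated from the base relations on every use."""
    from_accounts = {
        (branch, customer)
        for (customer, acct) in depositor
        for (acct2, branch, _balance) in account
        if acct == acct2
    }
    from_loans = {
        (branch, customer)
        for (customer, ln) in borrower
        for (ln2, branch, _amount) in loan
        if ln == ln2
    }
    return from_accounts | from_loans


before = all_customer()

# A later change to the base relations is visible through the view.
account.add(("A-305", "Round Hill", 350))
depositor.add(("Turner", "A-305"))
after = all_customer()
```

Had the query result been stored once as a new relation, the Round Hill row would be missing after the insertion.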