Databases Tutorial, March 15, 2012. Jing Chen, McMaster University
Outline: 1NF, Functional Dependencies, BCNF, 3NF
Larger Schema
Suppose we combine borrower and loan to get bor_loan:
- borrower = (customer_id, loan_number)
- loan = (loan_number, amount)
- bor_loan = (customer_id, loan_number, amount)
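Larger Schema (Cont.)
Redundancy! If a loan has several borrowers, its amount is stored in bor_loan once per borrower.
Inconsistency! Updating only some of those copies leaves conflicting amounts for the same loan.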
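To make the redundancy concrete, here is a minimal Python sketch. The sample rows (customers C1 and C2 sharing loan L-100) are invented for illustration and do not come from the slides.

```python
# Hypothetical sample data: customers C1 and C2 share loan L-100.
borrower = [("C1", "L-100"), ("C2", "L-100")]
loan = {"L-100": 1000}

# Combined schema bor_loan = (customer_id, loan_number, amount).
bor_loan = [[cid, ln, loan[ln]] for cid, ln in borrower]

# Redundancy: the amount of L-100 is stored once per borrower.
copies = [row[2] for row in bor_loan if row[1] == "L-100"]

# Inconsistency: updating only one copy leaves two different amounts
# recorded for the same loan.
bor_loan[0][2] = 1200
amounts = {row[2] for row in bor_loan if row[1] == "L-100"}

print(copies, amounts)
```

With borrower and loan kept separate, the amount lives in exactly one place, so this kind of partial update cannot happen.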
Larger Schema (Cont.)
Suppose we combine loan and loan_branch to get loan_amt_br:
- loan = (loan_number, amount)
- loan_branch = (loan_number, branch_name)
- loan_amt_br = (loan_number, amount, branch_name)
Larger Schema (Cont.) Null values!
Smaller Schema
Suppose we split up employee into employee1 and employee2:
- employee = (id, name, telephone_number, start_date)
- employee1 = (id, name)
- employee2 = (name, telephone_number, start_date)
Lossy Decomposition
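A minimal Python sketch of why splitting employee on name is lossy. It assumes two hypothetical employees who share a name; the sample rows are invented for illustration. Joining employee1 and employee2 back together on name produces tuples that were never in employee.

```python
# Hypothetical sample data: two different employees who share the name "Kim".
employee = {
    (1, "Kim", "555-0101", "2010-01-04"),
    (2, "Kim", "555-0199", "2011-06-20"),
}

# Project onto the two smaller schemas.
employee1 = {(eid, name) for (eid, name, _tel, _start) in employee}
employee2 = {(name, tel, start) for (_eid, name, tel, start) in employee}

# Natural join of employee1 and employee2 on the shared attribute name.
rejoined = {
    (eid, name1, tel, start)
    for (eid, name1) in employee1
    for (name2, tel, start) in employee2
    if name1 == name2
}

# Tuples produced by the join that were not in the original relation.
spurious = rejoined - employee

print(len(employee), len(rejoined), len(spurious))
```

The join returns more tuples than the original relation, so we can no longer tell which telephone number belongs to which id: information has been lost even though no tuple was deleted.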
First Normal Form (1NF)
- A domain is atomic if its elements are considered to be indivisible units.
- A relational schema R is in first normal form if the domains of all attributes of R are atomic.
- Example of a non-atomic domain: identification numbers like CS101 that can be broken up into parts (CS and 101).
Functional Dependencies
Let R be a relation schema, with α ⊆ R and β ⊆ R.
The functional dependency α → β holds on R if and only if, for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β. That is,
    t1[α] = t2[α]  ⇒  t1[β] = t2[β]
A functional dependency α → β is trivial if β ⊆ α.
Functional Dependencies (Cont.)
Example: R(A, B, C, D), PK = (A, B)

        A    B    C    D
  t1    a1   b1   c1   d1
  t2    a1   b2   c1   d2
  t3    a2   b2   c2   d2
  t4    a2   b3   c2   d3
  t5    a3   b3   c2   d4

Functional dependency α → β:
(1) α ⊆ R and β ⊆ R
(2) t1[α] = t2[α] ⇒ t1[β] = t2[β]

- A → C does hold
- C → A does NOT hold
- A,B → A is trivial
- B → A is NOT trivial
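The four claims in this example can be checked mechanically. The following Python sketch encodes the five tuples of R and tests whether a dependency is satisfied by that instance. The helper names fd_holds and is_trivial are made up for this illustration. Note the limitation: a check on one instance can refute a dependency, but it cannot prove that the dependency holds on every legal relation.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the instance `rows` satisfies lhs -> rhs.

    rows: list of dicts mapping attribute name -> value
    lhs, rhs: lists of attribute names
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_trivial(lhs, rhs):
    """A dependency lhs -> rhs is trivial if rhs is a subset of lhs."""
    return set(rhs) <= set(lhs)


# The example relation R(A, B, C, D), tuples t1..t5.
r = [
    {"A": "a1", "B": "b1", "C": "c1", "D": "d1"},
    {"A": "a1", "B": "b2", "C": "c1", "D": "d2"},
    {"A": "a2", "B": "b2", "C": "c2", "D": "d2"},
    {"A": "a2", "B": "b3", "C": "c2", "D": "d3"},
    {"A": "a3", "B": "b3", "C": "c2", "D": "d4"},
]

print(fd_holds(r, ["A"], ["C"]))      # True: A -> C is satisfied
print(fd_holds(r, ["C"], ["A"]))      # False: c2 appears with a2 and a3
print(is_trivial(["A", "B"], ["A"]))  # True
print(is_trivial(["B"], ["A"]))       # False
```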
Boyce-Codd Normal Form (BCNF)
R is in BCNF if, for any functional dependency A → B, one of the following holds:
- it is a trivial dependency (B is a subset of A)
- A is a superkey for R
BCNF (Cont.)
Example: Consider a relation Winner(Tournament, Year, Name, Province). You know the following facts: the only candidate key is (Tournament, Year), and apart from the key dependency there is only one other functional dependency, Name → Province. (For instance, the tournaments are held once a year, Province is the birth province of the winner, and from each province only a single representative is allowed.)
(1) Give a definition of BCNF.
Answer: A relation R is in BCNF if, for any functional dependency α → β, either
- α → β is trivial, or
- α is a superkey of R
BCNF (Cont.)
Example (same Winner relation and facts as in part (1))
(2) Show that relation Winner is not in BCNF.
Answer: Consider the dependency Name → Province. It is not trivial, and {Name} is not a superkey. Thus, this dependency violates BCNF.
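The argument in part (2) can be sketched in Python using attribute closure: α is a superkey exactly when the closure of α contains every attribute of R. The function names closure and bcnf_violations are made up for this illustration, and the key dependency is written out explicitly as (Tournament, Year) → (Name, Province). The sketch inspects only the listed dependencies, which is enough to test the original relation but not, in general, its decompositions.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have all of lhs, we also get all of rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Listed dependencies that are nontrivial and whose lhs is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not schema <= closure(lhs, fds)
    ]


winner = {"Tournament", "Year", "Name", "Province"}
fds = [
    ({"Tournament", "Year"}, {"Name", "Province"}),  # key dependency
    ({"Name"}, {"Province"}),
]

violations = bcnf_violations(winner, fds)
print(violations)
```

Only Name → Province is reported: the closure of {Name} is {Name, Province}, which does not cover all of Winner.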
BCNF (Cont.)
Example (same Winner relation and facts as in part (1))
(3) Decompose (losslessly) relation Winner into relations that are in BCNF.
Answer: Tournament(Tournament, Year, Name), where PK = (Tournament, Year), and Player(Name, Province), where PK = (Name). These tables are in BCNF, and the decomposition is lossless because the shared attribute Name is a key of Player.
BCNF (Cont.)
Decompose a schema into BCNF
Suppose we have a schema R and a nontrivial dependency α → β that causes a violation of BCNF. We decompose R into:
- (α ∪ β)
- (R - (β - α))
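As a minimal sketch, the decomposition rule above is a one-line set computation in Python. The function name bcnf_split is made up for this illustration, and it performs a single step only: in general each resulting schema must be checked again and split further if it still violates BCNF.

```python
def bcnf_split(schema, alpha, beta):
    """One BCNF decomposition step for a violating dependency alpha -> beta.

    Returns the two schemas (alpha ∪ beta) and (R - (beta - alpha)).
    """
    alpha, beta = set(alpha), set(beta)
    return alpha | beta, set(schema) - (beta - alpha)


# Applied to Winner with the violating dependency Name -> Province.
winner = {"Tournament", "Year", "Name", "Province"}
player, tournament = bcnf_split(winner, {"Name"}, {"Province"})

print(sorted(player))      # ['Name', 'Province']
print(sorted(tournament))  # ['Name', 'Tournament', 'Year']
```

The two results match the Player and Tournament tables from the Winner example. The step is lossless because the attributes shared by the two schemas are exactly α, and α determines every attribute of (α ∪ β).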
Third Normal Form (3NF)
R is in 3NF if, for any functional dependency A → B, one of the following holds:
- it is a trivial dependency (B is a subset of A)
- A is a superkey for R
- every attribute t from B - A is part of a candidate key
3NF (Cont.)
Example: Consider a relation SALES(CustomerID, CustomerName, Salesperson, Region).
(1) Fact: the only candidate key is CustomerID.
(2) Fact: Salesperson → Region.
Answer: The dependency Salesperson → Region violates 3NF: it is not trivial, Salesperson is not a superkey, and Region is not part of a candidate key.
Decomposition:
- SALES(CustomerID, CustomerName, Salesperson), PK = CustomerID
- SALESPERSON(Salesperson, Region), PK = Salesperson
These tables are in 3NF, and it is a lossless decomposition.
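The three 3NF conditions can be checked in Python for the SALES example. This is a sketch under stated assumptions: the candidate keys are supplied by hand rather than computed, only the listed dependencies are inspected, and the function names closure and violations_3nf are made up for this illustration.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def violations_3nf(schema, fds, candidate_keys):
    """Listed dependencies that satisfy none of the three 3NF conditions."""
    # Attributes that belong to at least one candidate key.
    prime = set().union(*candidate_keys)
    bad = []
    for lhs, rhs in fds:
        if rhs <= lhs:                     # trivial dependency
            continue
        if schema <= closure(lhs, fds):    # lhs is a superkey
            continue
        if (rhs - lhs) <= prime:           # rhs - lhs lies inside candidate keys
            continue
        bad.append((lhs, rhs))
    return bad


sales = {"CustomerID", "CustomerName", "Salesperson", "Region"}
sales_fds = [
    ({"CustomerID"}, {"CustomerName", "Salesperson", "Region"}),  # key dependency
    ({"Salesperson"}, {"Region"}),
]
before = violations_3nf(sales, sales_fds, [{"CustomerID"}])

# The two tables after decomposition.
sales2 = {"CustomerID", "CustomerName", "Salesperson"}
sales2_fds = [({"CustomerID"}, {"CustomerName", "Salesperson"})]
salesperson = {"Salesperson", "Region"}
salesperson_fds = [({"Salesperson"}, {"Region"})]
after = (violations_3nf(sales2, sales2_fds, [{"CustomerID"}])
         + violations_3nf(salesperson, salesperson_fds, [{"Salesperson"}]))

print(before, after)
```

Before the decomposition the check reports Salesperson → Region; after it, neither table has a violating dependency.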
References
- Abraham Silberschatz, Henry F. Korth, S. Sudarshan, Database System Concepts, fifth edition.
- Slides of Chapter 7: Relational Database Design, Database System Concepts, fifth edition, by Abraham Silberschatz, Henry F. Korth, S. Sudarshan.
- Databases course website > Help > 3NF exercises, copyright reserved by Dr. Frantisek (Franya) Franek.
- Midterm 2 of 2010, copyright reserved by Dr. Frantisek (Franya) Franek.