Theory of Normal Forms Decomposition of Relations. Overview

Similar documents
Part II: Using FD Theory to do Database Design

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

Databases Lecture 7. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2009

Database Systems. Basics of the Relational Data Model

Database design III. Quiz time! Using FDs to detect anomalies. Decomposition. Decomposition. Boyce-Codd Normal Form 11/4/16

Normalisation. Normalisation. Normalisation

CSIT5300: Advanced Database Systems

Lecture 4. Database design IV. INDs and 4NF Design wrapup

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 4 - Schema Normalization

Desired properties of decompositions

Design Theory for Relational Databases

Chapter 16. Relational Database Design Algorithms. Database Design Approaches. Top-Down Design

Relational Database Design Theory. Introduction to Databases CompSci 316 Fall 2017

CSCI 403: Databases 13 - Functional Dependencies and Normalization

Databases -Normalization I. (GF Royle, N Spadaccini ) Databases - Normalization I 1 / 24

Functional Dependencies and Finding a Minimal Cover

Relational Design: Characteristics of Well-designed DB

CS411 Database Systems. 05: Relational Schema Design Ch , except and

Introduction to Databases, Fall 2003 IT University of Copenhagen. Lecture 4: Normalization. September 16, Lecturer: Rasmus Pagh

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Database Design Theory and Normalization. CS 377: Database Systems

Lecture 11 - Chapter 8 Relational Database Design Part 1

Chapter 10. Normalization. Chapter Outline. Chapter Outline(contd.)

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization.

Final Review. Zaki Malik November 20, 2008

Schema Normalization. 30 th August Submitted By: Saurabh Singla Rahul Bhatnagar

Chapter 14. Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies

Functional Dependencies CS 1270

Schema Refinement: Dependencies and Normal Forms

Normalization. Anomalies Functional Dependencies Closures Key Computation Projecting Relations BCNF Reconstructing Information Other Normal Forms

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Overview - detailed. Goal. Faloutsos & Pavlo CMU SCS /615

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2009 Lecture 3 - Schema Normalization

CMU SCS CMU SCS CMU SCS CMU SCS whole nothing but

MODULE: 3 FUNCTIONAL DEPENDENCIES

Lectures 5 & 6. Lectures 6: Design Theory Part II

UNIT 3 DATABASE DESIGN

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2

Functional Dependencies and Normalization for Relational Databases Design & Analysis of Database Systems

CSE 562 Database Systems

Relational Design Theory. Relational Design Theory. Example. Example. A badly designed schema can result in several anomalies.

Chapter 10. Chapter Outline. Chapter Outline. Functional Dependencies and Normalization for Relational Databases

Redundancy:Dependencies between attributes within a relation cause redundancy.

Schema Refinement: Dependencies and Normal Forms

Theory of Normal Forms Examples. Database Anomalies. .. Winter 2008 CPE/CSC 366: Database Modeling, Design and Implementation Alexander Dekhtyar..

More Normalization Algorithms. CS157A Chris Pollett Nov. 28, 2005.

Functional dependency theory

Schema Refinement & Normalization Theory 2. Week 15

Introduction to Database Design, fall 2011 IT University of Copenhagen. Normalization. Rasmus Pagh

COSC Dr. Ramon Lawrence. Emp Relation

CS 338 Functional Dependencies

Schema Refinement: Dependencies and Normal Forms

Database Management System Prof. Partha Pratim Das Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur

5 Normalization:Quality of relational designs

Databases The theory of relational database design Lectures for m

Functional Dependencies and. Databases. 1 Informal Design Guidelines for Relational Databases. 4 General Normal Form Definitions (For Multiple Keys)

Concepts from

Chapter 6: Relational Database Design

Chapter 7: Relational Database Design

Chapter 14 Outline. Normalization for Relational Databases: Outline. Chapter 14: Basics of Functional Dependencies and

Normalisation theory

Informal Design Guidelines for Relational Databases

Lectures 12: Design Theory I. 1. Normal forms & functional dependencies 2/19/2018. Today s Lecture. What you will learn about in this section

Lecture 6a Design Theory and Normalization 2/2

Database Design Principles

Relational Database Design (II)

The strategy for achieving a good design is to decompose a badly designed relation appropriately.

In This Lecture. Normalisation to BCNF. Lossless decomposition. Normalisation so Far. Relational algebra reminder: product

Case Study: Lufthansa Cargo Database

DATABASE MANAGEMENT SYSTEM SHORT QUESTIONS. QUESTION 1: What is database?

Steps in normalisation. Steps in normalisation 7/15/2014

customer = (customer_id, _ customer_name, customer_street,

Chapter 8: Relational Database Design

Relational Database design. Slides By: Shree Jaswal

The Relational Data Model

Schema Refinement and Normal Forms

Review: Attribute closure

Relational Design 1 / 34

E-R diagrams and database schemas. Functional dependencies. Definition (tuple, attribute, value). A tuple has the form

Fundamentals of Database Systems

Unit 3 : Relational Database Design

CS 2451 Database Systems: Database and Schema Design

6.830 Lecture PS1 Due Next Time (Tuesday!) Lab 1 Out end of week start early!

Database Management System

UNIT -III. Two Marks. The main goal of normalization is to reduce redundant data. Normalization is based on functional dependencies.

Databases Normalization. Christian S. Jensen Computer Science, Aarhus University

Homework 3: Relational Database Design Theory (100 points)

Database Normalization. (Olav Dæhli 2018)

Administrivia. Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Relational Model: Keys. Correction: Implied FDs

Relational Model and Relational Algebra A Short Review Class Notes - CS582-01

DBMS Chapter Three IS304. Database Normalization-Comp.

Relational Design Theory

V. Database Design CS448/ How to obtain a good relational database schema

FUNCTIONAL DEPENDENCIES

NORMAL FORMS. CS121: Relational Databases Fall 2017 Lecture 18

Normal Forms. Winter Lecture 19

Database Constraints and Design

Combining schemas. Problems: redundancy, hard to update, possible NULLs

Normalization 03. CSE3421 notes

Functional Dependencies & Normalization for Relational DBs. Truong Tuan Anh CSE-HCMUT

Database Systems. Normalization Lecture# 7

Transcription:

.. Winter 2008 CPE/CSC 366: Database Modeling, Design and Implementation Alexander Dekhtyar.. Overview Theory of Normal Forms Decomposition of Relations Functional Dependencies capture the attribute dependencies within a relational table. Functional Dependencies that do not involve full keys lead to anomalies: problems with management/maintenance of tables. Normal Forms are restrictions on functional dependencies in relational tables. Third Normal Form (3NF) and Boyce-Codd Normal Form (BCNF eliminate from relational tables FDs that do not involve full keys 1. If a table in a database schema violates 3NF, often it is best to replace it with a collection of tables in 3NF. This is achieved via the procedure called decomposition of relations. Relational table decomposition: replacement of a relational table R(A 1,...,A n ) with relational tables R 1, R 2,...,R k. We want relational decomopositions to have the following properties: P1. For all i, R i = π L (R), where L {A 1,...,A n }. P2. R 1 R 2... R k = R. Decompositions are designed to address the following issues: Eliination of anomalies. Property P1 usually takes care of that. Recoverability of information. This is achieved by decompositions satisfying property P2. Preservation of dependencies. This is an important challenge. It may affect how far towards normalization we can get. 1 Remember, in case of 3NF, it is possible to have an FD involving an incomplete key, but the right side of such FD hase to be a prime attribute. 1

2 Algorithms for Functional Dependencies In order to present the algorithm for decomposing any relation into a relational schema in 3NF, we first need two algorithms involving functional dependencies: Closure: given a list of attributes and a set of FDs find all attributes that depend on it. FDs in Projection: given a relation, a collection of FDs and a list of attrbiutes, find the FDs that hold in the projection of the relation on the given attribute list. Closure Closure of a set of attributes. Let R be a relation, and Q = {A 1,...,A n } be a subset of R s attributes. The closure of Q, denoted Q + is a set of attributes, which are functionally determined by Q, given a set S of functional dependencies on R. FD Closure algorithm. The following algorithm computes the closure of a set of attributes. Algorithm FD-Closure(R Relational_Schema, Q Set_of_Attributes, S Set_of_FDs) returns Set_of_Attributes; // Step 1. split all rules in S for each f in S if (right side of f has multiple attributes) then replace f in S with FDs that have only on attribute on the right; // Step 2. Include set Q in the closure X := Q; // Step 3. Keep including new attributes for as long as FDs allow repeat X := X; for each f: B1,...,Bn -> C in S if {B1,...,Bn} is a subset of X then X := X union {C}; until (X == X); // repeat until no new attributes can be added to X return X;

3 Finding Functional Dependencies in Projection Algorithm FD-Project(R Relational_Schema, R1 Relational_Schema, L Set_of_Attributes, S Set_of_FDs) returns Set_of_Attributes; // R: original relation // R1: projection of R onto attributes L // S : FDs asserted on R // initialization T := {}; // T is a set of functional dependencies // main loop for each subset X of L Y := FD-Closure(R, X, S); // for each subset of L, find its closure for each A in Y-X if A in L then T := T Union {X -> A}; // for return T; Note: FD-Project checks every subset of the input set of attributes. This means, that in general case, FD-Project is exponential in the size of input. However, in practice, if the list of FDs is short, this algorithm will work faster. Decomposition of Relations BCNF Decomposition The BCNF Decomposition algorithm, given a relation and a collection of FDs asserted on it, decomposes the relation into a number of relational tables in BCNF.

4 Algorithm BCNF-Decompose(R Relational_Schema, S Set_of_FDs) returns Set_of_Relational_Schema; if R is in BCNF then return {R}; // trivial case Z:= all attributes of R; W := {}; // initialize eventual result // main case: R is not in BCNF, needs to be decomposed Find rule X-> A in S which violates BCNF; Y1 := FD-Closure(R,X); // closure of left side of violating rule Create relational schema T1(Y1); S1 := FD-Project(R,T1,Y1,S); // find FDs in the new table Y2 := (T - Y) Union X; Create relational schema T2(Y2); S2 := FD-Project(R,T2,Y2,S); // find FDs in the new table // recursively call BCN-Decomposition algorithm W := W Union BCNF-Decompose(T1,S1) Union BCNF-Decompose(T2,S2); return W; Note: BCNF-Decompose is a recursive algorithm. The idea is simple: find a rule that violates BCNF, use it to decompose the table into two. Then, find the proper set of FDs for each of the new tables, and decompose each of them in turn. Properties of BCNF Decomposition Elimination of anomalies. Anomalies are eliminated, since the decomposition process breaks off any BCNF-violating FD into a separate table. Recoverability of information. Natural join will recover the original relation from the decomposition. Chase test allows us to verify that. Preservation of dependencies. This is not always the case. If a table was in 3NF but not in BCNF, some dependencies may be lost. 3NF Decomposition BCNF Decomposition may lead to breaking off of FDs X A where A is a prime attribute, i.e., part of another key. This, in turn, may yield loss of dependency: it would be possible to store information in the set of decomposed tables, which cannot be stored in the original table. 3NF decomposition shall be used when such FDs are present in a table, instead of BCNF decomposition. While 3NF tables are not as normalized, they are guaranteed to always preserve all dependencies from the original table. Basis and Minimal Basis of FD Sets FD Basis. Let R be a table and S be a set of FDs asserted on it. Let S be another set of FDs, such that S S. Then S is called the basis of S.

5 Minimal Basis. A minimal basis for a set of FDs S is a set of FDs S, such that: 1. S is a basis of S. 2. All FDs in S are of the form X A, where A is a single attribute. 3. For any X A S, S {X A} is not a basis of S. 4. For any X A S, Y X, S {X A} {Y A} is not a basis of S. Note: Informally, a minimal basis, is a basis which has only rules with one attribute on the right, and from which no rule can be removed, or simplified. Algorithm 3NF-Decompose(R Relational_Schema, S Set_of_FDs) returns Set_of_Relational_Schema; if R is in 3NF then return {R}; // trivial case W:= {}; G := Minimal_Basis(S); // find minimal basis of S for each X->A in G create schema Rxa(X,A); W:= W Union {Rxa}; // make certain the key of R is preserved if no relation from W is a superkey for R then Y := some key of R; create schema T(Y); W:= W Union {T}; return W; Informally, 3NF-Decompose works as follows: for each rule in the minimal basis it creates a new relation, with the left side of the rule as the key. Then, if no relation contained a key of the original table, one more relation is used to preserve the key. Things to note about 3NF-Decompose: 3NF-Decompose takes care of both FDs that violate 3NF and FDs that violate 2NF. 3NF-Decompose may split the table into more relations than necessary. If there are two FDs: X A and X B in the minimal basis, then two tables Rxa(X, A) and Rxb(X, B) will be created. At the same time, a single table Rx(X, A, B) is sufficient in many cases (if both A and B are non-prime). Properties of 3NF-Decompose : Lossless join. Chase procedure can be used to verify that 3NF-Decompose yields lossless join.

6 Dependency preservation. Every FD from the minimal basis is preserved in one table! 3NF. If any of the relations Rxa created during decomposition is NOT in 3NF, then, the basis we used during the decomposition is not minimal.