More Normalization Algorithms. CS157A Chris Pollett Nov. 28, 2005.

Similar documents
Chapter 16. Relational Database Design Algorithms. Database Design Approaches. Top-Down Design

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

Database Management System Prof. Partha Pratim Das Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 4 - Schema Normalization

Databases Lecture 7. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2009

Schema Normalization. 30 th August Submitted By: Saurabh Singla Rahul Bhatnagar

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

COSC Dr. Ramon Lawrence. Emp Relation

Desired properties of decompositions

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2009 Lecture 3 - Schema Normalization

Lecture 11 - Chapter 8 Relational Database Design Part 1

CSIT5300: Advanced Database Systems

More Relational Algebra

Functional Dependencies & Normalization for Relational DBs. Truong Tuan Anh CSE-HCMUT

Part II: Using FD Theory to do Database Design

Techno India Batanagar Computer Science and Engineering. Model Questions. Subject Name: Database Management System Subject Code: CS 601

Theory of Normal Forms Decomposition of Relations. Overview

UNIT 3 DATABASE DESIGN

Chapter 10. Normalization. Chapter Outline. Chapter Outline(contd.)

CS411 Database Systems. 05: Relational Schema Design Ch , except and

CSCI 403: Databases 13 - Functional Dependencies and Normalization

Lecture 4. Database design IV. INDs and 4NF Design wrapup

CSE 562 Database Systems

Database Management Systems Paper Solution

Fundamentals of Database Systems

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2

Chapter 10. Chapter Outline. Chapter Outline. Functional Dependencies and Normalization for Relational Databases

Databases Tutorial. March,15,2012 Jing Chen Mcmaster University

Steps in normalisation. Steps in normalisation 7/15/2014

Database Management

Normalisation. Normalisation. Normalisation

Functional Dependencies and Normalization for Relational Databases Design & Analysis of Database Systems

Chapter 14 Outline. Normalization for Relational Databases: Outline. Chapter 14: Basics of Functional Dependencies and

Examination examples

Relational Database Design (II)

Chapter 14. Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies

SCHEMA REFINEMENT AND NORMAL FORMS

Functional Dependencies and Finding a Minimal Cover

DATABASE DESIGN I - 1DL300

CMU SCS CMU SCS CMU SCS CMU SCS whole nothing but

Informal Design Guidelines for Relational Databases

1. Considering functional dependency, one in which removal from some attributes must affect dependency is called

Functional Dependencies CS 1270

Database design III. Quiz time! Using FDs to detect anomalies. Decomposition. Decomposition. Boyce-Codd Normal Form 11/4/16

Schema And Draw The Dependency Diagram

A7-R3: INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

MODULE: 3 FUNCTIONAL DEPENDENCIES

Schema Refinement and Normal Forms

Relational Design: Characteristics of Well-designed DB

Database Design Theory and Normalization. CS 377: Database Systems

Home Page. Title Page. Page 1 of 14. Go Back. Full Screen. Close. Quit

CSIT5300: Advanced Database Systems

Redundancy:Dependencies between attributes within a relation cause redundancy.

DATABASE MANAGEMENT SYSTEMS

Administrivia. Physical Database Design. Review: Optimization Strategies. Review: Query Optimization. Review: Database Design

Functional Dependencies and. Databases. 1 Informal Design Guidelines for Relational Databases. 4 General Normal Form Definitions (For Multiple Keys)

Lecture 6a Design Theory and Normalization 2/2

Data Modeling with the Entity Relationship Model. CS157A Chris Pollett Sept. 7, 2005.

Physical Database Design and Tuning. Review - Normal Forms. Review: Normal Forms. Introduction. Understanding the Workload. Creating an ISUD Chart

Database Design and Tuning

COMP7640 Assignment 2

Applied Databases. Sebastian Maneth. Lecture 5 ER Model, normal forms. University of Edinburgh - January 25 th, 2016

Relational Design 1 / 34

Introduction to Database Design, fall 2011 IT University of Copenhagen. Normalization. Rasmus Pagh

Homework 3: Relational Database Design Theory (100 points)

CSE 544 Principles of Database Management Systems

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Overview - detailed. Goal. Faloutsos & Pavlo CMU SCS /615

Schema Refinement: Dependencies and Normal Forms

Midterm Exam (Version B) CS 122A Spring 2017

Schema Refinement: Dependencies and Normal Forms

ACS-2914 Normalization March 2009 NORMALIZATION 2. Ron McFadyen 1. Normalization 3. De-normalization 3

Database Normalization. (Olav Dæhli 2018)

We shall represent a relation as a table with columns and rows. Each column of the table has a name, or attribute. Each row is called a tuple.

MIDTERM EXAMINATION Spring 2010 CS403- Database Management Systems (Session - 4) Ref No: Time: 60 min Marks: 38

Schema Refinement: Dependencies and Normal Forms

Lecture #8 (Still More Relational Theory...!)

Combining schemas. Problems: redundancy, hard to update, possible NULLs

CS 2451 Database Systems: Database and Schema Design

Chapter 6: Relational Database Design

CSE 544 Principles of Database Management Systems

CS 338 Functional Dependencies

NORMAL FORMS. CS121: Relational Databases Fall 2017 Lecture 18

Bachelor in Information Technology (BIT) O Term-End Examination

Relational Database Design Theory. Introduction to Databases CompSci 316 Fall 2017

CS/B.Tech/CSE/New/SEM-6/CS-601/2013 DATABASE MANAGEMENENT SYSTEM. Time Allotted : 3 Hours Full Marks : 70

Databases The theory of relational database design Lectures for m

CS352 Lecture - Conceptual Relational Database Design

Relational Database design. Slides By: Shree Jaswal

DBMS Chapter Three IS304. Database Normalization-Comp.

CS352 Lecture - Conceptual Relational Database Design

Edited by: Nada Alhirabi. Normalization

6.830 Lecture PS1 Due Next Time (Tuesday!) Lab 1 Out end of week start early!

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 6 Normalization of Database Tables

UNIT -III. Two Marks. The main goal of normalization is to reduce redundant data. Normalization is based on functional dependencies.

Homework 6. Question Points Score Query Optimization 20 Functional Dependencies 20 Decompositions 30 Normal Forms 30 Total: 100

Introduction to Data Management. Lecture #17 (Physical DB Design!)

Physical Database Design and Tuning

Relational Model History. COSC 416 NoSQL Databases. Relational Model (Review) Relation Example. Relational Model Definitions. Relational Integrity

BIRKBECK (University of London)

The strategy for achieving a good design is to decompose a badly designed relation appropriately.

Transcription:

More Normalization Algorithms CS157A Chris Pollett Nov. 28, 2005.

Outline 3NF Decomposition Algorithms BCNF Algorithms Multivalued Dependencies and 4NF

Dependency-Preserving Decomposition into 3NF Input: A universal relation R and a set of FDs F on the attributes of R Output: D -- a decomposition of R Algorithm: 1. Find a minimal cover G for F. 2. For each left hand side X of an FD in G add a new relation schema Q in D with attributes {X {A 1 } {A 2 } {A k }}, where X--> A 1,, X-->A k are the only FDs in G with X as the left hand side. Make X the key of this schema. 3. Place any remaining attributes in a single relation to ensure the attribute preservation property. Notice Step 2 guarantees dependencies will be preserved. The reason why the decomposition will be in 3NF is due to the use of a minimal cover. (If X--> Z and Z-->A are in G then X-->A won t be in G.)

Lossless Join Decomposition into BCNF Input: A universal relation R and a set of FDs F on the attributes of R Output: D -- a decomposition of R Algorithm: Set D:={R}; While there is a relation schema Q in D that is not in BCNF { 1. Choose a relation Q in D not in BCNF 2. Rind a FD X-->Y in Q that violates BCNF 3. Replace Q in D by the two relation schemas (Q-Y) and (X Y) } Since relations only have finitely many attributes this algorithm terminates with all the relations in BCNF. (3) ensures that our binary decomposition test for LJP will be passed.

Dependency-Preserving Lossless Join Decomposition into 3NF Input: A universal relation R and a set of functional dependencies F on the attributes of R. Output: D -- a decomposition of R Algorithm: 1. Find a minimal cover G for F. 2. For each left hand side X of an FD in G add a new relation schema Q in D with attributes {X {A 1 } {A 2 } {A k }}, where X--> A 1,, X-->A k are the only FDs in G with X as the left hand side. Make X the key of this schema. 3. If none of the schemas in D contains a key of R, then create one more relation schema that contains attributes which form a key of R. Except for Step 3 this is like our earlier algorithm. The new step 3 still preserves attributes because a key needs to have all attributes not in any FD in G. Also, Step 3 in above ensures LJP.

Algorithm to Find a Key of R In order to do Step 3 above we need an algorithm which can find a key for R. Here is how to do this: Input: A universal relation R and a set of FDs F on the attributes of R. Output: K -- a key Algorithm: 1. Set K:= R 2. For each attribute A in K { } Check (K-A) + == R. If Yes set K := K-A;

Problems with Null Values and Dangling Tuples Our decomposition algorithms don t really address how to handle nulls in our relation schemas. To see why this is hard consider our original COMPANY schema. It is in BCNF. Suppose we have employees with null values for their DNO. If we natural join EMPLOYEE with DEPARTMENT, these employees disappear. On the other hand using left outer joins we get a new fictitious department with null for all its attributes. Similarly, SUM and AVERAGE over columns with nulls must be carefully considered. We might try to come up with a decomposition property which further split EMPLOYEE into a table EMP1 which does not have DNO and a table EMP2(SSN,DNO). If we do not store in EMP2 employees with a NULL value for DNO then we have gotten rid of all NULLS Unfortunately, if we natural join EMP1 and EMP2 we might not get all the rows we had in EMPLOYEE. The tuples which were lost were in only one of the two relations (EMP1) of the decomposition. These are called dangling tuples.

Discussion of Normalization Algorithms Some other issues which make it hard to totally rely on decomposition algorithms are: 1. It is hard to know in advance all the FDs that apply to the data. 2. The algorithms depend on certain nondeterministic choices which we have forced by how we ordered our FDs. For instance, there are usually many different minimal covers. We calculate one based on our initial ordering of our FDs. This might be suboptimal. Notice also, we can t in general decompose into BCNF and preserve dependencies. (Lots example last day is an example why not.)