Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

Similar documents
Functional Dependencies and Normalization for Relational Databases Design & Analysis of Database Systems

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 4 - Schema Normalization

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Chapter 10. Normalization. Chapter Outline. Chapter Outline(contd.)

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2

Chapter 10. Chapter Outline. Chapter Outline. Functional Dependencies and Normalization for Relational Databases

CSE 562 Database Systems

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2009 Lecture 3 - Schema Normalization

Functional Dependencies and. Databases. 1 Informal Design Guidelines for Relational Databases. 4 General Normal Form Definitions (For Multiple Keys)

Functional Dependencies & Normalization for Relational DBs. Truong Tuan Anh CSE-HCMUT

UNIT 3 DATABASE DESIGN

Chapter 14. Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies

CS 338 Functional Dependencies

COSC Dr. Ramon Lawrence. Emp Relation

Relational Design 1 / 34

Database Design Theory and Normalization. CS 377: Database Systems

Informal Design Guidelines for Relational Databases

Relational Database design. Slides By: Shree Jaswal

Schema Refinement: Dependencies and Normal Forms

Schema Refinement: Dependencies and Normal Forms

The Relational Data Model

Schema Refinement: Dependencies and Normal Forms

We shall represent a relation as a table with columns and rows. Each column of the table has a name, or attribute. Each row is called a tuple.

MODULE: 3 FUNCTIONAL DEPENDENCIES

Functional Dependencies and Finding a Minimal Cover

CS411 Database Systems. 05: Relational Schema Design Ch , except and

V. Database Design CS448/ How to obtain a good relational database schema

Schema Refinement and Normal Forms

Unit 3 : Relational Database Design

Relational Database Design (II)

Chapter 14 Outline. Normalization for Relational Databases: Outline. Chapter 14: Basics of Functional Dependencies and

Theory of Normal Forms Decomposition of Relations. Overview

CS 2451 Database Systems: Database and Schema Design

Relational Design: Characteristics of Well-designed DB

Functional Dependencies CS 1270

Chapter 8: Relational Database Design

The strategy for achieving a good design is to decompose a badly designed relation appropriately.

Chapter 7: Relational Database Design

Part II: Using FD Theory to do Database Design

FUNCTIONAL DEPENDENCIES

Lecture 5 Design Theory and Normalization

Schema Normalization. 30 th August Submitted By: Saurabh Singla Rahul Bhatnagar

Database Systems. Basics of the Relational Data Model

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization.

Database Management System Prof. Partha Pratim Das Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur

FUNCTIONAL DEPENDENCIES CHAPTER , 15.5 (6/E) CHAPTER , 10.5 (5/E)

Databases -Normalization I. (GF Royle, N Spadaccini ) Databases - Normalization I 1 / 24

Design Theory for Relational Databases

Mapping ER Diagrams to. Relations (Cont d) Mapping ER Diagrams to. Exercise. Relations. Mapping ER Diagrams to Relations (Cont d) Exercise

Chapter 6: Relational Database Design

Lecture 11 - Chapter 8 Relational Database Design Part 1

Normalisation. Normalisation. Normalisation

CSCI 403: Databases 13 - Functional Dependencies and Normalization

CMU SCS CMU SCS CMU SCS CMU SCS whole nothing but

Database Design Principles

Unit IV. S_id S_Name S_Address Subject_opted

CSIT5300: Advanced Database Systems

DBMS Chapter Three IS304. Database Normalization-Comp.

Chapter 16. Relational Database Design Algorithms. Database Design Approaches. Top-Down Design

Dr. Anis Koubaa. Advanced Databases SE487. Prince Sultan University

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Overview - detailed. Goal. Faloutsos & Pavlo CMU SCS /615

Redundancy:Dependencies between attributes within a relation cause redundancy.

Fundamentals of Database Systems

Functional Dependencies and Normalization

Functional dependency theory

Lecture 4. Database design IV. INDs and 4NF Design wrapup

Edited by: Nada Alhirabi. Normalization

Relational Database Design Theory. Introduction to Databases CompSci 316 Fall 2017

Databases The theory of relational database design Lectures for m

Functional Dependencies, Normalization. Rose-Hulman Institute of Technology Curt Clifton

More Normalization Algorithms. CS157A Chris Pollett Nov. 28, 2005.

Functional Dependencies and Single Valued Normalization (Up to BCNF)

Database Management System

Lectures 12: Design Theory I. 1. Normal forms & functional dependencies 2/19/2018. Today s Lecture. What you will learn about in this section

Relational Design Theory. Relational Design Theory. Example. Example. A badly designed schema can result in several anomalies.

MIDTERM EXAMINATION Spring 2010 CS403- Database Management Systems (Session - 4) Ref No: Time: 60 min Marks: 38

TDDD12 Databasteknik Föreläsning 4: Normalisering

Steps in normalisation. Steps in normalisation 7/15/2014

Database Normalization. (Olav Dæhli 2018)

BCNF. Yufei Tao. Department of Computer Science and Engineering Chinese University of Hong Kong BCNF

customer = (customer_id, _ customer_name, customer_street,

Birkbeck. (University of London) BSc/FD EXAMINATION. Department of Computer Science and Information Systems. Database Management (COIY028H6)

Database design III. Quiz time! Using FDs to detect anomalies. Decomposition. Decomposition. Boyce-Codd Normal Form 11/4/16

IS 263 Database Concepts

In This Lecture. Normalisation to BCNF. Lossless decomposition. Normalisation so Far. Relational algebra reminder: product

Desirable database characteristics Database design, revisited

NORMALISATION (Relational Database Schema Design Revisited)

SCHEMA REFINEMENT AND NORMAL FORMS

Concepts from

CS403- Database Management Systems Solved Objective Midterm Papers For Preparation of Midterm Exam

Lecture 6a Design Theory and Normalization 2/2

CS403- Database Management Systems Solved MCQS From Midterm Papers. CS403- Database Management Systems MIDTERM EXAMINATION - Spring 2010

18. Relational Database Design

DATABASE MANAGEMENT SYSTEMS

Case Study: Lufthansa Cargo Database

DATABASE DESIGN I - 1DL300

UNIT -III. Two Marks. The main goal of normalization is to reduce redundant data. Normalization is based on functional dependencies.

ADVANCED DATABASES ; Spring 2015 Prof. Sang-goo Lee (11:00pm: Mon & Wed: Room ) Advanced DB Copyright by S.-g.

DATABASE DESIGN I - 1DL300

Introduction to Databases, Fall 2003 IT University of Copenhagen. Lecture 4: Normalization. September 16, Lecturer: Rasmus Pagh

Transcription:

1 Normalization What and Why Normalization? To remove potential redundancy in design Redundancy causes several anomalies: insert, delete and update Normalization uses concept of dependencies Functional Dependencies Idea used: Decomposition Break R (A, B, C, D) into R1 (A, B) and R2 (B, C, D) Use decomposition judiciously.

2 Insert Anomaly s1 s2 sname pnumber p1 Greg p2 ER Note: We cannot insert a professor who has no students. Insert Anomaly: We are not able to insert valid value/(s) Delete Anomaly sname pnumber s1 p1 s2 Greg p2 ER Note: We cannot delete a student that is the only student of a professor. Delete Anomaly: We are not able to perform a delete without losing some valid information. Note: In both cases, minimum cardinality of Professor in the corresponding ER schema is 0

3 Update Anomaly sname pnumber s1 p1 s2 Greg p1 Note: To update the name of a professor, we have to update in multiple tuples. Update anomalies are due to redundancy. Update Anomaly: To update a value, we have to update multiple rows. Note the maximum cardinality of Professor in the corresponding ER schema is * pnumber (1,1) Has (0,*) Advisor Professor sname years Keys : Revisited A key for a relation R (a1, a2,, an) is a set of attributes, K that uniquely determine the values for all attributes of R. A key K is minimal: no subset of K is a key. A superkey need not be minimal Prime Attribute: An attribute of a key

4 Keys: Example 1 2 sname Greg address 144FL 320FL Primary Key: <> Candidate key: <sname> Some superkeys: {<, address>, <sname>, <>, <, sname>, <, sname, address>} Prime Attribute: {, sname} Functional Dependencies (FDs) 1 2 sname Greg address 144FL 320FL Suppose we have the FD sname address for any two rows in the relation with the same value for sname, the value for address must be the same i.e., there is a function from sname to address Note: We will assume no null values. Any key (primary or candidate) or superkey of a relation R functionally determines all attributes of R

5 Properties of FDs Consider A, B, C, Z are sets of attributes Reflexive (also called trivial FD): if A B, then A B Transitive: if A B, and B C, then A C Augmentation: if A B, then AZ BZ Union: if A B, A C, then A BC Decomposition: if A BC, then A B, A C Inferring FDs Why? Suppose we have a relation R (A, B, C) and we have functional dependencies A B, B C, C A what is a key for R? Should we split R into multiple relations? We can infer A ABC, B ABC, C ABC. Hence A, B, C are all keys.

6 Algorithm for inference of FDs Computing the closure of set of attributes {A1, A2,, An}, denoted {A1, A2,, An} + 1. Let X = {A1, A2,, An} 2. If there exists a FD B1, B2,, Bm C, such that every Bi X, then X = X C 3. Repeat step 2 till no more attributes can be added. 4. {A1, A2,, An} + = X Inferring FDs: Example 1 Given R (A, B, C), and FDs A B, B C, C A, what are possible keys for R Compute the closure of attributes: {A} + = {A, B, C} {B} + = {A, B, C} {C} + = {A, B, C} So keys for R are <A>, <B>, <C>

7 Inferring FDs: Example 2 Consider R (A, B, C, D, E) with FDs A B, B C, CD E, does A E? Let us compute {A} + {A} + = {A, B, C} Therefore A E is false Decomposing Relations s1 s2 Prof sname pnumber p1 Greg p2 FDs: pnumber Professor s1 s2 sname Greg pnumber p1 p2 pnumber p1 p2

8 Decomposition: Lossless Join Property Generating spurious tuples Professor sname pnumber S1 p1 S2 Greg p2 Prof sname pnumber s1 p1 s1 p2 s2 Greg p1 s2 Greg p2 Normalization Step Consider relation R with set of attributes A R. Consider a FD A B (such that no other attribute in (A R A B) is functionally determined by A). If A is not a superkey for R, we may decompose R as: Create R (A R B) Create R with attributes A B Key for R = A

9 Normal Forms: BCNF Boyce Codd Normal Form (BCNF): For every non-trivial FD X a in R, X is a superkey of R. BCNF example SCI (student, course, instructor) FDs: student, course instructor instructor course Decomposition: SI (student, instructor) Instructor (instructor, course)

10 Dependency Preservation We might want to ensure that all specified FDs are captured. BCNF does not necessarily preserve FDs. 2NF, 3NF preserve FDs. SI Instructor student instructor instructor course DB 1 ER ER DB 1 SCI (from SI and Instructor) student instructor ER course DB 1 DB 1 SCI violates the FD student, course instructor Normal Forms: 3NF Third Normal Form (3NF): For every nontrivial FD X a in R, either a is a prime attribute or X is a superkey of R.

11 3NF - example Lot (propno, county, lotnum, area, price, taxrate) Candidate key: <county, lotnum> FDs: county taxrate area price Decomposition: Lot (propno, county, lotnum, area, price) County (county, taxrate) 3NF - example Lot (propno, county, lotnum, area, price) County (county, taxrate) Candidate key for Lot: <county, lotnum> FDs: county taxrate area price Decomposition: Lot (propno, county, lotnum, area) County (county, taxrate) Area (area, price)

12 Extreme Example Consider relation R (A, B, C, D) with primary key (A, B, C), and FDs B D, and C D. R violates 3NF. Decomposing it, we get 3 relations as: R1 (A, B, C), R2 (B, D), R3 (C, D) Let us consider an instance where we need these 3 relations and how we do a natural join A a1 a2 a3 R1 B b1 b2 b1 C c1 c1 c2 R2 B D b1 d1 b2 d2 R3 C D R1 R2: violates C D A B C D a1 b1 c1 d1 a2 b2 c1 d2 a3 b1 c2 d1 R1 R2 R3: no FD is violated c1 d1 A B C D c2 d2 a1 b1 c1 d1 Define Foreign Keys? Consider the normalization step: A relation R with set of attributes A R, and a FD A B, and A is not a key for R, we decompose R as: Create R (A R B) Create R with attributes A B Key for R = A We can also define foreign key R (A) references R (A) Question: What is key for R?

13 How does Normalization Help? ssn name lot did dname Employee (ssn, name, lot, dept) Dept (did, dname) FK: Employee (dept) REFERENCES Dept (did) Employee Works (1, 1) For (0, *) Dept Suppose: employees of a dept are in the same lot FD: dept lot ssn name Employee did (1, 1) Works For (0, *) lot dname Dept Decomposing: Employee (ssn, name, dept) Dept (did, dname) DeptLot (dept, lot) (or) Employee (ssn, name, dept) Dept (did, dname, lot)