Functional Dependencies and Finding a Minimal Cover

Similar documents
Chapter 10. Normalization. Chapter Outline. Chapter Outline(contd.)

Databases -Normalization I. (GF Royle, N Spadaccini ) Databases - Normalization I 1 / 24

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

Chapter 10. Chapter Outline. Chapter Outline. Functional Dependencies and Normalization for Relational Databases

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 4 - Schema Normalization

UNIT 3 DATABASE DESIGN

Relational Database Design Theory. Introduction to Databases CompSci 316 Fall 2017

Functional Dependencies and Normalization for Relational Databases Design & Analysis of Database Systems

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization.

Functional Dependencies and. Databases. 1 Informal Design Guidelines for Relational Databases. 4 General Normal Form Definitions (For Multiple Keys)

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2009 Lecture 3 - Schema Normalization

Chapter 8: Relational Database Design

CS411 Database Systems. 05: Relational Schema Design Ch , except and

Informal Design Guidelines for Relational Databases

Chapter 14. Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies

CSE 562 Database Systems

Review: Attribute closure

Relational Design: Characteristics of Well-designed DB

Schema Refinement and Normal Forms

Functional Dependencies CS 1270

Unit 3 : Relational Database Design

Design Theory for Relational Databases

Database Management System 15

Database Systems. Basics of the Relational Data Model

Functional Dependencies & Normalization for Relational DBs. Truong Tuan Anh CSE-HCMUT

CSCI 403: Databases 13 - Functional Dependencies and Normalization

Normalisation. Normalisation. Normalisation

Chapter 7: Relational Database Design

Schema Refinement: Dependencies and Normal Forms

Schema Refinement & Normalization Theory 2. Week 15

Lecture 6a Design Theory and Normalization 2/2

Fundamentals of Database Systems

Functional Dependencies and Normalization

Database Design Principles

Lecture 11 - Chapter 8 Relational Database Design Part 1

Relational Database Design (II)

CS 2451 Database Systems: Database and Schema Design

Announcements (January 20) Relational Database Design. Database (schema) design. Entity-relationship (E/R) model. ODL (Object Definition Language)

CSIT5300: Advanced Database Systems

MODULE: 3 FUNCTIONAL DEPENDENCIES

COSC Dr. Ramon Lawrence. Emp Relation

Part II: Using FD Theory to do Database Design

Database Management System

Schema Refinement: Dependencies and Normal Forms

Relational Database design. Slides By: Shree Jaswal

Database Constraints and Design

Database Design Theory and Normalization. CS 377: Database Systems

Schema Refinement: Dependencies and Normal Forms

Relational Database Design. Announcements. Database (schema) design. CPS 216 Advanced Database Systems. DB2 accounts have been set up

UNIT -III. Two Marks. The main goal of normalization is to reduce redundant data. Normalization is based on functional dependencies.

Design Theory for Relational Databases

The Relational Data Model

Normalization is based on the concept of functional dependency. A functional dependency is a type of relationship between attributes.

customer = (customer_id, _ customer_name, customer_street,

Typical relationship between entities is ((a,b),(c,d) ) is best represented by one table RS (a,b,c,d)

Database Management System Prof. Partha Pratim Das Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

FUNCTIONAL DEPENDENCIES

Redundancy:Dependencies between attributes within a relation cause redundancy.

Theory of Normal Forms Decomposition of Relations. Overview

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Overview - detailed. Goal. Faloutsos & Pavlo CMU SCS /615

Normalization 03. CSE3421 notes

Database design III. Quiz time! Using FDs to detect anomalies. Decomposition. Decomposition. Boyce-Codd Normal Form 11/4/16

TDDD12 Databasteknik Föreläsning 4: Normalisering

Lectures 12: Design Theory I. 1. Normal forms & functional dependencies 2/19/2018. Today s Lecture. What you will learn about in this section

Desirable database characteristics Database design, revisited

Database Management Systems Paper Solution

To overcome these anomalies we need to normalize the data. In the next section we will discuss about normalization.

Chapter 16. Relational Database Design Algorithms. Database Design Approaches. Top-Down Design

Normalization. Anomalies Functional Dependencies Closures Key Computation Projecting Relations BCNF Reconstructing Information Other Normal Forms

Functional dependency theory

BCNF. Yufei Tao. Department of Computer Science and Engineering Chinese University of Hong Kong BCNF

Chapter 14 Outline. Normalization for Relational Databases: Outline. Chapter 14: Basics of Functional Dependencies and

Logical Database Design Normalization

Functional Dependencies and Single Valued Normalization (Up to BCNF)

CMU SCS CMU SCS CMU SCS CMU SCS whole nothing but

V. Database Design CS448/ How to obtain a good relational database schema

Chapter 14. Chapter 14 - Objectives. Purpose of Normalization. Purpose of Normalization

Chapter 6: Relational Database Design

Homework 6: FDs, NFs and XML (due April 13 th, 2016, 4:00pm, hard-copy in-class please)

Unit- III (Functional dependencies and Normalization, Relational Data Model and Relational Algebra)

COMP7640 Assignment 2

Databases The theory of relational database design Lectures for m

Lectures 5 & 6. Lectures 6: Design Theory Part II

Mapping ER Diagrams to. Relations (Cont d) Mapping ER Diagrams to. Exercise. Relations. Mapping ER Diagrams to Relations (Cont d) Exercise

NORMAL FORMS. CS121: Relational Databases Fall 2017 Lecture 18

Ryan Marcotte CS 475 (Advanced Topics in Databases) March 14, 2011

SCHEMA REFINEMENT AND NORMAL FORMS

Lecture 5 Design Theory and Normalization

CS352 Lecture - Conceptual Relational Database Design

CS352 Lecture - Conceptual Relational Database Design

Assignment 2 Solutions

SCHEMA REFINEMENT AND NORMAL FORMS

Part V Relational Database Design Theory

Relational Design Theory. Relational Design Theory. Example. Example. A badly designed schema can result in several anomalies.

5 Normalization:Quality of relational designs

The Relational Data Model. Functional Dependencies. Example. Functional Dependencies

Functional Dependencies

Introduction to Databases, Fall 2003 IT University of Copenhagen. Lecture 4: Normalization. September 16, Lecturer: Rasmus Pagh

Transcription:

Functional Dependencies and Finding a Minimal Cover Robert Soulé 1 Normalization An anomaly occurs in a database when you can update, insert, or delete data, and get undesired side-effects. These side side-effects include inconsistent, redundant, or missing data. Database normalization is the process of organizing the data into tables in such a way as to remove anomalies. To understand how to normalize a database, we defined four types of data dependencies: From the full key: From the full primary key to outside the key Partial dependency: From part of the primary key to outside the key Transitive dependency: From outside the key to outside the key Into key dependency: From outside the key into the key We then discussed how to removed these dependencies informally, and defined several normal forms based on the definition of these dependencies: First Normal Form (1NF): there are a fixed number of columns. Second Normal Form (2NF): 1NF and no partial dependencies Third Normal Form (3NF) 2NF and no transitive dependencies. Boyce-Codd Normal Form (BCNF) 1NF and all dependencies from full key. 1

2 Motivation Recall the example database in class Pay(Employee, Grade, Salary), in which the value of Salary was dependent on the value of Grade, and Employee was the primary key. In other words, we had a from the full key dependency E GS and a transitive dependency G S. To normalize the database, we eliminated the transitive dependency by splitting the Pay(Employee, Grade, Salary) table into two separate tables: Pay(Employee, Grade). and Rate(Grade, Salary). This approach works, but it becomes difficult to do as the database becomes larger and more complex, and the number of functional dependencies increases. Instead of starting with a table and re-organizing it, we are going to start with the functional dependencies and synthesize the correct tables. 3 Small Example In the Pay(Employee, Grade, Salary) example, we had two dependencies dependencies: E GS and G S. These might seem simple, because the example is small, but they are not actually as simple as they could be. Notice that if E GS, then of course, E G. In other words, if E forces a relation to be equal on G and S, then of course it forces the relation to be equal on just G. Without going into detail, we notice that we could re-write our dependencies as: E G E S G S Then, (just take my word for it for now), we could further simplify the dependencies to be: E G G S This is the simplest way that we could write the dependencies, while still preserving the same information. This form is called a minimal cover, and corresponds exactly to the normalized tables that we want. 2

4 Functional Dependencies A functional dependency is a constraint between two sets of attributes in a relation. For a given relation R, if the set of attributes A 1,, A n, functionally determine the set of attributes B 1,, B n, then it means that if two tuples that have the same values for all the attributes A 1,, A n, then they must also have the same values for all the attributes B 1,, B n. This is written: A 1,, A n B 1,, B n Note that functional dependencies generalize the concept of a key. 5 Strength of Functional Dependencies If a relation satisfies the dependency A BC, then it must also satisfy the dependency A B. Informally, if A forces a relation to be equal on B and C, then of course it forces the relation to be equal on just B. That is, A BC A B. If a functional dependency F impies a functional dependency H, we say that F is stronger than H. Be careful, though. It is easy to make mistakes. For example, if we are given the functional dependency AB C, it might be tempting to conclude that A B and A C. However, this decomposition is incorrect, since neither A nor B alone determine C. In other words, AB C A C, and AB C is weaker than A C. 6 Properties of Functional Dependencies There are often several ways to write the same functional dependencies. Two sets of functional dependencies, S and T, are equivalent if the same set of relation instances that satisfy S also satisfy T, and vice versa. There are several useful rules that let you replace one set of functional dependencies with an equivalent set. Some of those rules are as follows: Reflexivity: If Y X, then X Y Augmentation: If X Y, then XZ Y Z Transitivity: If X Y and Y Z, then X Z Union: If X Y and X Z, then X Y Z Decomposition: If X Y Z, then X Y and X Z 3

Pseudotransitivity: If X Y and W Y Z, then W X Z Composition: If X Y and Z W, then XZ Y W These rules are a slightly more formal way of capturing the idea of relative strength of functional dependencies. Note that a trivial functional dependency is one in thich the constraint holds for any instance of the relation. For example, X X always holds. 7 Closure Given a set of functional dependencies F, and a set of attributes X, the closure of X with respect to F, written X + F, is the set of all the attributes whose values are determined by the values of X because of F. Often, if F is understood, then the closure of X is written X +. For example, given the following set, M, of functional dependencies: 1. A B 2. B C 3. BC D Then we can compute the closure of A with respect to M in the following way: i A A ( by reflexivity rule ) ii A AB ( by (i) and 1 ) iii A ABC ( by (ii), 2, and transitivity rule ) iv A ABCD ( by (iii), and 3 ) Therefore, A + M = ABCD. 8 Cover We say that a set of functional dependencies F covers another set of functional dependencies G, if every functional dependency in G can be inferred from F. A minimal cover is the smallest set of functional dependencies that covers G. We won t prove it, but every set of functional dependencies has a minimal cover. Also, note that there may be more than one minimal cover. We find the minimal cover by iteratively simplifying the set of functional dependencies. To do this, we will use three methods: 4

Simplifying an FD by the Union Rule Let X, Y, and Z be sets of attributes. If X Y and X Z, then X Y Z Simplifying an FD by simplifying the left-hand side be sets of attributes, and B be a single attribute not in X. Let F be: Let X and Y XB Y and H be X Y If F X Y, then we can replace F by H. In other words, if Y X + F, then we can replace F by H. For example, say we have the following functional dependencies (F ): AB C A B And we want to know if we can simplify to the following (H): A C A B Then A + F = ABC. Since, Y X+ F, we can replace F by H. Simplifying an FD by simplifying the right-hand side Y be sets of attributes, and C be a single attribute not in Y. Let F be: Let X and X Y C and H be X Y 5

If H X Y C, then we can replace F by H. In other words, if Y C X + H, then we can replace F by H. For example, say we have the following functional dependencies (F ): A BC B C And we want to know if we can simplify to the following (H): A B B C Then A + H = ABC. Since, BC X+ H, we can replace F by H. 9 Finding the Minimal Cover Given a set of functional dependencies F : 1. Start with F 2. Remove all trivial functional dependencies 3. Repeatedly apply (in whatever order you like), until no changes are possible Union Simplification (it is better to do it as soon as possible, whenever possible) RHS Simplification LHS Simplification 4. Result is the minimal cover Small example Applying to algorithm to EGS with 1. E G 2. G S 3. E S Using the union rule, we combine 1 and 3 and get 6

1. E GS 2. G S Simplifying RHS of 1 (this is the only attribute we can remove), we get: 1. E G 2. G S Done! 10 An algorithm for (almost) 3NF Lossless-Join Decomposition We can now use the following algorithm to compute our 3NF Lossless-Join Decomposition: 1. Compute F m, the minimal cover for F 2. For each X Y in F m, create a new relation schema XY 3. For every relation schema that is a subset of some other relation schema, remove the smaller one. 4. The set of the remaining relation schemas is an almost final decomposition This algorithm is almost correct, because it may be possible to compute a set of relations that do not contain the key of the original relation. If that is the case, you need to add a relation whose attributes form such a key. For example, consider the relation R(A,B) that has no functional dependencies. The relation has only one key AB, but our algorithm, without the key, would produce no relations. So, we need to add the relation R(A,B). It is easy to check if your relation contains the key of the original relation. Simply compute the closure of the relation with respect to all the functional dependencies and see if you get all the attributes in the original relation. 7