Databases Lecture 7. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2009

Similar documents
Chapter 16. Relational Database Design Algorithms. Database Design Approaches. Top-Down Design

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 4 - Schema Normalization

Theory of Normal Forms Decomposition of Relations. Overview

Databases. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2010

Databases. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2010

CSE 562 Database Systems

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2009 Lecture 3 - Schema Normalization

Part II: Using FD Theory to do Database Design

More Normalization Algorithms. CS157A Chris Pollett Nov. 28, 2005.

Database design III. Quiz time! Using FDs to detect anomalies. Decomposition. Decomposition. Boyce-Codd Normal Form 11/4/16

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

CSIT5300: Advanced Database Systems

Relational Database Design (II)

Exercise 9: Normal Forms

Functional dependency theory

FUNCTIONAL DEPENDENCIES

Lectures 5 & 6. Lectures 6: Design Theory Part II

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Overview - detailed. Goal. Faloutsos & Pavlo CMU SCS /615

CS411 Database Systems. 05: Relational Schema Design Ch , except and

CMU SCS CMU SCS CMU SCS CMU SCS whole nothing but

Combining schemas. Problems: redundancy, hard to update, possible NULLs

Schema Normalization. 30 th August Submitted By: Saurabh Singla Rahul Bhatnagar

BCNF. Yufei Tao. Department of Computer Science and Engineering Chinese University of Hong Kong BCNF

The Relational Data Model

Relational design algorithms

Normalisation. Normalisation. Normalisation

Schema Refinement: Dependencies and Normal Forms

Review: Attribute closure

Schema Refinement & Normalization Theory 2. Week 15

Home Page. Title Page. Page 1 of 14. Go Back. Full Screen. Close. Quit

Homework 3: Relational Database Design Theory (100 points)

Case Study: Lufthansa Cargo Database

Databases Tutorial. March,15,2012 Jing Chen Mcmaster University

Database Systems. Basics of the Relational Data Model

Informal Design Guidelines for Relational Databases

Chapter 6: Relational Database Design

Functional Dependencies CS 1270

Databases. Ken Moody. Computer Laboratory University of Cambridge, UK. Lecture notes by Timothy G. Griffin. Lent 2012

Database Constraints and Design

Relational Design: Characteristics of Well-designed DB

Desired properties of decompositions

Databases -Normalization I. (GF Royle, N Spadaccini ) Databases - Normalization I 1 / 24

Schema Refinement: Dependencies and Normal Forms

NORMAL FORMS. CS121: Relational Databases Fall 2017 Lecture 18

Chapter 10. Normalization. Chapter Outline. Chapter Outline(contd.)

In This Lecture. Normalisation to BCNF. Lossless decomposition. Normalisation so Far. Relational algebra reminder: product

Databases tgg22. Computer Laboratory University of Cambridge, UK. Databases, Lent 2013

Acknowledgement: The slides were kindly lent by Doc. RNDr. Tomas Skopal, Ph.D., Department of Software Engineering, Charles University in Prague

Normalisation theory

UNIT 3 DATABASE DESIGN

Databases The theory of relational database design Lectures for m

Relational Database Design Theory. Introduction to Databases CompSci 316 Fall 2017

Introduction to Databases, Fall 2003 IT University of Copenhagen. Lecture 4: Normalization. September 16, Lecturer: Rasmus Pagh

Lecture 11 - Chapter 8 Relational Database Design Part 1

Databases Lectures 1 and 2

Chapter 10. Chapter Outline. Chapter Outline. Functional Dependencies and Normalization for Relational Databases

Functional Dependencies & Normalization for Relational DBs. Truong Tuan Anh CSE-HCMUT

Lecture 4. Database design IV. INDs and 4NF Design wrapup

Databases 2011 Lectures 01 03

Databases Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2015

Design Theory for Relational Databases

Normalization. Anomalies Functional Dependencies Closures Key Computation Projecting Relations BCNF Reconstructing Information Other Normal Forms

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

SCHEMA REFINEMENT AND NORMAL FORMS

Lectures 12: Design Theory I. 1. Normal forms & functional dependencies 2/19/2018. Today s Lecture. What you will learn about in this section

Introduction to Database Design, fall 2011 IT University of Copenhagen. Normalization. Rasmus Pagh

Chapter 14 Outline. Normalization for Relational Databases: Outline. Chapter 14: Basics of Functional Dependencies and

Chapter 7: Relational Database Design

Fundamentals of Database Systems

Homework 6: FDs, NFs and XML (due April 15 th, 2015, 4:00pm, hard-copy in-class please)

Database Design Theory and Normalization. CS 377: Database Systems

Lecture 6a Design Theory and Normalization 2/2

Homework 6: FDs, NFs and XML (due April 13 th, 2016, 4:00pm, hard-copy in-class please)

Database Management System Prof. Partha Pratim Das Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur

Schema Refinement: Dependencies and Normal Forms

Functional Dependencies and Normalization for Relational Databases Design & Analysis of Database Systems

Database Management

Chapter 8: Relational Database Design

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2

Chapter 14. Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies

COSC Dr. Ramon Lawrence. Emp Relation

Functional Dependencies and Finding a Minimal Cover

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization.

Relational Database Systems 1

CS 338 Functional Dependencies

Unit 3 : Relational Database Design

Normalization 03. CSE3421 notes

Normal Forms. Winter Lecture 19

Final Review. Zaki Malik November 20, 2008

Handout 3: Functional Dependencies and Normalization

Functional Dependencies and. Databases. 1 Informal Design Guidelines for Relational Databases. 4 General Normal Form Definitions (For Multiple Keys)

Homework 6. Question Points Score Query Optimization 20 Functional Dependencies 20 Decompositions 30 Normal Forms 30 Total: 100

Concepts from

Relational Database Design. Announcements. Database (schema) design. CPS 216 Advanced Database Systems. DB2 accounts have been set up

customer = (customer_id, _ customer_name, customer_street,

CSCI 403: Databases 13 - Functional Dependencies and Normalization

Ryan Marcotte CS 475 (Advanced Topics in Databases) March 14, 2011

Presentation on Functional Dependencies CS x265

Lecture #8 (Still More Relational Theory...!)

Database Management System

Transcription:

Databases Lecture 7 Timothy G. Griffin Computer Laboratory University of Cambridge, UK Databases, Lent 2009 T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 1 / 17

Lecture 07: Decomposition to Normal Forms Outline Attribute closure algorithm Schema decomposition methods Problems with obtaining both dependency preservation and lossless-join property T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 2 / 17

Closure By soundness and completeness closure(f, X) = {A F X A} = {A X A F + } Claim 2 (from previous lecture) Y W F + if and only if W closure(f, Y). If we had an algorithm for closure(f, X), then we would have a (brute force!) algorithm for enumerating F + : F + for every subset Y atts(f) for every subset Z closure(f, Y), output Y Z T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 3 / 17

Attribute Closure Algorithm Input : a set of FDs F and a set of attributes X. Output : Y = closure(f, X) 1 Y := X 2 while there is some S T F with S Y and T Y, then Y := Y T. T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 4 / 17

An Example (UW1997, Exercise 3.6.1) R(A, B, C, D) with F made up of the FDs What is F +? A, B C C D D A Brute force! Let s just consider all possible nonempty sets X there are only 15... T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 5 / 17

Example (cont.) For the single attributes we have {A} + = {A}, {B} + = {B}, {C} + = {A, C, D}, F = {A, B C, C D, D A} C D {C} = {C, D} D A = {A, C, D} {D} + = {A, D} {D} D A = {A, D} The only new dependency we get with a single attribute on the left is C A. T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 6 / 17

Example (cont.) F = {A, B C, C D, D A} Now consider pairs of attributes. {A, B} + = {A, B, C, D}, so A, B D is a new dependency {A, C} + = {A, C, D}, so A, C D is a new dependency {A, D} + = {A, D}, so nothing new. {B, C} + = {A, B, C, D}, so B, C A, D is a new dependency {B, D} + = {A, B, C, D}, so B, D A, C is a new dependency {C, D} + = {A, C, D}, so C, D A is a new dependency T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 7 / 17

Example (cont.) For the triples of attributes: F = {A, B C, C D, D A} {A, C, D} + = {A, C, D}, {A, B, D} + = {A, B, C, D}, so A, B, D C is a new dependency {A, B, C} + = {A, B, C, D}, so A, B, C D is a new dependency {B, C, D} + = {A, B, C, D}, so B, C, D A is a new dependency And since {A, B, C, D}+ = {A, B, C, D}, we get no new dependencies with four attributes. T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 8 / 17

Example (cont.) We generated 11 new FDs: C A A, B D A, C D B, C A B, C D B, D A B, D C C, D A A, B, C D A, B, D C B, C, D A Can you see the Key? {A, B}, {B, C}, and {B, D} are keys. Note: this schema is already in 3NF! Why? T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 9 / 17

General Decomposition Method (GDM) GDM 1 Understand your FDs F (compute F + ), 2 find R(X) = R(Z, W, Y) (sets Z, W and Y are disjoint) with FD Z W F + violating a condition of desired NF, 3 split R into two tables R 1 (Z, W) and R 2 (Z, Y) 4 wash, rinse, repeat Reminder For Z W, if we assume Z W = {}, then the conditions are 1 Z is a superkey for R (2NF, 3NF, BCNF) 2 W is a subset of some key (2NF, 3NF) 3 Z is not a proper subset of any key (2NF) T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 10 / 17

The lossless-join condition is guaranteed by GDM This method will produce a lossless-join decomposition because of (repeated applications of) Heath s Rule! That is, each time we replace an S by S 1 and S 2, we will always be able to recover S as S 1 S 2. Note that in GDM step 3, the FD Z W may represent a key constraint for R 1. But does the method always terminate? Please think about this... T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 11 / 17

Return to Example Decompose to BCNF R(A, B, C, D) F = {A, B C, C D, D A} Which FDs in F + violate BCNF? C A C D D A A, C D C, D A T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 12 / 17

Return to Example Decompose to BCNF Decompose R(A, B, C, D) to BCNF Use C D to obtain R 1 (C, D). This is in BCNF. Done. R 2 (A, B, C) This is not in BCNF. Why? A, B and B, C are the only keys, and C A is a FD for R 1. So use C A to obtain R2.1 (A, C). This is in BCNF. Done. R2.2 (B, C). This is in BCNF. Done. Exercise : Try starting with any of the other BCNF violations and see where you end up. T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 13 / 17

The GDM does not always preserve dependencies! R(A, B, C, D, E) A, B C D, E C B D {A, B} + = {A, B, C, D}, so A, B C, D, and {A, B, E} is a key. {B, E} + = {B, C, D, E}, so B, E C, D, and {A, B, E} is a key (again) Let s try for a BCNF decomposition... T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 14 / 17

Decomposition 1 Decompose R(A, B, C, D, E) using A, B C, D : R 1 (A, B, C, D). Decompose this using B D: R1.1 (B, D). Done. R1.2 (A, B, C). Done. R 2 (A, B, E). Done. But in this decomposition, how will we enforce this dependency? D, E C T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 15 / 17

Decomposition 2 Decompose R(A, B, C, D, E) using B, E C, D: R 3 (B, C, D, E). Decompose this using D, E C R3.1 (C, D, E). Done. R3.2 (B, D, E). Decompose this using B D: R 3.2.1 (B, D). Done. R 3.2.2 (B, E). Done. R 4 (A, B, E). Done. But in this decomposition, how will we enforce this dependency? A, B C T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 16 / 17

Summary It always is possible to obtain BCNF that has the lossless-join property (using GDM) But the result may not preserve all dependencies. It is always possible to obtain 3NF that preserves dependencies and has the lossless-join property. Using methods based on minimal covers (for example, see EN2000). T. Griffin (cl.cam.ac.uk) Databases Lecture 7 DB 2009 17 / 17