Functional dependency theory

Similar documents
Normalisation theory

Relational Database Design (II)

UNIT 3 DATABASE DESIGN

Chapter 8: Relational Database Design

Relational Design: Characteristics of Well-designed DB

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

Unit 3 : Relational Database Design

FUNCTIONAL DEPENDENCIES

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 4 - Schema Normalization

Databases Tutorial. March,15,2012 Jing Chen Mcmaster University

UNIT -III. Two Marks. The main goal of normalization is to reduce redundant data. Normalization is based on functional dependencies.

Relational Design Theory

Relational Database Design. Announcements. Database (schema) design. CPS 216 Advanced Database Systems. DB2 accounts have been set up

CS411 Database Systems. 05: Relational Schema Design Ch , except and

Functional Dependencies CS 1270

Lecture 11 - Chapter 8 Relational Database Design Part 1

Database Systems. Basics of the Relational Data Model

Chapter 10. Normalization. Chapter Outline. Chapter Outline(contd.)

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2009 Lecture 3 - Schema Normalization

CS352 Lecture - Conceptual Relational Database Design

The Relational Data Model

Combining schemas. Problems: redundancy, hard to update, possible NULLs

Database Management System Prof. Partha Pratim Das Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur

Relational Database Design Theory. Introduction to Databases CompSci 316 Fall 2017

Chapter 10. Chapter Outline. Chapter Outline. Functional Dependencies and Normalization for Relational Databases

CSE 562 Database Systems

COSC Dr. Ramon Lawrence. Emp Relation

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2

Informal Design Guidelines for Relational Databases

Announcements (January 20) Relational Database Design. Database (schema) design. Entity-relationship (E/R) model. ODL (Object Definition Language)

Chapter 14. Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies

Database Design Principles

Database Design Theory and Normalization. CS 377: Database Systems

CSIT5300: Advanced Database Systems

Chapter 7: Relational Database Design

Schema Refinement: Dependencies and Normal Forms

Introduction to Databases, Fall 2003 IT University of Copenhagen. Lecture 4: Normalization. September 16, Lecturer: Rasmus Pagh

CS352 Lecture - Conceptual Relational Database Design

Functional Dependencies and. Databases. 1 Informal Design Guidelines for Relational Databases. 4 General Normal Form Definitions (For Multiple Keys)

customer = (customer_id, _ customer_name, customer_street,

Functional Dependencies and Finding a Minimal Cover

Databases Lecture 7. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2009

Schema Refinement & Normalization Theory 2. Week 15

Schema Refinement: Dependencies and Normal Forms

Schema Normalization. 30 th August Submitted By: Saurabh Singla Rahul Bhatnagar

Design Theory for Relational Databases

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization.

NORMAL FORMS. CS121: Relational Databases Fall 2017 Lecture 18

In This Lecture. Normalisation to BCNF. Lossless decomposition. Normalisation so Far. Relational algebra reminder: product

Databases The theory of relational database design Lectures for m

Case Study: Lufthansa Cargo Database

Draw A Relational Schema And Diagram The Functional Dependencies In The Relation >>>CLICK HERE<<<

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Relational Database Systems 1

Theory of Normal Forms Decomposition of Relations. Overview

Relational Database design. Slides By: Shree Jaswal

Schema Refinement: Dependencies and Normal Forms

Databases -Normalization I. (GF Royle, N Spadaccini ) Databases - Normalization I 1 / 24

Lectures 12: Design Theory I. 1. Normal forms & functional dependencies 2/19/2018. Today s Lecture. What you will learn about in this section

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Overview - detailed. Goal. Faloutsos & Pavlo CMU SCS /615

Relational Design Theory. Relational Design Theory. Example. Example. A badly designed schema can result in several anomalies.

Lecture 6a Design Theory and Normalization 2/2

Chapter 6: Relational Database Design

V. Database Design CS448/ How to obtain a good relational database schema

Ryan Marcotte CS 475 (Advanced Topics in Databases) March 14, 2011

BCNF. Yufei Tao. Department of Computer Science and Engineering Chinese University of Hong Kong BCNF

Functional Dependencies & Normalization for Relational DBs. Truong Tuan Anh CSE-HCMUT

Database Management System

CMU SCS CMU SCS CMU SCS CMU SCS whole nothing but

Functional Dependencies

Normal Forms. Winter Lecture 19

Normalisation. Normalisation. Normalisation

Part II: Using FD Theory to do Database Design

Homework 6: FDs, NFs and XML (due April 15 th, 2015, 4:00pm, hard-copy in-class please)

Chapter 14 Outline. Normalization for Relational Databases: Outline. Chapter 14: Basics of Functional Dependencies and

Relational Database Systems 1. Christoph Lofi Simon Barthel Institut für Informationssysteme Technische Universität Braunschweig

Review: Attribute closure

Functional Dependencies and Normalization

Schema Refinement and Normal Forms

Exercise 9: Normal Forms

Functional Dependencies and Single Valued Normalization (Up to BCNF)

CS 2451 Database Systems: Database and Schema Design

Desirable database characteristics Database design, revisited

GUJARAT TECHNOLOGICAL UNIVERSITY

Relational Database Systems 1 Wolf-Tilo Balke Hermann Kroll, Janus Wawrzinek, Stephan Mennicke

Informationslogistik Unit 5: Data Integrity & Functional Dependency

MODULE: 3 FUNCTIONAL DEPENDENCIES

INTRODUCTION TO RELATIONAL DATABASE SYSTEMS

FINAL EXAM REVIEW. CS121: Introduction to Relational Database Systems Fall 2018 Lecture 27

Introduction to Database Design, fall 2011 IT University of Copenhagen. Normalization. Rasmus Pagh

Babu Banarasi Das National Institute of Technology and Management

Homework 6: FDs, NFs and XML (due April 13 th, 2016, 4:00pm, hard-copy in-class please)

Relational Design Theory

CS 461: Database Systems. Final Review. Julia Stoyanovich

Handout 3: Functional Dependencies and Normalization

The Relational Model and Normalization

Example Examination. Allocated Time: 100 minutes Maximum Points: 250

SCHEMA REFINEMENT AND NORMAL FORMS

Database Management Systems Paper Solution

Normalisation Chapter2 Contents

Transcription:

Functional dependency theory Introduction to Database Design 2012, Lecture 8 Course evaluation Recalling normal forms Functional dependency theory Computing closures of attribute sets BCNF decomposition Next time: - Canonical covers - 3NF decomposition Overview 2

Course evaluation Main points - More TAs - Exercises after lecture rather than before (why?) Question: Hand-in schedule Minor points - More examples (SQL was hard) - Recap at start of lecture - Correction of non-mandatory exercises - Exam on paper 3 Teaching cancelled April 16 Instead we continue one more week April 16 4

Normal forms Definition A legal instance of a database schema is an instance that does not break the rules of the real world Definition. A functional dependency (FD) α β holds if for all pairs of tuples t, u in any legal instance if t[α] = u[α] then t[β] = u[β] Here α,β denote sets of attributes Example: dept_name budget because if t[dept_name] = u[dept_name] then t[budget] = u[budget] 6

Boyce-Codd normal form (BCNF) A table r(r) is in BCNF if for all functional dependencies α β either - β α (α β is trivial) - or α is a superkey A schema is in BCNF if all tables are in BCNF A schema in BCNF does not allow for (most) redundancies Example of a non-bcnf schema: - instructor(id, name, salary, dept_name, building, budget) 7 Third normal form (3NF) A table r(r) is in 3NF if for all functional dependencies α β either - β α (α β is trivial) - or α is a superkey - or each A in β-α is part of a candidate key Any schema in BCNF is also in 3NF A schema is in 3NF if all tables are in 3NF Example: dept_advisor(s_id, i_id, dept_name) is 3NF: i_id dept_name s_id, dept_name i_id 8

Bill Kent: [Every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key (so help me Codd) The 3NF oath Explanation: - Think of an FD α A as A provides a fact about α - A key attribute is an attribute part of a candidate key 9 Motivation for 3NF Schemas in 3NF can contain some redundancies not found in schemas in BCNF Not always possible to reduce to BCNF in a dependency preserving way This means that 3NF schemas can be more efficient to work with See this weeks exercises for an illustration! 10

Functional-dependency theory Consider the following schema departure(fn, date, an) flight_time(fn, std, sta) airport(ac, an, city) destination(fn, ac) Motivating example - fn = flight number - an = airport name - ac = airport code - std, sta: scheduled time of departure / arrival Is this BCNF? 12

Primary keys as FDs departure(fn, date, an) flight_time(fn, std, sta) airport(ac, an, city) destination(fn, ac) None of these violate BCNF A logically implied FD: fn an This violates BCNF in departure Motivating example fn, date an fn std, sta ac an, city fn ac 13 Moral We need to take logically implied FDs into account when reasoning about BCNF, 3NF We need to know rules for constructing new logically implied FDs We need to know algorithms for finding all implied FDs On the other hand: - If there is only one table, then it suffices to inspect the given set of FDs 14

We use F to range over sets of FDs So F could be: Derived FDs Definition. An FD α β is logically implied if all instances satisfying F also satisfy α β fn, date an fn std, sta ac an, city fn ac Example. FD fn an is logically implied by the above F F+ is set of all FDs logically implied by F 15 Armstrong s axioms Axioms for deriving implied functional dependencies - Reflexivity: If β α then α β - Augmentation: If α β then α,γ β, γ - Transitivity: If α β and β γ then also α γ Exercise: Check that these are valid Theorem. All FDs logically implied by F can be derived using Armstrong s axioms Derived rule: (α β and α γ) if and only if α β, γ 16

By transitivity fn an, city By reflexivity an, city an By transitivity again fn an An example derivation fn, date an fn std, sta ac an, city fn ac 17 Closures of attribute sets Definition. Given attribute set α and set of FDs F, α + is the largest set of attributes such that α α + is logically implied by F Example. If F is Then i_id dept_name s_id, dept_name i_id (s_id, dept_name) + = (s_id, dept_name, i_id) (s_id, i_id) + = (s_id, dept_name, i_id) (i_id) + = (dept_name, i_id) 18

An algorithm for computing closures Computing α + : Set result = α Repeat - Find an FD β γ in F such that β α and γ not a subset of result - Set result = result γ Terminate when no such FD exists 19 Example Consider schema R(A,B,C,D) with dependencies AB C C D D A Compute all candidate keys Is R in BCNF? Is it in 3NF? (this is a typical exam exercise) 20

Consider schema R(A,B,C), S(C,D,E) with dependencies AB C C D D B Compute all candidate keys Are R, S in BCNF? Are they in 3NF? Example 21 BCNF normalisation

Not a BCNF Decomposition into BCNF Decompose to BCNF: - instructor(id, name, salary, dept_name) - department(dept_name, building, budget) 23 Compute F+ BCNF decomposition Repeat the following while the schema is not BCNF - Find a BCNF violation A1 A2... An B1 B2... Bm in schema R(α) - Decompose R into ((α-b1 B2... Bm) A1 A2... An) and (A1 A2... An B1 B2... Bm) 24

Decompose the relation With the functional dependencies Example cd_shop(cd_id, artist, title, order_id, order_date, quantity, customer_id, name, address) cd_id artist, title customer_id name, address order_id order_date, customer_id order_id, cd_id quantity 25 Non determinancy Much depends on the choice of BCNF violation Try e.g. decomposing first using There is no guarantee that decomposition is dependency preserving (even if there is a dependency preserving decomposition) order_id order_date, customer_id One heuristic is to maximise right hand sides of BCNF violations 26

Correctness Correctness: - Tables become smaller for every decomposition - Every 2-attribute table is BCNF - So in the end, the schema must be BCNF Every decomposition is lossless In fact if α β then decomposition of R(αβγ) into (αβ) and (αγ) is always lossless (book page 346) 27 Discussion BCNF algorithm suggests a new strategy to DB design: - Put everything in one table - Write up all FDs - Apply BCNF decomposition This is not a good strategy This can lead to terrible designs Without ER diagram, it is also harder to extend an existing DB design 28

Summary When checking BCNF of schemas with more than one table one must inspect also logically implied FDs Armstrong s axioms provide an algorithm for computing closures of attribute sets We can always decompose a relation to a BCNF schema 29 Intended learning outcomes After this lecture you should be able to - Compute closures of attribute sets - Judge if a design is BCNF or 3NF - Apply BCNF algorithm 30