Functional Dependencies

Redundancy in Database Design
- A table: Students-take-courses(stud-id, name, address, phone, crs-id, instructor-name, office)
  - Students(stud-id, name, address, phone, ...)
  - Instructors(name, office, ...)
- Redundant information
  - If a student takes 20 courses, her/his name, address, and phone number have to be repeated 20 times
  - If an instructor teaches 2 courses with 120 students in total, her/his office number is repeated 120 times
CMPT 354: Database I -- Functional Dependencies

Why Could Redundancy Be Bad?
- Space cost
- Maintenance overhead
  - If a student updates her/his address, 20 records need to be updated
  - If an instructor moves to a new office, 120 records need to be updated
  - What if an inconsistency arises during the update?
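The inconsistency risk can be illustrated with a minimal sketch. The rows, names, addresses, and course ids below are hypothetical, and the "partial update" simply stands in for any update that fails to reach all 20 copies.

```python
# Hypothetical rows of Students-take-courses: (stud-id, name, address, crs-id).
# One student taking 20 courses means the address is stored 20 times.
rows = [(1, "Ann", "12 Oak St", f"C{n}") for n in range(20)]

# A partial update: only the first 15 rows receive the new address.
updated = [
    (sid, name, "9 Elm St", crs) if i < 15 else (sid, name, addr, crs)
    for i, (sid, name, addr, crs) in enumerate(rows)
]

# The table now records two different addresses for the same student.
addresses = {addr for (sid, _name, addr, _crs) in updated if sid == 1}
print(len(addresses))
```

With the address stored once in a separate Students table, the same update would touch a single row and could not leave the data in this state.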

Why Could Redundancy Be Good?
- Students-take-courses(stud-id, name, crs-id)
  - The student name is redundant if we have the table Students(stud-id, name, address, phone, ...)
  - We only need Students-take-courses(stud-id, crs-id)
- What if we often need to generate class rosters?
  - Fast query answering: avoid joining the two tables many times

Requirements of Good Design
- Correctness: no information loss
  - Must be guaranteed
- Efficiency
  - Minimum (or as little as possible) redundant (repeated) information
  - Good performance with respect to the (expected) typical workload
  - May have to trade off between space and query answering time
  - Redundant information may help query answering

Atomic Domains
- A domain is atomic if its elements are considered to be indivisible units
  - Example: a course-id consisting of a department code and a course number, e.g., CMPT 354
  - Bad examples: all accounts of a customer, all owners of an account
- Non-atomic values complicate storage and query answering, and encourage redundant (repeated) storage of data
  - Storage and redundancy: a set of accounts stored with each customer, and a set of owners stored with each account

First Normal Form
- Normal form: a quality criterion that the database design should meet
- A relational schema R is in first normal form if the domains of all attributes of R are atomic
  - All relations are assumed to be in first normal form
- Atomicity is a property of how the elements of the domain are used
  - Strings would normally be considered indivisible
  - Course-id is not atomic, since two pieces of information (department code and course number) are encoded in it

Combine Schemas?
- Combine borrower and loan to get bor_loan = (customer_id, loan_number, amount)
- The result is possible repetition of information (loan L-100 in the slide's example)

Why Decomposition?
- Suppose we had started with bor_loan. How would we know to split it up (decompose it) into borrower and loan?
- Write a rule: "if there were a schema (loan_number, amount), then loan_number would be a candidate key"

Why Decomposition?
- Denote the rule as a functional dependency: loan_number → amount
- In bor_loan, because loan_number is not a candidate key, the amount of a loan may have to be repeated
- This indicates the need to decompose bor_loan

Combined Schema w/o Repetition
- Consider combining loan_branch and loan: loan_amt_br = (loan_number, amount, branch_name)
- No repetition

Decomposition Is Not Always Good
- Suppose we decompose employee into
    employee1 = (employee_id, employee_name)
    employee2 = (employee_name, telephone_number, start_date)
- We cannot reconstruct the original employee relation if two employees have the same name

A Lossy Decomposition
- Getting more tuples after rejoining the tables counts as a loss of information, not a gain: we can no longer tell which tuples belonged to the original relation
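The lossy decomposition of employee can be reproduced with a short sketch. The two employee tuples are hypothetical data chosen so that both share one name. The schemas employee1 and employee2 are the ones from the slides.

```python
# Hypothetical employee relation:
# (employee_id, employee_name, telephone_number, start_date).
# Two different employees share the name "Kim".
employee = {
    (1, "Kim", "555-0100", "2020-01-15"),
    (2, "Kim", "555-0199", "2021-06-01"),
}

# Project onto the two schemas of the decomposition.
employee1 = {(eid, name) for (eid, name, _tel, _start) in employee}
employee2 = {(name, tel, start) for (_eid, name, tel, start) in employee}

# Natural join on the only shared attribute, employee_name.
rejoined = {
    (eid, name1, tel, start)
    for (eid, name1) in employee1
    for (name2, tel, start) in employee2
    if name1 == name2
}

# Tuples that appear after the join but were never in the original relation.
spurious = rejoined - employee
print(len(employee), len(rejoined), len(spurious))  # 2 4 2
```

The join pairs each id with both phone numbers, so the original assignment of phone numbers to employees cannot be recovered.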

Designing by Decomposition
- Start from a wide table, the universal table, containing all pieces of information
- Decide whether a particular relation R is in good form
- If a relation R is not in good form, decompose it into a set of relations {R1, R2, ..., Rn} such that
  - Each relation is in good form
  - The decomposition does not lose information

Functional Dependencies
- Constraints on the set of legal relations
- Require that the value for a certain set of attributes uniquely determines the value for another set of attributes
- A functional dependency is a generalization of the notion of a key

Functional Dependencies
- Let R be a relation schema, α ⊆ R and β ⊆ R
- The functional dependency α → β holds on R if and only if, for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β:
    t1[α] = t2[α]  ⇒  t1[β] = t2[β]

Example
- Consider r(A, B) with the following instance of r:
    A  B
    1  4
    1  5
    3  7
- On this instance, A → B does NOT hold, but B → A does hold
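The definition can be checked mechanically on a single instance. The sketch below is one possible way to do it. The function name satisfies_fd and the dictionary-per-tuple representation are assumptions made for illustration, while the three tuples are the instance from the example.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}  # maps each lhs value to the first rhs value seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores value on first sight, otherwise returns the stored one.
        if seen.setdefault(key, value) != value:
            return False
    return True


r = [{"A": 1, "B": 4}, {"A": 1, "B": 5}, {"A": 3, "B": 7}]
print(satisfies_fd(r, ["A"], ["B"]))  # False: A = 1 appears with B = 4 and B = 5
print(satisfies_fd(r, ["B"], ["A"]))  # True: every B value is unique
```

Such a check only says whether this particular instance satisfies the dependency. Whether the dependency holds on the schema is a design decision about all legal instances.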

Superkeys and Candidate Keys
- K is a superkey for relation schema R if and only if K → R
- K is a candidate key for R if and only if K → R and for no α ⊂ K, α → R
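These two definitions can be turned into a small brute-force search. The sketch below tests K → R with the standard attribute-closure computation, which these slides do not cover, so treat it as a swapped-in technique. The helper names are hypothetical, and the example assumes that loan_number → amount is the only dependency on bor_loan.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Set of attributes determined by attrs under fds (pairs of frozensets)."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def is_superkey(k, schema, fds):
    """K is a superkey iff K determines every attribute of the schema."""
    return attribute_closure(k, fds) >= set(schema)


def candidate_keys(schema, fds):
    """Superkeys with no proper subset that is also a superkey."""
    keys = []
    # Enumerating by increasing size means any smaller key is found first.
    for size in range(len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            k = frozenset(combo)
            if is_superkey(k, schema, fds) and not any(key < k for key in keys):
                keys.append(k)
    return keys


bor_loan = {"customer_id", "loan_number", "amount"}
fds = [(frozenset({"loan_number"}), frozenset({"amount"}))]  # assumed only FD

print(is_superkey({"loan_number"}, bor_loan, fds))  # False
print([sorted(k) for k in candidate_keys(bor_loan, fds)])
# [['customer_id', 'loan_number']]
```

The search is exponential in the number of attributes, which is acceptable for classroom-sized schemas only.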

Dependencies and Constraints
- Functional dependencies can express constraints that cannot be expressed using superkeys
- Consider the schema bor_loan = (customer_id, loan_number, amount)
  - We expect loan_number → amount to hold
  - We do not expect amount → customer_id to hold

Use of Functional Dependencies
- Testing relations to see if they are legal under a given set of functional dependencies
  - If a relation r is legal under a set F of functional dependencies, we say that r satisfies F
- Specifying constraints on the set of legal relations
  - We say that F holds on R if all legal relations on R satisfy the set of functional dependencies F
- A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances
  - For example, a specific instance of loan may, by chance, satisfy amount → customer_name

Trivial Functional Dependencies
- A functional dependency is trivial if it is satisfied by all instances of a relation
- Examples:
    customer_name, loan_number → customer_name
    customer_name → customer_name
- In general, α → β is trivial if β ⊆ α

Closure
- A set of functional dependencies may logically imply other functional dependencies
  - If A → B and B → C, then A → C
- The set of all functional dependencies logically implied by F is the closure of F
  - We denote the closure of F by F+
  - F+ is a superset of F

Armstrong's Axioms
- Finding F+
  - (reflexivity) If β ⊆ α, then α → β
  - (augmentation) If α → β, then γα → γβ
  - (transitivity) If α → β and β → γ, then α → γ
- These rules are
  - Sound: they generate only functional dependencies that actually hold
  - Complete: they generate all functional dependencies that hold

Example
R = (A, B, C, G, H, I)
F = { A → B, A → C, CG → H, CG → I, B → H }
Some members of F+:
- A → H
  - By transitivity from A → B and B → H
- AG → I
  - By augmenting A → C with G to get AG → CG, and then using transitivity with CG → I
- CG → HI
  - By augmenting CG → I with CG to infer CG → CGI, augmenting CG → H with I to infer CGI → HI, and then using transitivity
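The three derivations can be cross-checked without enumerating F+. The sketch below relies on a standard fact that is not stated in these slides: α → β is in F+ exactly when β is contained in the attribute closure of α under F. The function names are hypothetical, and attribute sets are written as strings of single-letter attribute names for brevity.

```python
def attribute_closure(attrs, fds):
    """Attributes determined by attrs under fds, given as (lhs, rhs) strings."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is logically implied by fds."""
    return set(rhs) <= attribute_closure(lhs, fds)


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(implies(F, "A", "H"))    # True
print(implies(F, "AG", "I"))   # True
print(implies(F, "CG", "HI"))  # True
print(implies(F, "A", "I"))    # False: the closure of A is {A, B, C, H}
```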

Procedure for Computing F+
F+ = F
repeat
    for each functional dependency f in F+
        apply reflexivity and augmentation rules on f
        add the resulting functional dependencies to F+
    for each pair of functional dependencies f1 and f2 in F+
        if f1 and f2 can be combined using transitivity
            then add the resulting functional dependency to F+
until F+ does not change any further
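A direct, deliberately naive rendering of this procedure is sketched below. It is only practical for very small schemas because F+ grows exponentially with the number of attributes. The function names and the three-attribute schema with A → B and B → C are assumptions for illustration. Reflexivity is applied once up front rather than inside the loop, since it does not depend on previously derived dependencies.

```python
from itertools import combinations


def subsets(attrs):
    """All subsets of attrs, as frozensets."""
    items = sorted(attrs)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]


def fd_closure(schema, fds):
    """Compute F+ over schema by applying Armstrong's axioms to a fixpoint."""
    all_sets = subsets(schema)
    # Reflexivity: alpha -> beta for every beta that is a subset of alpha.
    closure = {(a, b) for a in all_sets for b in all_sets if b <= a}
    closure |= {(frozenset(lhs), frozenset(rhs)) for lhs, rhs in fds}
    while True:
        new = set(closure)
        # Augmentation: from alpha -> beta infer gamma alpha -> gamma beta.
        for lhs, rhs in closure:
            for gamma in all_sets:
                new.add((lhs | gamma, rhs | gamma))
        # Transitivity: from alpha -> beta and beta -> gamma infer alpha -> gamma.
        for lhs1, rhs1 in closure:
            for lhs2, rhs2 in closure:
                if rhs1 == lhs2:
                    new.add((lhs1, rhs2))
        if new == closure:  # F+ did not change any further
            return closure
        closure = new


f_plus = fd_closure("ABC", [("A", "B"), ("B", "C")])
print((frozenset("A"), frozenset("C")) in f_plus)  # True, by transitivity
print((frozenset("C"), frozenset("A")) in f_plus)  # False
```

In practice one rarely materializes F+ this way; the sketch is meant to mirror the slide's loop structure, not to be efficient.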

Auxiliary Rules
- We can further simplify manual computation of F+ by using the following additional rules
  - (union) If α → β holds and α → γ holds, then α → βγ holds
  - (decomposition) If α → βγ holds, then α → β holds and α → γ holds
  - (pseudotransitivity) If α → β holds and γβ → δ holds, then αγ → δ holds
- The above rules can be inferred from Armstrong's axioms

Summary
- First normal form
- Decomposition in database design
- Functional dependencies
- Armstrong's axioms and auxiliary rules for closure computation

To-Do List
- Please prove the auxiliary rules using Armstrong's axioms