Normalisation theory

Introduction to Database Design 2012, Lecture 7

Overview
- Challenging exercises
- E-R diagrams example
- Normalisation theory, motivation
- Functional dependencies
- Boyce-Codd normal form (BCNF)
- 3rd normal form (3NF)

Next two lectures
- Functional dependency theory
- Normalisation algorithms

Challenging exercises

Second challenging exercise
- Find all cds that were bought in all orders in which 'Paranoid' by 'Black Sabbath' was bought.

Reformulate as we did last week
- Find all cds c such that there is no order containing 'Paranoid' by 'Black Sabbath' not containing c

Further reformulation
- Find all cds c such that there is no order containing 'Paranoid' by 'Black Sabbath' that is not in the set of orders containing c

Challenging exercises, solution

Problem
- Find all cds c such that there is no order containing 'Paranoid' by 'Black Sabbath' not containing c

Solution

    select artist, title
    from cd as S
    where not exists
      -- orders containing Paranoid, but not c
      (select *
       from cd natural join purch_cd
       where title = 'Paranoid' and artist = 'Black Sabbath'
         and purch_id not in
           -- orders containing c
           (select purch_id from purch_cd where cd_id = S.cd_id));
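To see the solution query above in action, here is a minimal sketch that runs it with Python's sqlite3 module. The table layout (cd with cd_id, title, artist; purch_cd with purch_id, cd_id) is inferred from the query, and the three cds and three orders are made-up test data, not data from the lecture.

```python
import sqlite3

# Hypothetical miniature data set; table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table cd (cd_id integer primary key, title text, artist text);
    create table purch_cd (purch_id integer, cd_id integer);
    insert into cd values (1, 'Paranoid', 'Black Sabbath'),
                          (2, 'Master of Reality', 'Black Sabbath'),
                          (3, 'Abbey Road', 'The Beatles');
    -- Orders 10 and 11 contain Paranoid; cd 3 is missing from order 10.
    insert into purch_cd values (10, 1), (10, 2), (11, 1), (11, 2), (11, 3), (12, 3);
""")

QUERY = """
select artist, title from cd as S
where not exists (
    select * from cd natural join purch_cd
    where title = 'Paranoid' and artist = 'Black Sabbath'
      and purch_id not in (select purch_id from purch_cd where cd_id = S.cd_id))
"""

# Sort so the result does not depend on the order SQLite returns rows in.
result = sorted(conn.execute(QUERY).fetchall())
print(result)
```

Only the cds that appear in every order containing Paranoid survive the double negation; Abbey Road is dropped because order 10 contains Paranoid but not Abbey Road.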

Benjamin's solution

    select count(*) as AppearsInNumberOfOrdersWithParanoid, cd_id, title, artist
    from purchase natural join purch_cd natural join cd
    where purch_id in (
        # Purch_ID for all paranoid orders
        select purch_id
        from purchase natural join purch_cd natural join cd
        where title = 'paranoid' and artist = 'black sabbath')
    group by cd_id
    having AppearsInNumberOfOrdersWithParanoid = (
        select count(*)
        from purchase natural join purch_cd natural join cd
        where title = 'paranoid' and artist = 'black sabbath'
        group by cd_id
        order by purch_id);

E-R diagrams

Generalisation in E-R diagrams, example
- Many-to-one relationship assigned_to cannot be described without using generalisation
- [E-R diagram; labels: task (id, description), assigned_to, employee (id, name), total generalisation into permanent (job title) and trainee (start-date, supervisor)]

Normal forms

Challenge: Avoiding redundancy
- [Table on slide: a poor database design]
- Redundancy wastes space and leads to inconsistency issues
- Normalisation theory deals with this issue

Anomalies caused by redundancy
- Update anomalies: occur when information is updated in one but not all of the places where it occurs
- Deletion anomalies: occur when deleting one fact leads to the unwanted deletion of other facts
- Insertion anomalies: occur when information about one thing cannot be inserted without knowing additional information about something else
- We should design the database such that these anomalies cannot occur
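The anomalies can be made concrete with a small sketch. The rows below are hypothetical: they assume a redundant instructor table in which each row repeats the budget of the instructor's department. The helper name budgets_by_dept is introduced here for illustration only.

```python
# Hypothetical rows of a redundant table: the budget is repeated per instructor.
rows = [
    {"id": 1, "name": "Ada", "dept_name": "Physics", "budget": 70000},
    {"id": 2, "name": "Bob", "dept_name": "Physics", "budget": 70000},
    {"id": 3, "name": "Eve", "dept_name": "Music", "budget": 50000},
]

def budgets_by_dept(table):
    """Collect the set of distinct budget values recorded for each department."""
    seen = {}
    for row in table:
        seen.setdefault(row["dept_name"], set()).add(row["budget"])
    return seen

# Update anomaly: the new Physics budget is written to one row only.
rows[0]["budget"] = 80000
inconsistent = {d for d, b in budgets_by_dept(rows).items() if len(b) > 1}

# Deletion anomaly: removing the only Music instructor also loses the Music budget.
remaining = [r for r in rows if r["id"] != 3]
lost_depts = set(budgets_by_dept(rows)) - set(budgets_by_dept(remaining))

print(inconsistent, lost_depts)
```

An insertion anomaly is the mirror image of the deletion case: in this layout a department's budget cannot be recorded until some instructor row for that department exists.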

Functional dependencies
- Cause of redundancy
  - value of budget determined by value of department
  - department is not a superkey
  - So budget information is repeated between tuples with the same department value
- Say there is a functional dependency from department to budget
- Notation: dept_name → budget

Functional dependencies
- Can involve sets of attributes
- A poor flight db design:
  flight_all(flight_num, dept_date, capacity, dept_airport, arr_airport, STD, STA, date_offset)
- Functional dependencies
  - flight_num, dept_date → capacity
  - flight_num → STD, STA
  - flight_num, dept_date → STD, STA
- The last one is not minimal (in a sense to be made precise later) but still true

Functional dependencies
- Functional dependencies are rules derived from the real world situation we are modelling
- Functional dependencies are a form of integrity constraint
- Goal is to make the database enforce these constraints by design

Definition
- A legal instance of a database schema is an instance that does not break the rules of the real world
- Definition. A functional dependency α → β holds if, for all pairs of tuples t, u in any legal instance, t[α] = u[α] implies t[β] = u[β]
- Here α, β denote sets of attributes
- t[α] = u[α] means tuples t and u agree on the values in α
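The definition above translates almost directly into code. The sketch below checks a dependency against a single instance, represented as a list of dicts; the function name holds and the flight rows are illustrative assumptions. Note that one instance can only refute a dependency: the definition quantifies over all legal instances, which no finite test can cover.

```python
from itertools import combinations

def holds(instance, alpha, beta):
    """Return True if alpha -> beta is not violated by any pair of tuples in instance."""
    for t, u in combinations(instance, 2):
        agree_alpha = all(t[a] == u[a] for a in alpha)
        agree_beta = all(t[b] == u[b] for b in beta)
        if agree_alpha and not agree_beta:
            return False
    return True

# Hypothetical rows using a few flight_all attributes.
flights = [
    {"flight_num": "SK1", "dept_date": "2012-03-01", "capacity": 120, "STD": "08:00"},
    {"flight_num": "SK1", "dept_date": "2012-03-02", "capacity": 150, "STD": "08:00"},
    {"flight_num": "SK2", "dept_date": "2012-03-01", "capacity": 120, "STD": "09:30"},
]

std_ok = holds(flights, ["flight_num"], ["STD"])
capacity_ok = holds(flights, ["flight_num"], ["capacity"])
capacity_by_date_ok = holds(flights, ["flight_num", "dept_date"], ["capacity"])
print(std_ok, capacity_ok, capacity_by_date_ok)
```

Here flight_num alone does not determine capacity, because flight SK1 has different capacities on different dates, while flight_num together with dept_date does.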

Examples
- α → β always holds if β ⊆ α (a trivial dependency)
- A set of attributes α is a superkey for relation r(R) if α → R
- A candidate key is a minimal superkey
- Example: flight_all table (from the flight example above)
  - flight_num, dept_date, capacity is a superkey
  - flight_num, dept_date is a candidate key

Boyce-Codd normal form (BCNF)
- Definition. A table r(R) is in BCNF if for all functional dependencies α → β either
  - β ⊆ α (α → β is trivial)
  - or α is a superkey
- A schema is in BCNF if all tables are in BCNF
- Example:
  - instructor(id, name, salary, dept_name, building, budget) is not in BCNF
  - Because of dept_name → building, budget

Decomposition into BCNF
- Not in BCNF: the instructor table with building and budget (above)
- Decompose to BCNF:
  - instructor(id, name, salary, dept_name)
  - department(dept_name, building, budget)

Another example
- Not in BCNF:
  - flight_all(flight_num, dept_date, capacity, dept_airport, arr_airport, STD, STA, date_offset)
- Decompose as:
  - flight(flight_num, dept_airport, arr_airport, STD, STA, date_offset)
  - departure(flight_num, dept_date, capacity)
- This is in BCNF
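The BCNF test for the instructor example can be sketched as follows. The superkey check uses attribute closure, a standard technique that belongs to the functional dependency theory announced for the next lectures, so treat it as an assumption here. The dependency id → name, salary, dept_name is also an assumption (the slides only state dept_name → building, budget), and the function names are illustrative. The sketch checks only the listed dependencies; testing a decomposed table in general also requires the dependencies implied by the originals.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds (pairs of attribute tuples)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Listed FDs that are non-trivial and whose left side is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)
            and not closure(lhs, fds) >= set(schema)]

# Assumption: id -> name, salary, dept_name (id identifies an instructor).
KEY_FD = (("id",), ("name", "salary", "dept_name"))
DEPT_FD = (("dept_name",), ("building", "budget"))

# The single big table violates BCNF because dept_name is not a superkey.
big = ("id", "name", "salary", "dept_name", "building", "budget")
before = bcnf_violations(big, [KEY_FD, DEPT_FD])

# After decomposition each table is checked against the FD that applies to it.
instructor = ("id", "name", "salary", "dept_name")
department = ("dept_name", "building", "budget")
after = (bcnf_violations(instructor, [KEY_FD])
         + bcnf_violations(department, [DEPT_FD]))
print(before, after)
```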

Lossless decomposition
- Decomposition into
  - instructor(id, name, salary, dept_name)
  - department(dept_name, building, budget)
  is a lossless decomposition, because we can recover the big table by joining instructor and department
- A lossy decomposition of instructor:
  - instructor_name(id, name)
  - instructor(name, salary, dept_name)
  (what happens if two instructors have the same name?)

Lossless decomposition
- Definition. A decomposition of r(R) into s(α), s′(β) is lossless if for any legal instance of r
      r = Π_α(r) ⋈ Π_β(r)
- This is the case for the decomposition into instructor, department
- Another example of a lossy decomposition
  - instructor(id, name, salary)
  - department(dept_name, building, budget)
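The lossless condition can be tried out on a concrete instance by projecting onto both attribute sets and joining the projections again. The sketch below is an illustration under assumptions: the rows are made up (two instructors share the name Kim), and the helper names project, natural_join and recovers are introduced here. As with functional dependencies, a single instance can demonstrate that a decomposition is lossy but cannot prove that it is lossless for every legal instance.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each tuple is stored as (attr, value) pairs."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}

def natural_join(left, right):
    """Join tuples that agree on all shared attributes (cartesian product if none are shared)."""
    out = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                out.add(frozenset({**ld, **rd}.items()))
    return out

def recovers(rows, alpha, beta):
    """True if joining the two projections gives back exactly the original rows."""
    original = {frozenset(r.items()) for r in rows}
    return natural_join(project(rows, alpha), project(rows, beta)) == original

# Hypothetical instance of the big instructor table.
big = [
    {"id": 1, "name": "Kim", "salary": 50000, "dept_name": "Physics",
     "building": "Watson", "budget": 70000},
    {"id": 2, "name": "Kim", "salary": 60000, "dept_name": "Music",
     "building": "Packard", "budget": 50000},
    {"id": 3, "name": "Lee", "salary": 55000, "dept_name": "Physics",
     "building": "Watson", "budget": 70000},
]

dept_attrs = ["dept_name", "building", "budget"]
lossless = recovers(big, ["id", "name", "salary", "dept_name"], dept_attrs)
lossy_by_name = recovers(big, ["id", "name"], ["name", "salary"] + dept_attrs)
lossy_no_link = recovers(big, ["id", "name", "salary"], dept_attrs)
print(lossless, lossy_by_name, lossy_no_link)
```

Splitting on name produces spurious tuples that pair each Kim with the other Kim's salary and department, and the split with no shared attribute degenerates into a cartesian product.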

3rd normal form

3NF, motivation
- A database in BCNF is to a large extent free of redundancy
- But reducing to BCNF can also lead to inefficient databases
- 3NF is a less strict normal form
- Reducing to 3NF is often enough
- In some situations 3NF allows for more efficient databases than BCNF

Dependency preservation, example
- DB schema
  - department, instructor, student
  - dept_advisor(s_id, i_id, dept_name)
- Functional dependencies
  - i_id → dept_name
  - s_id, dept_name → i_id

Dependency preservation, example (continued)
- The dept_advisor table above is not in BCNF (i_id → dept_name holds, but i_id is not a superkey)
- Consider decomposition of dept_advisor
  - (s_id, i_id)
  - (i_id, dept_name) (this table can be dropped because it is contained in the instructor table)
- The second functional dependency involves 2 tables
- This decomposition is not dependency preserving

Efficiency of insertions, 1st design
- Consider first the single table solution
  - dept_advisor(s_id, i_id, dept_name)
- Functional dependencies
  - i_id → dept_name
  - s_id, dept_name → i_id
- When inserting a tuple into dept_advisor check
  - Given i_id works in department dept_name. This is a lookup into instructor
  - s_id, dept_name → i_id is not violated. This is a primary key condition
- Both these conditions can be checked efficiently

Efficiency of insertions, 2nd design
- Now consider the decomposed solution
  - dept_advisor(s_id, i_id)
- When inserting a tuple into dept_advisor check
  - s_id, dept_name → i_id is not violated. This involves computing a join of dept_advisor and instructor, since dept_name is only stored in instructor
- Computing joins is slow and we would not like to do that for each insertion!
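To illustrate the cost in the 2nd design, here is a hedged sketch of the check that would have to run before each insertion. It uses Python's sqlite3 module with simplified, hypothetical tables (instructor reduced to id and dept_name) and a made-up helper violates_fd; it is one possible way to write the check, not the lecture's implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables for the decomposed (2nd) design.
    create table instructor (id integer primary key, dept_name text);
    create table dept_advisor (s_id integer, i_id integer,
                               primary key (s_id, i_id));
    insert into instructor values (1, 'Physics'), (2, 'Physics'), (3, 'Music');
    insert into dept_advisor values (100, 1);
""")

def violates_fd(conn, s_id, i_id):
    """True if adding (s_id, i_id) would break s_id, dept_name -> i_id."""
    (count,) = conn.execute("""
        select count(*)
        from dept_advisor a
        join instructor adv on adv.id = a.i_id
        join instructor cand on cand.dept_name = adv.dept_name
        where a.s_id = ? and cand.id = ? and adv.id <> cand.id
    """, (s_id, i_id)).fetchone()
    return count > 0

clash = violates_fd(conn, 100, 2)      # student 100 already has Physics advisor 1
no_clash = violates_fd(conn, 100, 3)   # instructor 3 is in Music
print(clash, no_clash)
```

In the single table design the same rule is a uniqueness constraint on (s_id, dept_name), which the database can check with an index lookup instead of a join.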

Third normal form (3NF)
- A table r(R) is in 3NF if for all functional dependencies α → β either
  - β ⊆ α (α → β is trivial)
  - or α is a superkey
  - or each attribute A in β − α is part of a candidate key
- A schema is in 3NF if all tables are in 3NF

Example in 3NF
- The schema is in 3NF
  - dept_advisor(s_id, i_id, dept_name)
- Functional dependencies
  - i_id → dept_name
  - s_id, dept_name → i_id
- Two candidate keys
  - (s_id, dept_name)
  - (s_id, i_id)
- So every attribute is part of a candidate key, i.e. the third condition holds for any functional dependency
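The 3NF test for dept_advisor can be sketched in code. This is a brute-force illustration under assumptions: candidate keys are found by trying every attribute subset (fine for small schemas only), superkeys are tested with attribute closure, only the listed dependencies are checked, and the function names are introduced here. The second schema reuses the big instructor table with an assumed dependency id → name, salary, dept_name.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds (pairs of attribute tuples)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Minimal superkeys, found by trying subsets in order of increasing size."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(schema, size):
            is_superkey = closure(combo, fds) >= set(schema)
            if is_superkey and not any(set(k) <= set(combo) for k in keys):
                keys.append(combo)
    return keys

def is_3nf(schema, fds):
    """Check the three 3NF conditions for every listed dependency."""
    prime = {a for key in candidate_keys(schema, fds) for a in key}
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        superkey = closure(lhs, fds) >= set(schema)
        in_keys = (set(rhs) - set(lhs)) <= prime
        if not (trivial or superkey or in_keys):
            return False
    return True

advisor = ("s_id", "i_id", "dept_name")
advisor_fds = [(("i_id",), ("dept_name",)), (("s_id", "dept_name"), ("i_id",))]
advisor_keys = candidate_keys(advisor, advisor_fds)
advisor_3nf = is_3nf(advisor, advisor_fds)

# Assumption: id -> name, salary, dept_name for the big instructor table.
big = ("id", "name", "salary", "dept_name", "building", "budget")
big_fds = [(("id",), ("name", "salary", "dept_name")),
           (("dept_name",), ("building", "budget"))]
big_3nf = is_3nf(big, big_fds)
print(advisor_keys, advisor_3nf, big_3nf)
```

The dept_advisor table passes only through the third condition, because i_id is not a superkey, while the big instructor table fails because building and budget belong to no candidate key.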

Summary
- Normalisation theory seeks to minimise redundancy
- A DB design satisfying BCNF is to a large extent redundancy free
- Sometimes there are tradeoffs between efficiency and removal of redundancy
- In this case the weaker 3NF may be preferable

Learning objectives
- You should be able to see if a db schema satisfies BCNF or 3NF
- Next time:
  - BCNF decomposition: can always decompose a db design to BCNF in a lossless way
  - 3NF decomposition: can always decompose into 3NF in a lossless and dependency preserving way