In This Lecture. Normalisation to BCNF. Lossless decomposition. Normalisation so Far. Relational algebra reminder: product

Similar documents
The Relational Data Model

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

CS411 Database Systems. 05: Relational Schema Design Ch , except and

Physical DB Issues, Indexes, Query Optimisation. Database Systems Lecture 13 Natasha Alechina

Normalisation. Normalisation. Normalisation

Normalization Rule. First Normal Form (1NF) Normalization rule are divided into following normal form. 1. First Normal Form. 2. Second Normal Form

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Overview - detailed. Goal. Faloutsos & Pavlo CMU SCS /615

Database Foundations. 3-9 Validating Data Using Normalization. Copyright 2015, Oracle and/or its affiliates. All rights reserved.

CMU SCS CMU SCS CMU SCS CMU SCS whole nothing but

Database Systems. Normalization Lecture# 7

Database Management System Prof. Partha Pratim Das Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur

Case Study: Lufthansa Cargo Database

COSC Dr. Ramon Lawrence. Emp Relation

Database Normalization. (Olav Dæhli 2018)

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 4 - Schema Normalization

Lectures 5 & 6. Lectures 6: Design Theory Part II

Database design III. Quiz time! Using FDs to detect anomalies. Decomposition. Decomposition. Boyce-Codd Normal Form 11/4/16

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2009 Lecture 3 - Schema Normalization

Edited by: Nada Alhirabi. Normalization

Functional dependency theory

5 Normalization:Quality of relational designs

Functional Dependencies CS 1270

The Relational Model and Normalization

Lectures 12: Design Theory I. 1. Normal forms & functional dependencies 2/19/2018. Today s Lecture. What you will learn about in this section

In This Lecture. The Relational Model. The Relational Model. Relational Data Structure. Unnamed and named tuples. New thing:scheme (and attributes)

Institute of Southern Punjab, Multan

Databases -Normalization I. (GF Royle, N Spadaccini ) Databases - Normalization I 1 / 24

NORMAL FORMS. CS121: Relational Databases Fall 2017 Lecture 18

CSE 562 Database Systems

Databases 1. Daniel POP

Database Systems. Basics of the Relational Data Model

Learning outcomes. On successful completion of this unit you will: 1. Understand data models and database technologies.

CSE 544 Principles of Database Management Systems

Database Design Theory and Normalization. CS 377: Database Systems

Relational Database Design (II)

Relational Design: Characteristics of Well-designed DB

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Normal Forms. Winter Lecture 19

Relational Design 1 / 34

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization.

Theory of Normal Forms Decomposition of Relations. Overview

UNIT 3 DATABASE DESIGN

Normalisation Chapter2 Contents

DBMS Chapter Three IS304. Database Normalization-Comp.

Chapter 8: Relational Database Design

Steps in normalisation. Steps in normalisation 7/15/2014

CSE 544 Principles of Database Management Systems

Administrivia. Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Relational Model: Keys. Correction: Implied FDs

Redundancy:Dependencies between attributes within a relation cause redundancy.

Draw A Relational Schema And Diagram The Functional Dependencies In The Relation >>>CLICK HERE<<<

Schema Refinement: Dependencies and Normal Forms

Unit 3 : Relational Database Design

Database Constraints and Design

Databases Lecture 7. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2009

DATABASE DESIGN I - 1DL300

Schema Refinement: Dependencies and Normal Forms

DATABASE MANAGEMENT SYSTEM SHORT QUESTIONS. QUESTION 1: What is database?

LOGICAL DATABASE DESIGN Part #2/2

6.830 Lecture PS1 Due Next Time (Tuesday!) Lab 1 Out end of week start early!

Concepts from

ACS-2914 Normalization March 2009 NORMALIZATION 2. Ron McFadyen 1. Normalization 3. De-normalization 3

Fundamentals of Database Systems

Part II: Using FD Theory to do Database Design

Normalisation theory

Relational Design Theory

CS 2451 Database Systems: Database and Schema Design

Informal Design Guidelines for Relational Databases

In This Lecture. Exam revision. Main topics. Exam format. Particular topics. How to revise. Exam format Main topics How to revise

Homework 6: FDs, NFs and XML (due April 13 th, 2016, 4:00pm, hard-copy in-class please)

Applied Databases. Sebastian Maneth. Lecture 5 ER Model, normal forms. University of Edinburgh - January 25 th, 2016

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 6 Normalization of Database Tables

Normalization. Anomalies Functional Dependencies Closures Key Computation Projecting Relations BCNF Reconstructing Information Other Normal Forms

Introduction to Database Design, fall 2011 IT University of Copenhagen. Normalization. Rasmus Pagh

Databases The theory of relational database design Lectures for m

Introduction to Databases, Fall 2003 IT University of Copenhagen. Lecture 4: Normalization. September 16, Lecturer: Rasmus Pagh

V. Database Design CS448/ How to obtain a good relational database schema

DATABASE MANAGEMENT SYSTEMS

Schema Normalization. 30 th August Submitted By: Saurabh Singla Rahul Bhatnagar

Functional Dependencies and Finding a Minimal Cover

Final Review. Zaki Malik November 20, 2008

Relational Database Design. Announcements. Database (schema) design. CPS 216 Advanced Database Systems. DB2 accounts have been set up

Combining schemas. Problems: redundancy, hard to update, possible NULLs

Techno India Batanagar Computer Science and Engineering. Model Questions. Subject Name: Database Management System Subject Code: CS 601

IS 263 Database Concepts

SAMPLE FINAL EXAM SPRING/2H SESSION 2017

CSIT5300: Advanced Database Systems

Exercise 9: Normal Forms

How to design a database

Schema Refinement: Dependencies and Normal Forms

RELATIONAL DESIGN THEORY / LECTURE 4 TIM KRASKA


Logical Database Design Normalization

Lecture 11 - Chapter 8 Relational Database Design Part 1

Normalisation. Databases: Topic 4

Jordan University of Science & Technology Computer Science Department CS 728: Advanced Database Systems Midterm Exam First 2009/2010

Normalization in DBMS

Relational Model and Relational Algebra A Short Review Class Notes - CS582-01

SCHEMA REFINEMENT AND NORMAL FORMS

Home Page. Title Page. Page 1 of 14. Go Back. Full Screen. Close. Quit

Unit- III (Functional dependencies and Normalization, Relational Data Model and Relational Algebra)

Transcription:

In This Lecture Normalisation to BCNF Database Systems Lecture 12 Natasha Alechina More normalisation Brief review of relational algebra Lossless decomposition Boyce-Codd normal form (BCNF) Higher normal forms Denormalisation For more information Connolly and Begg chapter 14 Normalisation so Far Lossless decomposition First normal form All data values are atomic Second normal form In 1NF plus no non-key attribute is partially dependent on a candidate key Third normal form In 2NF plus no non-key attribute depends transitively on a candidate key To normalise a relation, we used projections If R(A,B,C) satisfies A B then we can project it on A,B and A,C without losing information Lossless decomposition: R = π AB (R) π AC (R) where π AB (R) is projection of R on AB and is natural join. Reminder of projection: R π AB (R) A B C A B Relational algebra reminder: selection Relational algebra reminder: product R σ C=D (R) R1 R2 R1 R2 A B C D 1 x c c 2 y d e 3 z a a 4 u b c 5 w c d A B C D 1 x c c 3 z a a A B 1 2 y x A C 1 w 2 v 3 u A B A C 1 x 1 w 1 x 2 v 1 x 3 u 2 y 1 w 2 y 2 v 2 y 3 u 1

While I am on the subject Relational algebra: natural join R1 R2 = π R1.A,B,C σ R1.A = R2.A (R1 R2) SELECT A,B FROM R1, R2, R3 WHERE (some property α holds) translates into relational algebra R1 A B 1 2 y x R2 A C 1 w 2 v 3 u R1 R2 A B C 1 x w 2 y v π A,B σ α (R1 R2 R3) R When is decomposition lossless: Module Lecturer π Module,Lecturer R π Module,Text R S When is decomposition is not lossless: no fd π First,Last S π First,Age S Module Lecturer Text Module Lecturer Module Text First Last Age First Last First Age DBS nza CB DBS nza UW RDB nza UW APS rcb B DBS RDB APS nza nza rcb DBS DBS RDB APS CB UW UW B John Smith 20 John Brown 30 Mary Smith 20 Tom Brown 10 John John Mary Tom Smith Brown Smith Brown John 20 John 30 Mary 20 Tom 10 When is decomposition is not lossless: no fd Normalisation Example π First,Last S π First,Last S First Last Age John Smith 20 John Smith 30 John Brown 20 John Brown 30 Mary Smith 20 Tom Brown 10 π First,Last S First Last John Smith John Brown Mary Smith Tom Brown π First,Age S First Age John 20 John 30 Mary 20 Tom 10 We have a table representing orders in an online store Each entry in the table represents an item on a particular order Columns Order Product Customer Address Quantity UnitPrice Primary key is {Order, Product} 2

Functional Dependencies Normalisation to 2NF Each order is for a single customer Each customer has a single address Each product has a single price From FDs 1 and 2 and transitivity {Order} {Customer} {Customer} {Address} {Product} {UnitPrice} {Order} {Address} Second normal form means no partial dependencies on candidate keys {Order} {Customer, Address} {Product} {UnitPrice} To remove the first FD we project over {Order, Customer, Address} (R1) and {Order, Product, Quantity, UnitPrice} (R2) Normalisation to 2NF Normalisation to 3NF R1 is now in 2NF, but there is still a partial FD in R2 {Product} {UnitPrice} To remove this we project over {Product, UnitPrice} (R3) and {Order, Product, Quantity} (R4) R has now been split into 3 relations - R1, R3, and R4 R3 and R4 are in 3NF R1 has a transitive FD on its key To remove {Order} {Customer} {Address} we project R1 over {Order, Customer} {Customer, Address} Normalisation The Stream Relation 1NF: {Order, Product, Customer, Address, Quantity, UnitPrice} 2NF: {Order, Customer, Address}, {Product, UnitPrice}, and {Order, Product, Quantity} 3NF: {Product, UnitPrice}, {Order, Product, Quantity}, {Order, Customer}, and {Customer, Address} Consider a relation, Stream, which stores information about times for various streams of courses For example: labs for first years Each course has several streams Only one stream (of any course at all) takes place at any given time Each student taking a course is assigned to a single stream for it 3

The Stream Relation FDs in the Stream Relation John Databases 12:00 Mary Databases 12:00 Richard Databases 15:00 Richard Programming 10:00 Mary Programming 10:00 Rebecca Programming 13:00 Candidate keys: {Student, Course} and {Student, Time} Stream has the following non-trivial FDs {Student, Course} {Time} {Time} {Course} Since all attributes are key attributes, Stream is in 3NF Anomalies in Stream Boyce-Codd Normal Form INSERT anomalies You can t add an empty stream UPDATE anomalies Moving the 12:00 class to 9:00 means changing two rows DELETE anomalies Deleting Rebecca removes a stream John Databases 12:00 Mary Databases 12:00 Richard Databases 15:00 Richard Programming 10:00 Mary Programming 10:00 Rebecca Programming 13:00 A relation is in Boyce- Codd normal form (BCNF) if for every FD A B either B is contained in A (the FD is trivial), or A contains a candidate key of the relation, In other words: every determinant in a nontrivial dependency is a (super) key. The same as 3NF except in 3NF we only worry about non-key Bs If there is only one candidate key then 3NF and BCNF are the same Stream and BCNF Conversion to BCNF Stream is not in BCNF as the FD {Time} {Course} is non-trivial and {Time} does not contain a candidate key John Databases 12:00 Mary Databases 12:00 Richard Databases 15:00 Richard Programming 10:00 Mary Programming 10:00 Rebecca Programming 13:00 Student Course Course Time Stream has been put into BCNF but we have lost the FD {Student, Course} {Time} 4

Decomposition Properties Higher Normal Forms Lossless: Data should not be lost or created when splitting relations up Dependency preservation: It is desirable that FDs are preserved when splitting relations up Normalisation to 3NF is always lossless and dependency preserving Normalisation to BCNF is lossless, but may not preserve all dependencies BCNF is as far as we can go with FDs Higher normal forms are based on other sorts of dependency Fourth normal form removes multi-valued dependencies Fifth normal form removes join dependencies 1NF Relations 2NF Relations 3NF Relations BCNF Relations 4NF Relations 5NF Relations Denormalisation Denormalisation Normalisation Removes data redundancy Solves INSERT, UPDATE, and DELETE anomalies This makes it easier to maintain the information in the database in a consistent state However It leads to more tables in the database Often these need to be joined back together, which is expensive to do So sometimes (not often) it is worth denormalising You might want to denormalise if Database speeds are unacceptable (not just a bit slow) There are going to be very few INSERTs, UPDATEs, or DELETEs There are going to be lots of SELECTs that involve the joining of tables Address Number Street City Postcode Not normalised since {Postcode} {City} Address1 Number Street Postcode Address2 Postcode City Normalisation in Exams Normalisation in Exams Given a relation with scheme {ID, Name, Address, Postcode, CardType, CardNumber}, the candidate key {ID}, and the following functional dependencies: {ID} {Name, Address, Postcode, CardType, CardNumber} {Address} {Postcode} {CardNumber} {CardType} (i) Explain why this relation is in second normal form, but not in third normal form. (3 marks) (ii) Show how this relation can be converted to third normal form. You should show what functional dependencies are being removed, explain why they need to be removed, and give the relation(s) that result. (4 marks) (iii) Give an example of a relation that is in third normal form, but that is not in Boyce-Codd normal form, and explain why it is in third, but not Boyce-Codd, normal form. (4 marks) 5

Next Lecture Physical DB Issues RAID arrays for recovery and speed Indexes and query efficiency Query optimisation Query trees For more information Connolly and Begg chapter 21 and appendix C.5 6