FIT1004 Database Topic 6: Normalisation


Learning Objectives:
- Understand the purpose of normalisation
- Understand the problems associated with redundant data
- Identify various types of update anomalies such as insertion, deletion, and modification anomalies
- Recognise the appropriateness or quality of the design of relations
- Identify various types of functional dependencies between attributes
- Understand how functional dependencies can be used to group attributes into relations that are in a known normal form
- Identify the most commonly used normal forms, namely 1NF, 2NF and 3NF
- Perform normalisation
- Understand various ways to refine 3NF relations to achieve better database design
- Produce an ER diagram from the derived set of 3NF relations

References: Rob, P. & Coronel, C., Database Systems, 6th Edition, Chapter 5, pp. 182-221; 7th Edition, Chapter 5, pp. 147-174
www.infotech.monash.edu.au/fit1004/

Where are we?
- Introduction to Database Systems
- The Relational Model
- Database Lifecycle
- Conceptual Design
- Logical Design
- Normalisation
- Physical Design
- Implementation
- SQL (DML)
- SQL (DDL & DCL)
- Transaction Management
- Database Administration
- Data Warehousing & Data Mining

Normalisation
Normalisation is a technique for producing a set of relations with desirable properties, given the data requirements of an enterprise:
- Developed by E.F. Codd (1972)
- Often performed as a series of tests on a relation to determine whether it satisfies or violates the requirements of a given normal form
- The four most commonly used normal forms are First (1NF), Second (2NF), Third (3NF, normally a sufficient point) and Boyce-Codd (BCNF)
  > 4NF etc. are required only by some very specialised applications
- Based on functional dependencies among the attributes of a relation
- A major aim of relational database design is to group attributes into relations so as to minimise data redundancy and reduce the file storage space required by the base relations

Why Normalisation is Required (slide 4)
[Table not reproduced. Note: * signifies Project Leader]

Problems with the table in Figure 5.1
- PROJ_NUM is intended to be the primary key, but it contains nulls
- JOB_CLASS invites entry errors, e.g. Elec. Eng. vs Elect. Engineer vs E.E.
- The project relation has redundant data
  > details of the charge per hour are repeated for every occurrence of a job class
  > every time an employee is assigned to a project, the employee name is repeated
- Relations that contain redundant information may suffer from update anomalies
- Types of update anomalies include:
  > Insertion: a new employee can be inserted only if they are assigned to a project
  > Deletion: deleting the last employee assigned to a project, or the last employee of a particular job class, also loses the details of that project or job class
  > Modification: updating a job class hourly rate means updating multiple rows
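The modification anomaly can be made concrete with a small sketch. The rows and values below are invented for illustration and are not the data from Figure 5.1; the helper names are hypothetical. The idea is that, once the hourly rate is stored on every row, an update that misses a row leaves the table contradicting itself.

```python
# Hypothetical flat rows: chg_hour is repeated for every occurrence of a job class.
rows = [
    {"proj_num": 1, "emp_num": 101, "job_class": "Analyst", "chg_hour": 90.0},
    {"proj_num": 2, "emp_num": 101, "job_class": "Analyst", "chg_hour": 90.0},
    {"proj_num": 2, "emp_num": 103, "job_class": "Analyst", "chg_hour": 90.0},
]


def rates_by_job_class(rows):
    """Collect the set of distinct hourly rates recorded for each job class."""
    rates = {}
    for row in rows:
        rates.setdefault(row["job_class"], set()).add(row["chg_hour"])
    return rates


def inconsistent_job_classes(rows):
    """Job classes that currently carry more than one hourly rate."""
    return sorted(j for j, r in rates_by_job_class(rows).items() if len(r) > 1)


print(inconsistent_job_classes(rows))  # []

# A modification that updates only one of the three Analyst rows.
rows[0]["chg_hour"] = 95.0
print(inconsistent_job_classes(rows))  # ['Analyst']
```

With the rate held once per job class, as in the later 3NF design, the same change would touch a single row.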

Functional Dependence
An attribute B is FUNCTIONALLY DEPENDENT on another attribute A if a value of A determines a single value of B at any one time.
- A → B
- EMP# → EMP_NAME
- CUSTNUMB → CUSTNAME
- ORDER-NUMBER → ORDER-DATE
  > ORDER-NUMBER: independent variable, also known as the DETERMINANT
  > ORDER-DATE: dependent variable
TOTAL DEPENDENCY: attribute A determines B AND attribute B determines A
  > EMPLOYEE-NUMBER ↔ TAX-FILE-NUMBER
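As a rough illustration of the definition, the sketch below tests whether A → B holds in a given set of rows. The function name `holds` and the sample rows are assumptions made for this example. Note that passing the test on sample data only shows the dependency is not contradicted by that data; whether it really holds is a matter of the business rules.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent is not violated by the rows.

    Each distinct combination of determinant values must map to exactly
    one combination of dependent values.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows.
orders = [
    {"order_number": 1, "order_date": "2014-07-01", "part_number": "P1"},
    {"order_number": 1, "order_date": "2014-07-01", "part_number": "P2"},
    {"order_number": 2, "order_date": "2014-07-01", "part_number": "P1"},
]

print(holds(orders, ["order_number"], ["order_date"]))  # True
print(holds(orders, ["order_date"], ["order_number"]))  # False
```

The second call fails because one date maps to two order numbers, so this pair is not a total dependency.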

Functional Dependence
FULL DEPENDENCY occurs when an attribute depends on the whole of a determinant made up of AT LEAST TWO attributes, and not on just part of it
  > ORDER-NUMBER, PART-NUMBER → QTY-ORDERED
  > lack of full dependence on a multiple-attribute key = partial dependence
TRANSITIVE DEPENDENCY occurs when Y depends on X, and Z depends on Y; thus Z also depends on X
  > X → Y → Z
  > INVOICE-NUMB → CUSTOMER-NUMB → CUSTOMER-NAME
Dependencies are depicted with the help of a Dependency Diagram
NORMALISATION - SIMPLY 'COMMON SENSE'
Converts a table into tables of progressively smaller degree and cardinality until an optimum level of decomposition is reached, at which little or no data redundancy exists
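One way to reason about these dependency types mechanically is the attribute closure: everything a set of attributes determines, directly or transitively. The sketch below is a minimal version; the function names are hypothetical, and placing order_date in the same relation as qty_ordered is an assumption made only to produce a partial dependency.

```python
from itertools import combinations


def closure(attributes, fds):
    """Return every attribute determined by `attributes` under the given FDs.

    fds is a list of (determinant, dependent) pairs of attribute sets.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for determinant, dependent in fds:
            if determinant <= result and not dependent <= result:
                result |= dependent
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of a composite key to the non-key attributes it determines."""
    key = set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            extra = closure(subset, fds) - key
            if extra:
                found[subset] = extra
    return found


# Transitive dependency: X -> Y -> Z.
invoice_fds = [
    ({"invoice_numb"}, {"customer_numb"}),
    ({"customer_numb"}, {"customer_name"}),
]
print(sorted(closure({"invoice_numb"}, invoice_fds)))
# ['customer_name', 'customer_numb', 'invoice_numb']

# Full versus partial dependency on a two-attribute key.
order_fds = [
    ({"order_number", "part_number"}, {"qty_ordered"}),
    ({"order_number"}, {"order_date"}),
]
print(partial_dependencies({"order_number", "part_number"}, order_fds))
# {('order_number',): {'order_date'}}
```

Here qty_ordered is fully dependent on the key, while order_date depends on only part of it.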

First Normal Form
Positive results from normalisation:
- the amount of space needed to store data will be lower
- tables can be updated with greater efficiency
- the description of the database will be straightforward
Unnormalised form (UNF): raw data from a table/form/grid
- UNF: PROJECT (proj_num, proj_name (emp_num, emp_name, ...))
- Figure 5.1 consists of a set of projects, with each project having a set of project-employee details (model 1)
FIRST NORMAL FORM (part of the formal definition of a relation)
A TABLE IS IN FIRST NORMAL FORM (1NF) IF
  > it is a valid table (in particular, it has no repeating groups)
  > a unique key has been identified for each row
  > all attributes are functionally dependent on all or part of the key
- 1NF: PROJECT (proj_num, proj_name)
- 1NF: ASSIGN (proj_num, emp_num, emp_name, job_class, chg_hour, assign_hours)

UNF to 1NF transformation
- Identify the repeating group(s), if any, in the unnormalised relation
- Move from UNF to 1NF by removing the repeating group along with the PK of the main relation
- Important property of normalisation decomposition: the lossless-join property
  > it enables us to find any instance of the original relation from the corresponding instances in the smaller relations
  > hence we must extract the PK of the main relation
- Determine the PK of the new relations created
  > the extracted repeating group will normally have a composite PK that includes the main relation's PK
  > but NOT always: the PK of the main relation may simply act as a FK
  > INSURED (comp_code, comp_name (insured_id, insured_name, ...))
    » COMPANY (comp_code, comp_name)
    » INSURED (insured_id, comp_code, insured_name, ...)
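The UNF to 1NF step can be sketched in a few lines. The nested data and the function names below are invented for illustration. The repeating group is moved into its own relation together with the parent's PK, and a join on that PK rebuilds the original rows, which is the lossless-join property at work.

```python
# Hypothetical unnormalised data: each project holds a repeating group of employees.
unf_projects = [
    {"proj_num": 1, "proj_name": "Alpha",
     "employees": [
         {"emp_num": 101, "emp_name": "Ann", "assign_hours": 10.0},
         {"emp_num": 102, "emp_name": "Bob", "assign_hours": 5.5},
     ]},
    {"proj_num": 2, "proj_name": "Beta",
     "employees": [
         {"emp_num": 101, "emp_name": "Ann", "assign_hours": 3.0},
     ]},
]


def to_1nf(projects):
    """Split out the repeating group, carrying the parent's PK along with it."""
    project, assign = [], []
    for p in projects:
        project.append({"proj_num": p["proj_num"], "proj_name": p["proj_name"]})
        for e in p["employees"]:
            assign.append({"proj_num": p["proj_num"], **e})
    return project, assign


def rejoin(project, assign):
    """Join the two relations on proj_num to rebuild the flat rows."""
    names = {p["proj_num"]: p["proj_name"] for p in project}
    return [{**a, "proj_name": names[a["proj_num"]]} for a in assign]


project, assign = to_1nf(unf_projects)
print(len(project), len(assign))  # 2 3
print(rejoin(project, assign)[0]["proj_name"])  # Alpha
```

Without proj_num in the extracted relation there would be no way to tell which project each assignment row belonged to.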

First Normal Form continued
An alternative way (model 2) of looking at this scenario:
- Present the data in tabular format, where each cell has a single value and there are no repeating groups
- Eliminate repeating groups, and eliminate nulls by making sure that each repeating group attribute contains an appropriate data value

[Figure: Model 2 dependency diagram (1NF)]

1NF to 2NF
A RELATION IS IN 2NF IF it is in 1NF and all non-key attributes are functionally dependent on the entire key
  > i.e. no partial dependencies exist
Model 1: move from 1NF to 2NF by removing partial dependencies
- 1NF: PROJECT (proj_num, proj_name)
- 1NF: ASSIGN (proj_num, emp_num, emp_name, job_class, chg_hour, assign_hours)
1NF: PROJECT (proj_num, proj_name) is already in 2NF: there is only one attribute in the PK, so there CANNOT be any partial dependencies
  > 2NF: PROJECT (proj_num, proj_name)
1NF: ASSIGN (proj_num, emp_num, emp_name, job_class, chg_hour, assign_hours) becomes
  > 2NF: EMPLOYEE (emp_num, emp_name, job_class, chg_hour)
  > 2NF: ASSIGN (proj_num, emp_num, assign_hours)
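Removing a partial dependency amounts to projecting the relation onto two attribute lists and discarding duplicate rows. The sketch below applies that to ASSIGN; the sample rows and the helper name `project_rows` are assumptions for this example.

```python
def project_rows(rows, attributes):
    """Relational projection: keep the given attributes and drop duplicate rows."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[a] for a in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


# Hypothetical 1NF ASSIGN rows; emp_name, job_class and chg_hour depend on emp_num alone.
assign_1nf = [
    {"proj_num": 1, "emp_num": 101, "emp_name": "Ann",
     "job_class": "Analyst", "chg_hour": 90.0, "assign_hours": 10.0},
    {"proj_num": 1, "emp_num": 102, "emp_name": "Bob",
     "job_class": "Programmer", "chg_hour": 60.0, "assign_hours": 5.5},
    {"proj_num": 2, "emp_num": 101, "emp_name": "Ann",
     "job_class": "Analyst", "chg_hour": 90.0, "assign_hours": 3.0},
]

# Attributes that depend on only part of the key move to EMPLOYEE.
employee_2nf = project_rows(assign_1nf, ["emp_num", "emp_name", "job_class", "chg_hour"])
# Attributes that depend on the whole key stay in ASSIGN.
assign_2nf = project_rows(assign_1nf, ["proj_num", "emp_num", "assign_hours"])

print(len(employee_2nf), len(assign_2nf))  # 2 3
```

Ann appears twice in the 1NF rows but only once in EMPLOYEE, which is the redundancy the step removes.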

2NF Conversion Results (Model 1 & 2)
Note: Models 1 and 2 are now equivalent

2NF to 3NF
A RELATION IS IN 3NF IF it is in 2NF and all transitive dependencies have been removed
  > check for a non-key attribute that is dependent on another non-key attribute
Move from 2NF to 3NF by removing transitive dependencies
- 2NF: PROJECT (proj_num, proj_name)
- 2NF: EMPLOYEE (emp_num, emp_name, job_class, chg_hour)
- 2NF: ASSIGN (proj_num, emp_num, assign_hours)
PROJECT and ASSIGN are already in 3NF
  > 3NF: PROJECT (proj_num, proj_name)
  > 3NF: ASSIGN (proj_num, emp_num, assign_hours)
2NF: EMPLOYEE (emp_num, emp_name, job_class, chg_hour) becomes
  > 3NF: EMPLOYEE (emp_num, emp_name, job_class)
  > 3NF: JOB (job_class, chg_hour)
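The transitive dependency emp_num → job_class → chg_hour can be removed in the same spirit. The sketch below uses invented rows and a hypothetical helper `split_transitive`: the determinant stays behind as a foreign key, and its dependents move to a new relation keyed on the determinant.

```python
# Hypothetical 2NF EMPLOYEE rows: chg_hour depends on job_class, not directly on emp_num.
employee_2nf = [
    {"emp_num": 101, "emp_name": "Ann", "job_class": "Analyst", "chg_hour": 90.0},
    {"emp_num": 102, "emp_name": "Bob", "job_class": "Programmer", "chg_hour": 60.0},
    {"emp_num": 103, "emp_name": "Cy", "job_class": "Analyst", "chg_hour": 90.0},
]


def split_transitive(rows, determinant, dependents):
    """Move determinant -> dependents into its own relation.

    Returns (remaining, extracted): the original rows without the dependents,
    and one row per distinct determinant value.
    """
    remaining = [{a: v for a, v in row.items() if a not in dependents} for row in rows]
    lookup = {}
    for row in rows:
        lookup[row[determinant]] = {determinant: row[determinant],
                                    **{d: row[d] for d in dependents}}
    return remaining, list(lookup.values())


employee_3nf, job_3nf = split_transitive(employee_2nf, "job_class", ["chg_hour"])

print(employee_3nf[0])  # {'emp_num': 101, 'emp_name': 'Ann', 'job_class': 'Analyst'}
print(len(job_3nf))     # 2
```

A change to the Analyst rate now means changing one JOB row rather than every Analyst in EMPLOYEE.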

[Figure: 3NF conversion results]

Improving the Design
To improve the design of the database the following changes could be made:
- PK assignment
- Naming conventions
- Attribute atomicity
- Adding attributes
- Adding relationships
- Refining PKs
- Maintaining historical accuracy
- Using derived attributes

Improving the Design continued
Returning to Table 5.1 (slide 4)
- Data loss: who is the project leader?
  > modify project (R&C approach): 3NF: PROJECT (proj_num, proj_name, emp_num)
  > alternative: add emp_num at UNF
  > do not use synonyms when naming attributes; always use the same name for the same attribute, e.g. do not rename emp_num in PROJECT to leader_num
- JOB (job_class, chg_hour)
  > job_class is a string, e.g. Systems Analyst
  > redundant data with associated issues, poor PK
  > better to create a job code
  > modify job (R&C approach): 3NF: JOB (job_code, job_description, job_chg_hour)
  > alternative: make the changes at UNF

[Figure: completed database]

[Figure: completed database, continued]

Entire Process: UNF to 3NF
UNF
  > PROJECT (proj_num, proj_name, emp_num (emp_num, emp_name, job_code, job_description, job_chg_hour, assign_hours))
1NF: remove repeating group and identify PK
  > PROJECT (proj_num, proj_name, emp_num)
  > ASSIGN (proj_num, emp_num, emp_name, job_code, job_description, job_chg_hour, assign_hours)
2NF: remove partial dependencies
  > PROJECT (proj_num, proj_name, emp_num)
  > EMPLOYEE (emp_num, emp_name, job_code, job_description, job_chg_hour)
  > ASSIGN (proj_num, emp_num, assign_hours)
3NF: remove transitive dependencies
  > PROJECT (proj_num, proj_name, emp_num)
  > EMPLOYEE (emp_num, emp_name, job_code)
  > ASSIGN (proj_num, emp_num, assign_hours)
  > JOB (job_code, job_description, job_chg_hour)
Note: R&C show some further 'suggested' improvements
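For readers who want to see the four 3NF relations as tables, the sketch below creates them in an in-memory SQLite database using Python's standard sqlite3 module. The relation and attribute names come from the 3NF result; the column types, the NOT NULL constraints, the composite primary key on assign and the foreign keys are assumptions, since the slides do not specify them.

```python
import sqlite3

# Assumed physical design for the 3NF relations; types and constraints are illustrative.
schema = """
CREATE TABLE job (
    job_code        TEXT PRIMARY KEY,
    job_description TEXT NOT NULL,
    job_chg_hour    REAL NOT NULL
);
CREATE TABLE employee (
    emp_num  INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    job_code TEXT NOT NULL REFERENCES job (job_code)
);
CREATE TABLE project (
    proj_num  INTEGER PRIMARY KEY,
    proj_name TEXT NOT NULL,
    emp_num   INTEGER REFERENCES employee (emp_num)  -- the project leader
);
CREATE TABLE assign (
    proj_num     INTEGER NOT NULL REFERENCES project (proj_num),
    emp_num      INTEGER NOT NULL REFERENCES employee (emp_num),
    assign_hours REAL NOT NULL,
    PRIMARY KEY (proj_num, emp_num)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(schema)

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)  # ['assign', 'employee', 'job', 'project']
```

The tables are created parent first (job, then employee, then project, then assign) so that each foreign key refers to a table that already exists when it is read.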

[Figure: normalisation presented as a conceptual ERD]

[Figure: normalisation presented as a logical ERD]

Normalisation and Database Design
- Normalisation should be part of the design process
- Make sure that proposed entities meet the required normal form before table structures are created
- ER diagram
  > provides the big picture, or macro view, of an organization's data requirements and operations
  > created through an iterative process
    » identify relevant entities, their attributes and their relationships
    » use the results to identify additional entities and attributes
- Normalisation procedures
  > focus on the characteristics of specific entities
  > give a micro view of the entities within the ER diagram
- It is difficult to separate the normalisation process from the ER modelling process
- The two techniques should be used concurrently

Normalisation and ER Diagrams
ER Diagramming
  > top-down approach
  > fast
  > examines requirements
  > relies on business knowledge
Normalisation
  > bottom-up approach
  > very slow
  > examines existing data
  > mathematically based
Top-down create, bottom-up checking
  > accuracy
  > greater understanding of the data

Summary
This lecture:
- Understand the purpose of normalisation
- Understand the problems associated with redundant data
- Identify various types of update anomalies such as insertion, deletion, and modification anomalies
- Recognise the appropriateness or quality of the design of relations
- Identify various types of functional dependencies between attributes
- Understand how functional dependencies can be used to group attributes into relations that are in a known normal form
- Identify the most commonly used normal forms, namely 1NF, 2NF and 3NF
- Perform normalisation
- Understand various ways to refine 3NF relations to achieve better database design
- Produce an ER diagram from the derived set of 3NF relations
Next lecture: Structured Query Language (SQL) - DML