Detailed Data Modelling: Attribute Collection and Normalisation of Data

Similar documents
Detailed Data Modelling. Detailed Data Modelling. Detailed Data Modelling. Identifying Attributes. Attributes

IMS1002/CSE1205 Lectures 1

Modelling: Review. Modelling Information Systems. Models in analysis and design. Process Modelling. Modelling perspectives

information process modelling DFDs Process description

Distributed Database Systems By Syed Bakhtawar Shah Abid Lecturer in Computer Science

IS 263 Database Concepts

Modern Systems Analysis and Design

Process Modelling. Data flow Diagrams. Process Modelling Data Flow Diagrams. CSE Information Systems 1

Objectives Definition iti of terms List five properties of relations State two properties of candidate keys Define first, second, and third normal for

CSE Information Systems 1

Database Foundations. 3-9 Validating Data Using Normalization. Copyright 2015, Oracle and/or its affiliates. All rights reserved.

Techno India Batanagar Computer Science and Engineering. Model Questions. Subject Name: Database Management System Subject Code: CS 601

Bachelor in Information Technology (BIT) O Term-End Examination

Modelling as a Communication Tool: Introduction to Process Modelling. Modelling. Simplification in modelling. Representation in modelling

MINGGU Ke 8 Analisa dan Perancangan Sistem Informasi

Normalization. Normal Forms. Normal Forms

CMP-3440 Database Systems

Entity-Relationship Modelling. Entities Attributes Relationships Mapping Cardinality Keys Reduction of an E-R Diagram to Tables

Data Modelling. Static Data in the organisation Fundamental building block of the system Two perspectives (Process and Data) Of Data.

Database Design Process

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 6 Normalization of Database Tables

Introduction to Database. Dr Simon Jones Thanks to Mariam Mohaideen

Database Principles: Fundamentals of Design, Implementation, and Management Tenth Edition. Chapter 7 Data Modeling with Entity Relationship Diagrams

FIT1004 Database Topic 6: Normalisation

Database Systems Relational Model. A.R. Hurson 323 CS Building

Logical Database Design. ICT285 Databases: Topic 06

4. Entity Relationship Model

Lecture 5 STRUCTURED ANALYSIS. PB007 So(ware Engineering I Faculty of Informa:cs, Masaryk University Fall Bühnová, Sochor, Ráček

Conceptual Data Models for Database Design

COSC 304 Introduction to Database Systems. Entity-Relationship Modeling

Chapter 2 Conceptual Modeling. Objectives

DATABASE TECHNOLOGY. Spring An introduction to database systems

Part 7. Logical Data Structures

Translation of ER-diagram into Relational Schema. Dr. Sunnie S. Chung CIS430/530

CMP-3440 Database Systems

COMP Instructor: Dimitris Papadias WWW page:

Represent entities and relations with diagrams

The DBMS accepts requests for data from the application program and instructs the operating system to transfer the appropriate data.

DATABASE TECHNOLOGY - 1DL124

Database Systems. Overview - important points. Lecture 5. Some introductory information ERD diagrams Normalization Other stuff 08/03/2015

FAQ: Relational Databases in Accounting Systems

Database Systems. A Practical Approach to Design, Implementation, and Management. Database Systems. Thomas Connolly Carolyn Begg

Learning outcomes. On successful completion of this unit you will: 1. Understand data models and database technologies.

DATABASE DESIGN I - 1DL300

ER Model. Objectives (2/2) Electricite Du Laos (EDL) Dr. Kanda Runapongsa Saikaew, Computer Engineering, KKU 1

Setting up an Invoice System

CIS 330: Web-driven Web Applications. Lecture 2: Introduction to ER Modeling

Modern Systems Analysis and Design Sixth Edition

Applied Databases. Sebastian Maneth. Lecture 5 ER Model, Normal Forms. University of Edinburgh - January 30 th, 2017

1/24/2012. Chapter 7 Outline. Chapter 7 Outline (cont d.) CS 440: Database Management Systems

Normalisation Chapter2 Contents

Contents. Database. Information Policy. C03. Entity Relationship Model WKU-IP-C03 Database / Entity Relationship Model

1. Heading 1. Normalisation LEARNING OBJECTIVES. Study Guide. On completion of this session you will be able to:

Management of Information Systems Session 3

Lecture 3 (Part 2) Akhtar Ali

Database Design Process. Requirements Collection & Analysis

Relational Database Model. III. Introduction to the Relational Database Model. Relational Database Model. Relational Terminology.

Modeling Your Data. Chapter 2. cs542 1

DATABASE MANAGEMENT SYSTEM

CS/INFO 330 Entity-Relationship Modeling. Announcements. Goals of This Lecture. Mirek Riedewald

Normalisation. Connolly, Ch. 13

MIS Database Systems Entity-Relationship Model.

CMSC 424 Database design Lecture 3: Entity-Relationship Model. Book: Chap. 1 and 6. Mihai Pop

Lecture 03. Spring 2018 Borough of Manhattan Community College

THE RELATIONAL DATABASE MODEL

DATABASE DEVELOPMENT (H4)

Data analysis and design Unit number: 23 Level: 5 Credit value: 15 Guided learning hours: 60 Unit reference number: H/601/1991.

Conceptual Design. The Entity-Relationship (ER) Model

Relational Model. CS 377: Database Systems

Relational Database Management Systems Oct I. Section-A: 5 X 4 =20 Marks

Normalization and Roberts s Rules. Prepared for CSCI 6442 George Washington University. David C. Roberts

High Level Database Models

Entity Relationship Diagram (ERD) Dr. Moustafa Elazhary

Applied Databases. Sebastian Maneth. Lecture 5 ER Model, normal forms. University of Edinburgh - January 25 th, 2016

DBMS. Relational Model. Module Title?

The Entity-Relationship Model. Overview of Database Design

Translation of ER-diagram into Relational Schema. Dr. Sunnie S. Chung CIS430/530

Information Technology Audit & Cyber Security

Lecture 3. Wednesday, September 3, 2014

7+8+9: Functional Dependencies and Normalization

Advance Database Management System

Transforming the Data Model into Relations (Tables) and Normalisation

Multiple Choice Questions

Conceptual Database Design. COSC 304 Introduction to Database Systems. Entity-Relationship Modeling. Entity-Relationship Modeling

8) A top-to-bottom relationship among the items in a database is established by a

Official Statistics - Relational Database Management Systems. Official Statistics - Relational Database Management Systems

B.C.A DATA BASE MANAGEMENT SYSTEM MODULE SPECIFICATION SHEET. Course Outline

A database can be modeled as: + a collection of entities, + a set of relationships among entities.

Online Data Modeling to Improve Students Learning of Conceptual Data Modeling

Lecture 03. Fall 2017 Borough of Manhattan Community College

Sankalchand Patel College of Engineering, Visnagar B.E. Semester III (CE/IT) Database Management System Question Bank / Assignment

DATA Data and information are used in our daily life. Each type of data has its own importance that contribute toward useful information.

High-Level Database Models (ii)

MIS2502: Data Analytics Relational Data Modeling - 1. JaeHwuen Jung

Accounting Information Systems, 2e (Kay/Ovlia) Chapter 2 Accounting Databases. Objective 1

Home Page. Title Page. Page 1 of 14. Go Back. Full Screen. Close. Quit

Step 2: Map ER Model to Tables

Relational Database Components

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

CS403- Database Management Systems Solved MCQS From Midterm Papers. CS403- Database Management Systems MIDTERM EXAMINATION - Spring 2010

Transcription:

Detailed Data Modelling IMS1002 /CSE1205 Systems Analysis and Design Detailed Data Modelling: Attribute Collection and Normalisation of Data The objective of detailed data modelling is to develop a detailed data structure that: has stability, minimum redundancy, and is flexible to allow for future change can be used as the basis for physical file and database design reflects the actual data requirements of the system www.monash.edu.au 2 Detailed Data Modelling Detailed Data Modelling To expand the conceptual data model, we need to identify and describe the details of the entities and relationships Attributes (data elements) of entities and relationships are identified The attributes should be independent of implementation technology Each attribute should represent a single fact An organisation-wide perspective should be adopted to ensure minimum redundancy and inconsistency and to facilitate data sharing 3 Detailed data modelling aims to identify and describe attributes, convert ER models to relations, and to normalise the relations to ensure that they are well structured Techniques: >attribute collection >convert ER models to relations >normalisation >convert to data structure diagram >Data Dictionary 4 Attributes Identifying Attributes An attribute is a named property or characteristic of an entity that is of interest to the organisation Use an initial capital letter followed by lower case letters in naming attributes > eg Date, First Name, Suburb, Account No, Each attribute name must be unique and distinctive There are three main sources of attributes: data to support essential user functions data to support current operations data to measure performance against objectives 5 6

Identifying Attributes Identifying Attributes Data to support all essential user activities identified as system requirements: Involve all potential users Include future data requirements of users eg.the inventory control function needs: ITEM Item Number Item Description Item Location Item Quantity-on-hand Item Re-order Quantity Data to support current operations: examine all forms, documents, reports and files (computerised and manual) used in the current system check with users for accurate definitions of attributes ensure all attributes identified are still required in the new system 7 8 Identifying Attributes Documenting Attributes Data to measure business objectives and performance: ensure that data necessary to measure performance against objectives is identified eg. Objective stock should be held in the warehouse for no more than seven days Requires Date Of Delivery Date Of Despatch for each Item 9 All attributes should be defined and described in the Data Dictionary Information should include: > name > description > format, precision, example values > domain of values, range of values > synonyms > method of derivation > validation constraints > entities which this attribute describes 10 Data Dictionary -data element entry DATA ATTRIBUTE ACADEMIC Product code AUTHOR: David Ross DATE: 14 Oct 2005 Alias: Inventory number, product number Description: Number to identify and differentiate each product held in warehouse Format: Must be a positive integer Range: 00001 to 99999 Method of derivation: Next unused number in sequence Validation: No characters; 6 digits only; ENTITY: PRODUCT Attribute Names Attribute names should be clear, unambiguous, and meaningful Each attribute name should be unique Price, Retail Price, Cost Price for Item Date, Delivery Date of Purchase Order Name, First Name, Last Name of Member Quantity-on-hand, Quantity Sold, Qty Ordered 11 12

Attribute Names Potential problem areas are: Synonyms some attributes may be synonyms of others e.g. retail price, sale price Homonyms some attributes may be homonyms of others e.g. price (for retail price) price (for wholesale price) Derived attributes some attributes may be derived from others e.g. total salary for department Attribute Definition Each attribute should convey a single fact for ease of processing and querying e.g Pay Code A normal hourly B normal salaried C overtime hourly D overtime salaried should be Pay Factor 1 normal 2 overtime Job CategoryX hourly Y salaried 13 14 Attribute Collection exercise 1.Take an order Customers name Customers address Customers phone number Quantity ordered Date product required by Date of order Time of order Attribute Collection exercise 2.Get product Product number Location of product Quantity on hand Product supplier Quantity ordered Date product required by Date of order 15 16 Attribute Collection exercise 3. Order product Product number Supplier name Supplier address Supplier order date Quantity ordered from supplier Expected delivery date Approximate cost of order Product sell price Quantity sold Customer Invoice Number Invoice date Total cost Sale number Date of sale Time of sale Payment amount Receipt number Attribute Collection exercise 4. Sell product 17 18

Problem of transcribing attributes into database structure Attribute Definition Customers name Customers address Customers phone number Quantity ordered Date product required by Date of order Time of order Product number Location of product Quantity on hand Product supplier Quantity ordered Date product required by Date of order Product number Supplier name Supplier address Supplier order date Quantity ordered from supplier Expected delivery date Approximate cost of order Product sell price Quantity sold Customer Invoice Number Invoice date Total cost Sale number Date of sale Time of sale Payment amount Receipt number Each attribute should convey a single fact avoid embedding extra information in ranges of values e.g. Invoice Number 0000-1499 north-east region 1500-2999 south-east region 3000-4499 central region 19 20 Attribute Definition Each attribute should convey a single fact avoid embedding extra information in identifiers of attributes e.g. Part Number (9 digits) consists of: Part Type Code (1) Factory Code (2) Part Seq Number (4) Year Of Manuf (2) 21 Normalisation Normalisation is a process for converting complex data structures into simple, stable data structures in the form of relations. Data models consisting of normalised relations: >are robust and stable >have minimum redundancy >are flexible >are technology-independent (logical) 22 Normalisation Normalisation Normalisation ensures that each attribute is attached to the appropriate relation each attribute is contained in the relation which represents the real world system object or concept that the attribute describes or is a property of e.g. the attribute Student Name should be in the relation STUDENT which represents the real world object student of interest to a student records system Normalisation was originally developed as part of relational database theory by E.F. Codd (1970) Normalisation is accomplished in stages, each of which corresponds to a normal form Originally, Codd defined first, second and third normal forms. Third normal form is adequate for most business applications. Later extensions include Boyce-Codd, 4th, 5th and domain-key normal forms. 23 24

The Relational Database Model The Relational Database Model represents data in the form of tables or relations Important concepts are: >relation >primary key >foreign key >functional dependency Relation A relation is a named, two-dimensional table of data Each relation consists of a set of named columns and an arbitrary number of rows Each column corresponds to an attribute of the relation Each row corresponds to an instance (or record) that contains values for that instance 25 26 Example Relation A relation generally corresponds to some real world object or concept of interest to the system (similar to an entity) e.g.: Employee Emp# Name Salary Dept 1247 1982 9314 Adams Smith Jones 24000 27000 33000 Employee (Emp#, Name, Salary, Dept) Finance MIS Finance 27 28 Properties of Relations Primary Key Relational tables are tables in which: data values are atomic (single-valued) data values in columns are from the same domain each row in the relation is unique the sequence of columns is insignificant the sequence of rows is insignificant An attribute or group of attributes which uniquely identifies a row of a relation Employee (Emp#, Name, Salary, Dept) Order-item (Order#, Item#, Qty-ordered) Book (Book#, ISBN#, Call#, Copy#, Title, Author) Entity integrity (Relational Database theory) requires that each relation has a non-null primary key 29 30

Primary Key Foreign key Where several possible keys are identified, they are known as candidate keys - choose one to be the primary key > stability > meaning > value eg EMPLOYEE (Emp#, Office#, Name) PERSON (TaxFile#, Medicare#, Lic#, SSec#) or assign a unique key BOOK (Book#, ISBN, Call#, Title) A foreign key is an attribute in one relation that is also a primary key in another relation The referential integrity constraint (relational database theory) specifies that if an attribute value exists in one relation then it must also exist in a linked relation A foreign key must satisfy referential integrity 31 32 Foreign Key Functional Dependency In the example below, if a given Dept# exists in an Employee relation then that Dept# must exist in the Department relation Employee (Emp#, Name, Salary, Dept#) Department (Dept#, Dname, Budget) foreign key A functional dependency is a particular relationship between attributes in a relation For any relation R, attribute B is functionally dependent on attribute A if each value of A has only ONE value of B associated with it, i.e. if for every valid instance of A, that value of A uniquely determines the value of B A identifies B A B Emp# Emp-name Emp# Salary 33 34 Steps in Normalisation Steps in Normalisation Normalisation to Third Normal Form (3NF) is accomplished in 3 steps each corresponding to a basic normal form A normal form is a state of a relation that can be determined by applying simple rules concerning dependencies within that relation Each step of the normalisation process is applied to a single relation in sequence so that the relation is converted to 3NF Unnormalised form UNF First Normal Form 1NF Second Normal Form 2NF Third Normal Form 3NF ID Keys, Remove repeating groups Remove partial dependencies Remove transitive dependencies 35 36

Well Structured Relations A well structured relation contains a minimum amount of redundancy and allows users to insert, modify, and delete rows in a table without errors or inconsistencies (known as anomalies) Three types of anomaly are possible: >Insertion >Deletion >Modification 3NF relations are considered to be well structured relations First Normal Form 1NF A relation is in 1NF if it contains no repeating data : the value of the data at the intersection of each row and column must be single-valued Remove any repeating groups of attributes to convert a relation to 1NF (key of the removed group is a composite key) Order (Order#, Customer#, (Item#, Desc, Qty)) Order-Item (Order#, Item#, Desc, Qty) Order (Order#, Customer#) 37 38 Second Normal Form 2NF Second Normal Form 2NF A relation is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key A partial dependency exists if one or more non-key attributes are dependent on only part of a composite primary key Remove any partial dependencies to convert a 1NF relation to 2NF If the primary key consists of only one attribute or there are no non-key attributes, then a 1NF relation is automatically in 2NF 39 Remove Partial Dependencies a non-key attribute cannot be identified by part of a composite key Order (Order#, Item#, Desc, Qty-ordered) Order-Item (Order#, Item#, Qty-ordered) Item (Item#, Desc) 40 Primary Key Dependencies Partial Dependency Anomalies If a relation has a composite key, it must be checked for dependencies within the key to determine whether they should be retained e.g. DEPT (Dept#, Dept-name, (Emp#, Emp-name)) DEPT (Dept#, Dept-name), DEPT-EMP (Dept#, Emp#, Emp-name) But DEPT-EMP (Dept#, Emp#, Emp-name) Remove Dept# from the primary key and make it a foreign key EMP (Emp#,Emp-name, Dept#) 41 UPDATE DELETE CREATE Order-Item Order# Item# Item-desc Qty 27 28 28 873 402 873 nut bolt nut 2 1 10 30 495 washer 50 change Item-desc in many places data for last item lost when last order for that item is deleted cannot add new item until it is ordered 42

Partial Dependency Anomalies Third Normal Form 3NF Order Order# Item# Qty 27 28 28 873 402 873 2 1 10 30 495 50 Item add new item at any time Item# change item description in one place only Desc 873 nut 402 bolt 495 washer delete last order for item, but item remains 43 A relation is in 3NF if it is in 2NF and no transitive dependencies exist A transitive dependency is a functional dependency between two or more non-key attributes Remove any transitive dependencies to convert a 2 NF relation to 3 NF 44 Third Normal Form 3NF Transitive Dependency Anomalies Remove transitive dependencies a non-key attribute cannot be identified by another non key attribute Employee (Emp#, Ename, Dept#, Dname) Employee (Emp#, Ename, Dept#) Department( Dept#, Dname) (look for foreign keys and their attributes) 45 UPDATE DELETE CREATE Employee Emp# Emp-name Dept# Dname 10 Smith D5 MIS 20 Jones D7 Finance 25 Smith D7 Finance 30 Black D8 Sales change dept name in many places data for dept lost when last employee for that dept is deleted cannot add new dept until an employee is allocated to it 46 Transitive Dependency Anomalies Normalisation to 3NF Employee Emp# Ename Dept# 10 20 25 Smith Jones Smith D5 D7 D7 30 Black D8 Department add new dept at any time change dept name in one place Dept# D5 D7 D8 delete last emp in dept, but dept remains Dname MIS Finance Sales 47 A relation is normalised if all attributes are fully functionally dependent on the primary key >Remove repeating groups >Remove partial dependencies >Remove transitive dependencies 48

ER Diagrams and Relations Entities and Relations Transforming an ER diagram into normalised relations, and then merging all the relations into one final, consolidated set of relations can be accomplished in four steps: 1. Represent entities as relations 2. Represent relationships as relations 3. Normalise each relation 4. Merge the relations 49 Each entity in the ER model is transformed into a relation e.g. CUSTOMER BECOMES Customer (Customer-no, Name, Address, City, State, Postcode, Discount) 50 Relationships and Relations Relationships and Relations 1. Binary Relationships (1:N, 1:1) CUSTOMER places ORDER Customer (Customer-no, Name, Address, City, State, Postcode, Discount) Order (Order-no, Order-date, Promised-date, Customer-no) 51 1. Binary Relationships (1:N, 1:1) For 1:M, add the primary key of the entity on the one side of the relationship as a foreign key in the relation that is on the many side For 1:1 relationship involving entities A and B, choose from add the primary key of A as a foreign key of B add the primary key of B as a foreign key of A both of the above 52 Relationships and Relations Relationships and Relations 2. Binary and Higher Degree Relationships (M:N) PRODUCT requests ORDER Where we wish to know the quantity of each product on each order, i.e. this attribute is an attribute of the relationship requests : Order (Order-no, Order-date, Promised-date) Order Line (Order-no, Product-no, Quantity-ordered) Product (Product-no, description, (other attributes)) 53 2. Binary and Higher Degree Relationships (M:N) For M:N, first create a relation for each for each of the entity types, and then create a relation for the relationship, with a composite primary key formed from the primary keys of the participating entity types. 54

Relationships and Relations Relationships and Relations 3. Unary Relationships Has component manages 4. IS-A Relationship (Class-Subclass or generalisation) ITEM is a component of EMPLOYEE reports to PROPERTY BEACH PROPERTY (M:N) Item (Item-no, Name, Cost) Item-Bill (Item-no, Component-no, Quantity) (1:N) Employee (Emp-id, Name, Birthdate, Manager-id) 55 MOUNTAIN PROPERTY Property (Street-address, City-state-postcode, No-rooms, Typical-rent) Beach_Property (Street-address, City-state-postcode, Distance-to-beach) Mountain_Property (Street-address, City-state-postcode, Skiing) 56 Merging Normalised Relations Normalisation of Relations During the normalisation process two or more relations with the same primary key may appear The set of 3NF relations must not contain any duplicate data Relations with the same primary key should be merged Synonyms Two or more attributes may have different names but the same meaning Either adopt one of the names as a standard or choose a third name STUDENT1 (Student-id, Name, Phone-no) STUDENT2 (VCE-no, Name, Address) STUDENT (Student-id, Name, Address, Phone-no) 57 58 Normalisation of Relations Normalisation of Relations Homonyms Two or more attributes may have the same name but different meanings To resolve the conflict, new attribute names need to be created STUDENT1 (Student-id, Name, Address) STUDENT2 (Student-id, Name, Phone-no, Address) STUDENT (Student-id, Name, Phone-no, Campusaddress, Permanent-address ) 59 Transitive dependencies When two 3NF relations are merged, transitive dependencies may result STUDENT1 (Student-id, Major) STUDENT2 (Student-id, Advisor) STUDENT (Student-id, Major, Advisor) BUT Major Advisor STUDENT1 (Student-id, Major) MAJOR (Major, Advisor) 60

Data Structure Diagram DSD Data Structure Diagram DSD A set of 3NF relations may be converted to a simple diagrammatic form to begin physical database design The conversion is simple: 1. Draw a named rectangle for each relation 2. Draw a relationship line between rectangles linked by foreign keys with a many cardinality at the foreign key end of the relationship 61 CUSTOMER (Cust#, Cname,Phone number) SALES ORDER (Sord#, Sord-date, Cust#) SALES ORDER-ITEM (Sord#, Item#, Qty) ITEM (Item#, Item-desc) CUSTOMER SALES ORDER SALES ORDER LINE ITEM 62 Data Structure Diagram DSD Eliminate Redundant Relationships redundant TOUR DEPARTURE (Tourcode,...) (Tourcode, depdate,...) (Booking#,..., Tourcode, depdate) Summary Detailed data modelling involves: collecting detailed attributes for each entity and relationship identified converting ER models to relations normalising the relations merging relations from each user viewpoint converting the normalised and merged relations to create a data structure diagram BOOKING 63 64 Summary Hoffer, J.A., George, J.F. and Valacich, (1999) 2nd edn., Modern Systems Analysis and Design, Benjamin-Cummings, MA USA. Chapter 16 Whitten, J.L. & Bentley, L.D. and Dittman, K.C., (2001) 5th edn., Systems Analysis and Design Methods, McGraw-Hill Irwin, Burr Ridge, Illinois Chapters 7, 12 65