CHAPTER 11. Data Normalization

CHAPTER OBJECTIVES
- How the relational model works
- How to build use-case models for predicting data usage
- How to construct entity-relationship diagrams to model your data
- How to build multi-table databases
- How joins are used to connect tables
- How to build a link table to model many-to-many relationships
- How to optimize your table design for later programming

DESIGNING A DATABASE
A relational database stores data in a set of tables, each of which stores several pieces of data about a single entity such as books, authors, or customers. For example, the bookstore database might have tables for books, authors, customers, vendors, orders, and orderlines. The books and authors table schemas might be:
    books(isbn, title, authorid, yearwritten, price, numberonhand)
    authors(id, lastname, firstname, address, phone, email, website)
An example books record:
    ('978-0684803357', 'For Whom the Bell Tolls', 15, 1940, 19.80, 3)
A (fictional) example authors record:
    (15, 'Hemingway', 'Ernest', '123 Elm Street, Oak Park, Illinois', '800-555-1212', 'ernest@hemingway.com', 'www.hemingway.com')
The beauty of each table containing data about only one entity is that it eliminates redundancy. For example, each time we add a new book by Hemingway, we need to specify that he is the author, but we do not want to re-enter his address, phone, and so on, as we would have to if the author data were stored in the books table.

DESIGNING A DATABASE CONT.
However, with the data stored in different tables, if we want to query the database for a particular title, its author, and the author's phone number, we need a way to merge the data from the books and authors tables. We can do this in the SELECT statement by connecting each book to its author through the authorid field:
    SELECT title, firstname, lastname, phone
    FROM books, authors
    WHERE books.authorid = authors.id
      AND title = 'desiredtitle'
The books.authorid field, a foreign key, links each book to its author through the authors.id field, the primary key of the authors table. We say the tables have a relationship. Connecting two tables with a common field like this effectively extends each books record with its authors record, so the query can access fields in both tables. By similar means we can merge more than two tables.
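The join described above can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for the MySQL server used in this chapter, trims both tables to a few of their fields, and uses a hypothetical helper named find_author; none of these choices come from the original slides.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL bookstore database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (
        id        INTEGER PRIMARY KEY,
        lastname  TEXT,
        firstname TEXT,
        phone     TEXT
    );
    CREATE TABLE books (
        isbn     TEXT PRIMARY KEY,
        title    TEXT,
        authorid INTEGER REFERENCES authors(id)  -- foreign key to authors.id
    );
    INSERT INTO authors VALUES (15, 'Hemingway', 'Ernest', '800-555-1212');
    INSERT INTO books VALUES ('978-0684803357', 'For Whom the Bell Tolls', 15);
""")

def find_author(title):
    """Hypothetical helper: return (title, firstname, lastname, phone) rows for a title."""
    query = """
        SELECT title, firstname, lastname, phone
        FROM books, authors
        WHERE books.authorid = authors.id
          AND title = ?
    """
    return conn.execute(query, (title,)).fetchall()

rows = find_author("For Whom the Bell Tolls")
print(rows)
```

The author's phone number is stored only once, in authors, yet every book row that points at id 15 can reach it through the join.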

DEFINING RULES FOR A GOOD DATA DESIGN
Data developers have come up with a list of rules for creating well-behaved databases:
- Break your data into multiple tables.
- Do not create a field that holds a list of entries.
- Do not duplicate data.
- Make each table describe only one entity.
- Don't store information that should be calculated instead.
- Create a single primary key field for each table.

NORMALIZING YOUR DATA
The basic concept of normalization is to break down a database into a series of tables. If each of these tables is designed correctly, the database is less likely to have the sorts of problems described so far.

FIRST NORMAL FORM: ELIMINATE LISTED FIELDS
A table is in first normal form (1NF) if and only if it represents a relation: it does not allow nulls or duplicate rows. In practice, eliminate listed fields, that is, any field that holds a list of entries (for example, the specialty field).
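One common way to eliminate a listed field is a link table, as mentioned in the chapter objectives. The sketch below is only an illustration: it assumes sqlite3 in place of MySQL, and the table name agent_specialty, the sample specialties, and the helper specialties_of are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE agent (agentid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE specialty (specialtyid INTEGER PRIMARY KEY, name TEXT);
    -- Link table: one row per (agent, specialty) pair instead of a list in one field.
    CREATE TABLE agent_specialty (
        agent_specialtyid INTEGER PRIMARY KEY,
        agentid           INTEGER REFERENCES agent(agentid),
        specialtyid       INTEGER REFERENCES specialty(specialtyid)
    );
    INSERT INTO agent VALUES (1, 'Bond');
    INSERT INTO specialty VALUES (1, 'Explosives');
    INSERT INTO specialty VALUES (2, 'Flower arranging');
    INSERT INTO agent_specialty VALUES (1, 1, 1);
    INSERT INTO agent_specialty VALUES (2, 1, 2);
""")

def specialties_of(agent_name):
    """Hypothetical helper: list an agent's specialties by joining through the link table."""
    query = """
        SELECT specialty.name
        FROM agent, agent_specialty, specialty
        WHERE agent.agentid = agent_specialty.agentid
          AND agent_specialty.specialtyid = specialty.specialtyid
          AND agent.name = ?
        ORDER BY specialty.name
    """
    return [row[0] for row in conn.execute(query, (agent_name,))]

print(specialties_of("Bond"))
```

Each specialty name is stored once, and an agent can have any number of specialties without changing the table structure.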

SECOND NORMAL FORM: ELIMINATE REDUNDANCIES
A table is in second normal form (2NF) only if it is in 1NF and all nonkey fields depend on the entire candidate key, not just part of it. The next step is to deal with the potential redundancy issues, which mainly occur because the same data is entered more than once. To fix this, you need to build new tables. For example, the agent table could be improved by moving all data about operations to another table.

THIRD NORMAL FORM: ENSURE FUNCTIONAL DEPENDENCY
A table is in third normal form (3NF) if it is in 2NF and has no transitive dependencies on the candidate key. For a table to be in 3NF, it must have a single primary key, and every field in the table must relate only to that key. For example, the description field describes the operation, not the agent, so it belongs in the operation table. In the third phase of normalization, you look through each piece of table data and ensure that it directly relates to the table in which it's placed. If not, either move it to a more appropriate table or build a new table for it.
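A short sketch of the agent/operation split may make the benefit concrete. It assumes sqlite3 in place of MySQL; the operation name, the descriptions, and the second agent are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
    -- The description depends on the operation, so it lives in the operation table.
    CREATE TABLE operation (
        operationid INTEGER PRIMARY KEY,
        name        TEXT,
        description TEXT
    );
    CREATE TABLE agent (
        agentid     INTEGER PRIMARY KEY,
        name        TEXT,
        operationid INTEGER REFERENCES operation(operationid)
    );
    INSERT INTO operation VALUES (1, 'Dancing Elephant', 'Infiltrate the circus');
    INSERT INTO agent VALUES (1, 'Bond', 1);
    INSERT INTO agent VALUES (2, 'Falcon', 1);
""")

# One UPDATE changes the description for every agent on the operation.
conn.execute(
    "UPDATE operation SET description = ? WHERE operationid = ?",
    ("Infiltrate the travelling circus", 1),
)

rows = conn.execute("""
    SELECT agent.name, operation.description
    FROM agent, operation
    WHERE agent.operationid = operation.operationid
    ORDER BY agent.name
""").fetchall()
print(rows)
```

Had the description been repeated in every agent row, the same change would have required updating each row and could have left the copies inconsistent.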

BUILDING YOUR DATA TABLES
After designing the data according to the rules of normalization, you are ready to build sample data tables in SQL. It pays to build your tables carefully to avoid problems.
Tip: Build all your tables in an SQL script so you can easily rebuild your database if your programs mess up the data structure, and add plenty of sample data in the script.

SQL SCRIPT: HOUSEKEEPING
The code specifies the database and deletes all tables if they already exist. This ensures that the script starts with a fresh version of the data. It is also ideal for testing, since you can begin each test with a database in a known state.
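The housekeeping idea can be sketched as a rebuild script that is safe to run repeatedly. This is an illustration only: it uses sqlite3 rather than MySQL (so there is no statement selecting a database), and BUILD_SCRIPT and rebuild are hypothetical names, not the contents of buildspy.sql.

```python
import sqlite3

# Hypothetical miniature build script: drop, recreate, then load sample data.
BUILD_SCRIPT = """
    DROP TABLE IF EXISTS agent;
    CREATE TABLE agent (agentid INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO agent VALUES (1, 'Bond');
"""

def rebuild(conn):
    """Return the database to its known starting state."""
    conn.executescript(BUILD_SCRIPT)

conn = sqlite3.connect(":memory:")
rebuild(conn)

# Simulate a program (or a test) changing the data...
conn.execute("INSERT INTO agent VALUES (2, 'Stray test row')")

# ...and then restore the known state by running the script again.
rebuild(conn)
names = [row[0] for row in conn.execute("SELECT name FROM agent")]
print(names)
```

Because the script drops the table before creating it, running it a second time neither fails nor leaves the stray row behind.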

SQL SCRIPT: CREATING THE AGENT TABLE
Recall that the first field in a table is usually its primary key. Primary keys must be unique, and each record must have one.
I named each primary key according to a special convention: primary key names always begin with the table name and end with ID. I adopted this convention because it makes things easier when I write programs to work with this data.
The NOT NULL modifier requires you to put a value in the field. In practice, this ensures that every record of this table has a primary key.
The AUTO_INCREMENT identifier is a special tool that allows MySQL to pick a new value for this field if no value is specified, which helps ensure that all entries are unique. MySQL still accepts an explicit value for an AUTO_INCREMENT field, but it is usually best to let the database assign one.
I added an indicator at the end of the CREATE TABLE statement to indicate that agentid is the primary key of the agent table.
The FOREIGN KEY reference indicates that the operationid field acts as a reference to the operation table. Some databases use this information to enforce relationships. Even if the database does not use this information, it is useful documentation of the field's purpose.
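The table definition discussed above might look roughly like the following sketch. It is not the chapter's buildspy.sql: it assumes sqlite3, where the keyword is AUTOINCREMENT rather than MySQL's AUTO_INCREMENT and where foreign keys are enforced only after a PRAGMA is switched on. The column list is inferred from the INSERT statement in this chapter, and try_insert is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # SQLite stands in for MySQL (assumption)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
    CREATE TABLE operation (
        operationid INTEGER PRIMARY KEY AUTOINCREMENT,
        name        TEXT
    );
    CREATE TABLE agent (
        agentid     INTEGER PRIMARY KEY AUTOINCREMENT,  -- table name + "id" convention
        name        TEXT,
        operationid INTEGER,
        birthday    DATE,
        FOREIGN KEY (operationid) REFERENCES operation (operationid)
    );
    INSERT INTO operation VALUES (1, 'Dancing Elephant');
""")

def try_insert(name, operationid):
    """Hypothetical helper: return True if the row is accepted, False if rejected."""
    try:
        conn.execute(
            "INSERT INTO agent VALUES (NULL, ?, ?, NULL)", (name, operationid)
        )
        return True
    except sqlite3.IntegrityError:
        return False

accepted_valid = try_insert("Bond", 1)        # operation 1 exists
accepted_dangling = try_insert("Nobody", 99)  # operation 99 does not exist
print(accepted_valid, accepted_dangling)
```

With enforcement switched on, the database refuses an agent row that points at a nonexistent operation; without it, the FOREIGN KEY clause would serve only as documentation.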

INSERTING A VALUE INTO THE AGENT TABLE
The INSERT statements for the agent table have one new trick made possible by the primary key's AUTO_INCREMENT designation:
    INSERT INTO agent VALUES( null, 'Bond', 1, '1961-08-30' );
The primary key is initialized with the value null. This might be surprising, because primary keys are explicitly designed never to contain a null value. Since the agentid field is set to AUTO_INCREMENT, the null value is automatically replaced with an unused integer.
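The null-replacement behaviour can be observed in a small sketch. It assumes sqlite3, whose INTEGER PRIMARY KEY AUTOINCREMENT column behaves similarly to a MySQL AUTO_INCREMENT key for this purpose; the second agent is invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.execute("""
    CREATE TABLE agent (
        agentid     INTEGER PRIMARY KEY AUTOINCREMENT,
        name        TEXT,
        operationid INTEGER,
        birthday    DATE
    )
""")

# NULL in the key position asks the database to choose the next unused integer.
conn.execute("INSERT INTO agent VALUES (NULL, 'Bond', 1, '1961-08-30')")
conn.execute("INSERT INTO agent VALUES (NULL, 'Falcon', 1, '1956-10-31')")  # sample data

ids = conn.execute("SELECT agentid, name FROM agent ORDER BY agentid").fetchall()
print(ids)
```

No row ends up with a null key: each NULL is replaced before the row is stored.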

INTRODUCING SQL FUNCTIONS
SQL has a number of built-in functions that allow you to manipulate the data in various ways.

CONVERTING NUMBER OF DAYS TO A DATE
Most of the standard math operations work in SQL, so you could convert a number of days into years by hand, but there's a better way. You can convert the number of days back to a date with the FROM_DAYS() function, as in Table 11.9.
    SELECT name, NOW(), birthday,
           DATEDIFF(NOW(), birthday) AS daysold,
           FROM_DAYS(DATEDIFF(NOW(), birthday))
    FROM agent;

CONCATENATING TO BUILD THE AGE FIELD
Use CONCAT() to combine the year and month values into a single age field:
    SELECT name, birthday,
           CONCAT(
               YEAR(FROM_DAYS(DATEDIFF(NOW(), birthday))), ' years, ',
               MONTH(FROM_DAYS(DATEDIFF(NOW(), birthday))), ' months'
           ) AS age
    FROM agent;
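To see why turning a day count back into a date yields an age, the same arithmetic can be imitated outside the database. The sketch below is a rough Python imitation, not MySQL itself: from_days and age_string are hypothetical helpers, and the offset of 365 between MySQL day numbers and Python date ordinals is an assumption stated in the code.

```python
from datetime import date

# Assumption: a MySQL day number equals Python's proleptic Gregorian ordinal
# plus 365 (MySQL counts from year 0, Python from 0001-01-01).
MYSQL_DAY_OFFSET = 365

def from_days(day_number):
    """Rough stand-in for MySQL FROM_DAYS(); only handles day numbers >= 366."""
    return date.fromordinal(day_number - MYSQL_DAY_OFFSET)

def age_string(birthday, today):
    """Imitate the CONCAT(YEAR(...), ' years, ', MONTH(...), ' months') expression."""
    days_old = (today - birthday).days  # plays the role of DATEDIFF(today, birthday)
    as_date = from_days(days_old)       # plays the role of FROM_DAYS(daysold)
    return f"{as_date.year} years, {as_date.month} months"

print(age_string(date(2000, 1, 1), date(2001, 1, 1)))
```

The elapsed days are reinterpreted as a date counted from the start of the calendar, so the year part reads as completed years. The month part is a calendar month number from 1 to 12, which makes the resulting age an approximation rather than an exact count of months.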

READING
Recommended reading: textbook example "Building a View", pages 407-419.

CODE EXAMPLES FOR THIS CHAPTER
The only file that we really need for this chapter is the buildspy.sql SQL script file. Since the ciswebs server will not serve a file with a .sql extension, a copy has been saved as buildspy.sql.txt. A few modifications were made to this file so that it also works on newer versions of MySQL; see the comments at the beginning of the file for more information. ph11withmods.zip is a ZIP folder of the Chapter 11 examples, in both original and modified versions.