Guideline 1: Semantic of the relation attributes Do not mix attributes from distinct real world. example

Similar documents
Functional Dependencies and Normalization for Relational Databases Design & Analysis of Database Systems

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Chapter 10. Normalization. Chapter Outline. Chapter Outline(contd.)

Chapter 14. Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies

Relational Database design. Slides By: Shree Jaswal

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2

In Chapters 3 through 6, we presented various aspects

Informal Design Guidelines for Relational Databases

Dr. Anis Koubaa. Advanced Databases SE487. Prince Sultan University

Chapter 10. Chapter Outline. Chapter Outline. Functional Dependencies and Normalization for Relational Databases

Chapter 14 Outline. Normalization for Relational Databases: Outline. Chapter 14: Basics of Functional Dependencies and

Functional Dependencies and. Databases. 1 Informal Design Guidelines for Relational Databases. 4 General Normal Form Definitions (For Multiple Keys)

Functional Dependencies & Normalization for Relational DBs. Truong Tuan Anh CSE-HCMUT

FUNCTIONAL DEPENDENCIES CHAPTER , 15.5 (6/E) CHAPTER , 10.5 (5/E)

CS 338 Functional Dependencies

We shall represent a relation as a table with columns and rows. Each column of the table has a name, or attribute. Each row is called a tuple.

MODULE: 3 FUNCTIONAL DEPENDENCIES

UNIT 3 DATABASE DESIGN

CMP-3440 Database Systems

More Relational Algebra

The strategy for achieving a good design is to decompose a badly designed relation appropriately.

Database Design Theory and Normalization. CS 377: Database Systems

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

COSC Dr. Ramon Lawrence. Emp Relation

Chapter 16. Relational Database Design Algorithms. Database Design Approaches. Top-Down Design

TDDD12 Databasteknik Föreläsning 4: Normalisering

CSCI 403: Databases 13 - Functional Dependencies and Normalization

Mapping ER Diagrams to. Relations (Cont d) Mapping ER Diagrams to. Exercise. Relations. Mapping ER Diagrams to Relations (Cont d) Exercise

CS 2451 Database Systems: Database and Schema Design

Database design process

Schema Refinement: Dependencies and Normal Forms

SQL STRUCTURED QUERY LANGUAGE

Relational Design 1 / 34

Schema Refinement: Dependencies and Normal Forms

Schema Refinement: Dependencies and Normal Forms

UNIT 8. Database Design Explain multivalued dependency and fourth normal form 4NF with examples.

The Basic (Flat) Relational Model. Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

Lecture5 Functional Dependencies and Normalization for Relational Databases

Wentworth Institute of Technology COMP2670 Databases Spring 2016 Derbinsky. Normalization. Lecture 9

CIS430 /CIS530 Lab Assignment 6

SQL: A COMMERCIAL DATABASE LANGUAGE. Data Change Statements,

V. Database Design CS448/ How to obtain a good relational database schema

QUESTION BANK. SUBJECT CODE / Name: CS2255 DATABASE MANAGEMENT SYSTEM UNIT III. PART -A (2 Marks)

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 4 - Schema Normalization

ECE 650 Systems Programming & Engineering. Spring 2018

LAB 3 Notes. Codd proposed the relational model in 70 Main advantage of Relational Model : Simple representation (relationstables(row,

Chapter 3. The Relational database design

RELATIONAL DATA MODEL

Part II: Using FD Theory to do Database Design

Unit IV. S_id S_Name S_Address Subject_opted

DC62 DATABASE MANAGEMENT SYSTEMS DEC 2013

More SQL: Complex Queries, Triggers, Views, and Schema Modification

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology

Slides by: Ms. Shree Jaswal

CSE 562 Database Systems

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2009 Lecture 3 - Schema Normalization

SQL: A COMMERCIAL DATABASE LANGUAGE. Complex Constraints

Database Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building

PES Institute of Technology Bangalore South Campus (1 K.M before Electronic City,Bangalore ) Department of MCA. Solution Set - Test-II

The Enhanced ER Model

Wentworth Institute of Technology COMP2670 Databases Spring 2016 Derbinsky. Physical Tuning. Lecture 12. Physical Tuning

Database Systems Relational Model. A.R. Hurson 323 CS Building

Overview Relational data model

The Relational Model 2. Week 3

Lecture4: Guidelines for good relational design Mapping ERD to Relation. Ref. Chapter3

CS403- Database Management Systems Solved Objective Midterm Papers For Preparation of Midterm Exam

Database Management System (15ECSC208) UNIT I: Chapter 2: Relational Data Model and Relational Algebra

Translation of ER-diagram into Relational Schema. Dr. Sunnie S. Chung CIS430/530

The Relational Model. Chapter 3

Wentworth Institute of Technology COMP570 Database Applications Fall 2014 Derbinsky. Physical Tuning. Lecture 10. Physical Tuning

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

Conceptual Design. The Entity-Relationship (ER) Model

The Relational Model. Chapter 3. Database Management Systems, R. Ramakrishnan and J. Gehrke 1

Chapter 4. The Relational Model

Relational Data Model. Christopher Simpkins

Desired properties of decompositions

Relational Model. IT 5101 Introduction to Database Systems. J.G. Zheng Fall 2011

Babu Banarasi Das National Institute of Technology and Management

Basic SQL. Basic SQL. Basic SQL

COSC344 Database Theory and Applications. σ a= c (P) S. Lecture 4 Relational algebra. π A, P X Q. COSC344 Lecture 4 1

CIS 330: Applied Database Systems. ER to Relational Relational Algebra

High-Level Database Models (ii)

The Relational Model

More Normalization Algorithms. CS157A Chris Pollett Nov. 28, 2005.

. : B.Sc. (H) Computer Science. Section A is compulsory. Attempt all parts together. Section A. Specialization lattice and Specialization hierarchy

Lecture 5 Design Theory and Normalization

From ER to Relational Model. Book Chapter 3 (part 2 )

IS 263 Database Concepts

More SQL: Complex Queries, Triggers, Views, and Schema Modification

Normalization in DBMS

SVY227: Databases for GIS

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization.

Introduction to SQL. ECE 650 Systems Programming & Engineering Duke University, Spring 2018

CS W Introduction to Databases Spring Computer Science Department Columbia University

A database consists of several tables (relations) AccountNum

Chapter 4. Basic SQL. Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

CS403- Database Management Systems Solved MCQS From Midterm Papers. CS403- Database Management Systems MIDTERM EXAMINATION - Spring 2010

DATABASE MANAGEMENT SYSTEM SHORT QUESTIONS. QUESTION 1: What is database?

Normalization Normalization

CS430 Final March 14, 2005

Transcription:

Design guidelines for relational schema Semantic of the relation attributes Do not mix attributes from distinct real world Design a relation schema so that it is easy to explain its meaning. Do not combine attributes from multiple entity types and relationship types into a single relation. Intuitively, if a relation schema corresponds to one entity type or one relationship type, it is straightforward to explain its meaning. Otherwise, if the relation corresponds to a mixture of multiple entities and relationships, semantic ambiguities will result and the relation cannot be easily explained. example A tuple in the EMP_DEPT relation schema represents a single employee but includes additional informationnamely, the name (DNAME) of the department for which the employee works and the social security number (DMGRSSN) of the department manager. For the EMP_PROJ relation, each tuple relates an employee to a project but also includes the employee name (ENAME), project name (PNAME), and project location (PLOCATION). 1

Although there is nothing wrong logically with these two relations, they are considered poor designs because they violate Guideline 1 by mixing attributes from distinct real-world entities; They may be used as views, but they cause problems when used as base relations Guideline 2: Reducing the redundant values in tuples Avoiding update anomalies One goal of schema design is to minimize the storage space used by the base relations Grouping attributes into relation schemas has a significant effect on storage space. For example, compare the space used by the two base relations EMPLOYEE and DEPARTMENT with that for an EMP_DEPT base relation in, which is the result of applying the NATURAL JOIN operation to EMPLOYEE and DEPARTMENT. 2

In EMP_DEPT, the attribute values pertaining to a particular department (DNUMBER, DNAME, DMGRSSN) are repeated for every employee who works for that department. In contrast, each department's information appears only once in the DEPARTMENT relation Only the department number (DNUMBER) is repeated in the EMPLOYEE relation for each employee who works in that department. Update Anomalies These can be classified into insertion anomalies, deletion anomalies, and modification anomalies Insertion Anomalies. Insertion anomalies can be differentiated into two types, illustrated by the following examples based on the EMP_DEPT relation: 3

Insertion Anomalies To insert a new employee tuple into EMP_DEPT, we must include either the attribute values for the department that the employee works for, or nulls (if the employee does not work for a department as yet). Insertion Anomalies For example, to insert a new tuple for an employee who works in department number 5, we must enter the attribute values of department 5 correctly so that they are consistent with values for department 5 in other tuples in EMP_DEPT Insertion Anomalies It is difficult to insert a new department that has no employees as yet in the EMP_DEPT relation. The only way to do this is to place null values in the attributes for employee. This causes a problem because SSN is the primary key of EMP_DEPT, and each tuple is supposed to represent an employee entitynot a department entity Deletion Anomalies If we delete from EMP_DEPT an employee tuple that happens to represent the last employee working for a particular department, The information concerning that department is lost from the database. Modification Anomalies In EMP_DEPT, if we change the value of one of the attributes of a particular department-say, the manager of department 5- we must update the tuples of all employees who work in that department; otherwise, the database will become inconsistent Modification Anomalies If we fail to update some tuples, the same department will be shown to have two different values for manager in different employee tuples, which would be wrong 4

Guideline 3 Reducing the Null values in tuples As far as possible, avoid placing attributes in a base relation whose values may frequently be null. If nulls are unavoidable, make sure that they apply in exceptional cases only and do not apply to a majority of tuples in the relation. Disallowing the generation of spurious tuples Design relation schemas so that they can be joined with equality conditions on attributes that are either primary keys or foreign keys in a way that guarantees that no spurious tuples are generated. Avoid relations that contain matching attributes that are not (foreign key, primary key) combinations, because joining on such attributes may produce spurious tuples. This informal guideline is called the nonadditive (or lossless) join property, that guarantees that certain joins do not produce spurious tuples nonadditive (or lossless) join property Design relations so that they can be JOINed with equality condition Equijoin on attributes that are either PKs or FKs 5

Suppose that we used EMP_PROJ1 and EMP_LaCS as the base relations instead of EMP_PROJ. This produces a particularly bad schema design, because we cannot recover the information that was originally in EMP_PROJ from EMP_PROJ1 and EMP_LaCS. If we attempt a NATURALJOIN operation on EMP_PROJ1 and EMP_LaCS, the result produces many more tuples than the original set of tuples in EMP_PROJ. Additional tuples that were not in EMP_PROJ are called spurious tuples because they represent spurious or wrong information that is not valid. The spurious tuples are marked by asterisks (*) Questions 6