Lecture 11 - Chapter 8 Relational Database Design Part 1

Similar documents
Chapter 7: Relational Database Design

Chapter 6: Relational Database Design

Chapter 8: Relational Database Design

customer = (customer_id, _ customer_name, customer_street,

Unit 3 : Relational Database Design

Functional Dependencies

Databases Tutorial. March,15,2012 Jing Chen Mcmaster University

UNIT 3 DATABASE DESIGN

Chapter 2 Introduction to Relational Models

Informal Design Guidelines for Relational Databases

Relational Database Design (II)

UNIT -III. Two Marks. The main goal of normalization is to reduce redundant data. Normalization is based on functional dependencies.

Normalisation theory

NORMAL FORMS. CS121: Relational Databases Fall 2017 Lecture 18

Normal Forms. Winter Lecture 19

Chapter 2: Intro to Relational Model

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

Chapter 10. Normalization. Chapter Outline. Chapter Outline(contd.)

Chapter 10. Chapter Outline. Chapter Outline. Functional Dependencies and Normalization for Relational Databases

Lecture 10 - Chapter 7 Entity Relationship Model

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2

Chapter 14. Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Lectures 12: Design Theory I. 1. Normal forms & functional dependencies 2/19/2018. Today s Lecture. What you will learn about in this section

CMU SCS CMU SCS CMU SCS CMU SCS whole nothing but

Functional Dependencies and. Databases. 1 Informal Design Guidelines for Relational Databases. 4 General Normal Form Definitions (For Multiple Keys)

Chapter 14 Outline. Normalization for Relational Databases: Outline. Chapter 14: Basics of Functional Dependencies and

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Overview - detailed. Goal. Faloutsos & Pavlo CMU SCS /615

Database Design Theory and Normalization. CS 377: Database Systems

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 4 - Schema Normalization

Functional Dependencies and Finding a Minimal Cover

Lecture 14 of 42. E-R Diagrams, UML Notes: PS3 Notes, E-R Design. Thursday, 15 Feb 2007

Relational Database design. Slides By: Shree Jaswal

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2009 Lecture 3 - Schema Normalization

CS211 Lecture: Database Design

Functional dependency theory

Lecture 6a Design Theory and Normalization 2/2

CSCI 403: Databases 13 - Functional Dependencies and Normalization

CS425 Fall 2016 Boris Glavic Chapter 2: Intro to Relational Model

Database Management System Prof. Partha Pratim Das Department of Computer Science & Engineering Indian Institute of Technology, Kharagpur

CSE 562 Database Systems

The Relational Data Model

Applied Databases. Sebastian Maneth. Lecture 5 ER Model, normal forms. University of Edinburgh - January 25 th, 2016

Relational Design: Characteristics of Well-designed DB

Schema Refinement: Dependencies and Normal Forms

CS 338 Functional Dependencies

Database Systems. Answers

Schema Refinement: Dependencies and Normal Forms

Functional Dependencies & Normalization for Relational DBs. Truong Tuan Anh CSE-HCMUT

Redundancy:Dependencies between attributes within a relation cause redundancy.

Relational Design 1 / 34

CSIT5300: Advanced Database Systems

Combining schemas. Problems: redundancy, hard to update, possible NULLs

Functional Dependencies and Normalization for Relational Databases Design & Analysis of Database Systems

COSC Dr. Ramon Lawrence. Emp Relation

Exam I Computer Science 420 Dr. St. John Lehman College City University of New York 12 March 2002

FUNCTIONAL DEPENDENCIES

The strategy for achieving a good design is to decompose a badly designed relation appropriately.

CS411 Database Systems. 05: Relational Schema Design Ch , except and

FINAL EXAM REVIEW. CS121: Introduction to Relational Database Systems Fall 2018 Lecture 27

Concepts from

Database design III. Quiz time! Using FDs to detect anomalies. Decomposition. Decomposition. Boyce-Codd Normal Form 11/4/16

Database Management System

FUNCTIONAL DEPENDENCIES CHAPTER , 15.5 (6/E) CHAPTER , 10.5 (5/E)

More Normalization Algorithms. CS157A Chris Pollett Nov. 28, 2005.

Theory of Normal Forms Decomposition of Relations. Overview

Lecture 5 Design Theory and Normalization

ADVANCED DATABASES ; Spring 2015 Prof. Sang-goo Lee (11:00pm: Mon & Wed: Room ) Advanced DB Copyright by S.-g.

Schema Refinement: Dependencies and Normal Forms

Part II: Using FD Theory to do Database Design

MODULE: 3 FUNCTIONAL DEPENDENCIES

Relational Database Systems 1

MIDTERM EXAMINATION Spring 2010 CS403- Database Management Systems (Session - 4) Ref No: Time: 60 min Marks: 38

Introduction to Database Design, fall 2011 IT University of Copenhagen. Normalization. Rasmus Pagh

Announcements (January 20) Relational Database Design. Database (schema) design. Entity-relationship (E/R) model. ODL (Object Definition Language)

Lecture 4. Database design IV. INDs and 4NF Design wrapup

Applied Databases. Sebastian Maneth. Lecture 5 ER Model, Normal Forms. University of Edinburgh - January 30 th, 2017

Normalization. Anomalies Functional Dependencies Closures Key Computation Projecting Relations BCNF Reconstructing Information Other Normal Forms

Relational Database Design Theory. Introduction to Databases CompSci 316 Fall 2017

Fundamentals of Database Systems

Introduction to Databases, Fall 2003 IT University of Copenhagen. Lecture 4: Normalization. September 16, Lecturer: Rasmus Pagh

Dr. Anis Koubaa. Advanced Databases SE487. Prince Sultan University

Database Normalization. (Olav Dæhli 2018)

This lecture. Databases -Normalization I. Repeating Data. Redundancy. This lecture introduces normal forms, decomposition and normalization.

Database Systems. Basics of the Relational Data Model

Functional Dependencies CS 1270

Databases -Normalization I. (GF Royle, N Spadaccini ) Databases - Normalization I 1 / 24

Mapping ER Diagrams to. Relations (Cont d) Mapping ER Diagrams to. Exercise. Relations. Mapping ER Diagrams to Relations (Cont d) Exercise

Lectures 5 & 6. Lectures 6: Design Theory Part II

Normalisation. Normalisation. Normalisation

Schema Refinement and Normal Forms

CS403- Database Management Systems Solved Objective Midterm Papers For Preparation of Midterm Exam

Functional Dependencies and Normalization

CS 2451 Database Systems: Database and Schema Design

Databases The theory of relational database design Lectures for m

CS403- Database Management Systems Solved MCQS From Midterm Papers. CS403- Database Management Systems MIDTERM EXAMINATION - Spring 2010

Database Systems CSE Comprehensive Exam Spring 2005

Course Logistics & Chapter 1 Introduction

Chapter 6 Formal Relational Query Languages

Relational Database Systems 1. Christoph Lofi Simon Barthel Institut für Informationssysteme Technische Universität Braunschweig

Transcription:

CMSC 461, Database Management Systems Spring 2018 Lecture 11 - Chapter 8 Relational Database Design Part 1 These slides are based on Database System Concepts 6th edition book and are a modified version of the slides which accompany the book (http://codex.cs.yale.edu/avi/db-book/db6/slide-dir/index.html), in addition to the 2009/2012 CMSC 461 slides by Dr. Kalpakis Dr. Jennifer Sleeman https://www.csee.umbc.edu/~jsleem1/courses/461/spr18

Logistics HW3 released today due 3/12/2018 Phase 2 due Wednesday 3/7/2018 Midterm 3/14/2018 2

Lecture Outline Relational Database Design First Normal Form (1NF) Functional Dependencies Boyce-Codd (BCNF) Third Normal Form (intro) 3

Lecture Outline Relational Database Design First Normal Form (1NF) Functional Dependencies Boyce-Codd (BCNF) Third Normal Form (intro) 4

Relational Database Design We want to move from the E-R diagram to a set of relations Eliminate redundancy Ensure design is complete Ensure information is easily retrievable Going to learn about normal form functional dependencies 5

Design Alternatives: Combining Schemas Recall the instructor and department relations 6

Design Alternatives: Combining Schemas Suppose we combine instructor and department into inst_dept No connection to relationship set inst_dept Why is this a bad idea? 7

Design Alternatives: Combining Schemas Repetition of information 8

Combining Schemas without Repetition? Consider combining relations: into one relation 9

Design Alternatives: Smaller schemas Given inst_dept was the result of our design.. How would we know to split up (decompose) it into instructor and department? 10

Design Alternatives: Smaller schemas What can we say about this data? 11

Design Alternatives: Smaller schemas Write a rule: if there were a schema (dept_name, building, budget), then dept_name would be able to serve as the primary key 12

Design Alternatives: Smaller schemas Denote as a functional dependency: dept_name building, budget In inst_dept, because dept_name is not a primary key, the building and budget of a department may have to be repeated This indicates the need to decompose inst_dept 13

Design Alternatives: Smaller schemas Not all decompositions are good. Suppose we decompose: employee(id, name, street, city, salary) into employee1 (ID, name) employee2 (name, street, city, salary) 14

A Lossy Decomposition 15

Lossless Join Decomposition 16

Another Example Consider the relation schema: Lending-schema = (branchname, branchcity, assets, customername, loannumber, amount) 17

Another Example Redundancy: Data for branchname, branchcity, assets are repeated for each loan that a branch makes Wastes space Complicates updating, introducing possibility of inconsistency of assets value Null values Cannot store information about a branch if no loans exist Can use null values, but they are difficult to handle. 18

Decomposition Decompose the relation schema Lending-schema into: All attributes of an original schema (R) must appear in the decomposition (R1,R2): Lossless-join decomposition. For all possible relations r on schema R 19

Lecture Outline Relational Database Design First Normal Form (1NF) Functional Dependencies Boyce-Codd (BCNF) Third Normal Form 20

First Normal Form (1NF) In the relational model attributes have no substructure composite attributes - each component becomes its own attribute multivalued attributes - one tuple for each item Domain is atomic if its elements are considered to be indivisible units Examples of non-atomic domains: Set of names, composite attributes Identification numbers like CS001 where department is combined with employee number Fine line: CS101 might be a course identifier and could be interpreted as atomic 21

First Normal Form (1NF) Non-atomic values complicate storage and encourage redundant (repeated) storage of data Example: Set of accounts stored with each customer, and set of owners stored with each account 22

First Normal Form (1NF) A relational schema R is in first normal form if the domains of all attributes of R are atomic 23

First Normal Form (1NF) Atomicity is actually a property of how the elements of the domain are used Example: Strings would normally be considered indivisible Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127 If the first two characters are extracted to find the department, the domain of roll numbers is not atomic. Doing so is a bad idea: leads to encoding of information in application program rather than in the database 24

Is it atomic? Address Course ID (CS-101) Student Name SSN ISBN Number 25

Convert it to 1NF 26

Pitfalls in Relational Database Design Relational database design requires that we find a good collection of relation schemas A bad design may lead to Repetition of Information Inability to represent certain information Design Goals: Avoid redundant data Ensure that relationships among attributes are represented Facilitate the checking of updates for violation of database integrity constraints 27

Goal: Devise a Theory for the following: Decide whether a particular relation R is in good form In the case that a relation R is not in good form, decompose it into a set of relations {R1, R2,..., Rn } such that each relation is in good form the decomposition is a lossless-join decomposition This theory is based on: functional dependencies multivalued dependencies 28

Lecture Outline Relational Database Design First Normal Form (1NF) Functional Dependencies Boyce-Codd (BCNF) Third Normal Form 29

Functional Dependencies Constraints on the set of legal relations Require that the value for a certain set of attributes determines uniquely the value for another set of attributes A functional dependency is a generalization of the notion of a key 30

Remember r is a relation r(r) is the schema for the relation r R denotes the set of attributes K represents the set of attributes that is the superkey A superkey set of attributes that uniquely identify a tuple 31

Functional Dependencies Let R be a relation schema The functional dependency holds on R if and only if for any legal relations r(r), whenever any two tuples t1 and t2 of r agree on the attributes, they also agree on the attributes. That is Example: Consider r(a,b ) with the following instance of r On this instance, A B does NOT hold, but B A does hold 32

Functional Dependencies Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema: inst_dept (ID, name, salary, dept_name, building, budget ) We expect these functional dependencies to hold: dept_name building but would not expect the following to hold: dept_name salary 33

Functional Dependencies We use functional dependencies to: test relations to see if they are legal under a given set of functional dependencies If a relation r is legal under a set F of functional dependencies, we say that r satisfies F specify constraints on the set of legal relations We say that F holds on R if all legal relations on R satisfy the set of functional dependencies F Note: A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances For example, a specific instance of instructor may sometimes satisfy name ID 34

Functional Dependencies A functional dependency is trivial if it is satisfied by all instances of a relation 35

Functional Dependencies Examples Assume schema: student(student_id, first_name, last_name, major, SSN) Which are true in regards to functional dependencies: student_id last_name last_name student_id student_id last_name, major, SSN, student_id SSN student_id, last_name, major, SSN first_name last_name last_name last_name 36

Functional Dependencies Examples Assume schema: student(student_id, first_name, last_name, major, SSN) Which are true in regards to functional dependencies: student_id last_name TRUE last_name student_id FALSE student_id last_name, major, SSN, student_id TRUE SSN student_id, last_name, major, SSN TRUE first_name last_name FALSE last_name last_name TRUE 37

Closure of a Set of Functional Dependencies Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F For example: Given a schema r(a,b,c) If A B and B C then we can infer that A C The set of all functional dependencies logically implied by F is the closure of F We denote the closure of F by F+ F+ is a superset of F 38

Lecture Outline Relational Database Design First Normal Form (1NF) Functional Dependencies Boyce-Codd (BCNF) Third Normal Form (Intro) 39

Boyce-Codd Normal Form 40

Boyce-Codd Normal Form 41

Boyce-Codd Normal Form 42

Decomposing a Schema into BCNF 43

Convert it to BCNF Schema: Student(ID,Name,AdvisorID,AdvisorName) What are the functional dependencies? 44

Convert it to BCNF Schema: Student(ID,Name,AdvisorID,AdvisorName) What are the functional dependencies? ID -> Name AdvisorID -> AdvisorName What uniquely identifies the tuples? 45

Convert it to BCNF Schema: Student(ID,Name,AdvisorID,AdvisorName) What are the functional dependencies? ID -> Name AdvisorID -> AdvisorName What uniquely identifies the tuples? (ID,AdvisorID) Is there a BCNF violation? 46

Convert it to BCNF Schema: Student(ID,Name,AdvisorID,AdvisorName) What are the functional dependencies? ID -> Name AdvisorID -> AdvisorName What is the primary key? (ID,AdvisorID) Is there a BCNF violation? YES! 47

Convert it to BCNF Schema: Student(ID,Name,AdvisorID,AdvisorName) What are the functional dependencies? ID -> Name AdvisorID -> AdvisorName What is the primary key? (ID,AdvisorID) Is there a BCNF violation? YES! Use ID-> Name to decompose R- (Name-ID)= (ID,AdvisorID,AdvisorName) and ID union Name = (ID,Name) 48

BCNF and Dependency Preservation Constraints, including functional dependencies, are costly to check in practice unless they pertain to only one relation If it is sufficient to test only those dependencies on each individual relation of a decomposition in order to ensure that all functional dependencies hold, then that decomposition is dependency preserving Because it is not always possible to achieve both BCNF and dependency preservation, we consider a weaker normal form, known as third normal form 49

Lecture Outline Relational Database Design First Normal Form (1NF) Functional Dependencies Boyce-Codd (BCNF) Third Normal Form (Intro) 50

Third Normal Form 51