Data Modeling. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Similar documents
The Relational Model. Chapter 3. Database Management Systems, R. Ramakrishnan and J. Gehrke 1

The Relational Model. Chapter 3

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

The Relational Model 2. Week 3

Introduction to Data Management. Lecture #4 (E-R à Relational Design)

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

Introduction to Data Management. Lecture #4 (E-R Relational Translation)

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

Why Study the Relational Model? The Relational Model. Relational Database: Definitions. The SQL Query Language. Relational Query Languages

Database Systems ( 資料庫系統 )

Relational Databases BORROWED WITH MINOR ADAPTATION FROM PROF. CHRISTOS FALOUTSOS, CMU /615

Introduction to Data Management. Lecture #5 Relational Model (Cont.) & E-Rà Relational Mapping

The Relational Model. Outline. Why Study the Relational Model? Faloutsos SCS object-relational model

Relational Model. Topics. Relational Model. Why Study the Relational Model? Linda Wu (CMPT )

The Relational Model. Roadmap. Relational Database: Definitions. Why Study the Relational Model? Relational database: a set of relations

CIS 330: Applied Database Systems

High-Level Database Models (ii)

Database Applications (15-415)

Database Systems. Course Administration

Relational data model

The Relational Model. Relational Data Model Relational Query Language (DDL + DML) Integrity Constraints (IC)

High Level Database Models

Database Management Systems. Chapter 3 Part 2

The Relational Model of Data (ii)

The Entity-Relationship Model

From ER to Relational Model. Book Chapter 3 (part 2 )

Comp 5311 Database Management Systems. 4b. Structured Query Language 3

The Entity-Relationship Model. Overview of Database Design

The Relational Model

CIS 330: Applied Database Systems. ER to Relational Relational Algebra

The Relational Data Model. Data Model

The Relational Model (ii)

Database Management Systems. Chapter 3 Part 1

Database Systems. Lecture2:E-R model. Juan Huo( 霍娟 )

Introduction to Data Management. Lecture #3 (E-R Design, Cont d.)

The Relational Model

Introduction to Database Design

Database Management Systems. Session 2. Instructor: Vinnie Costa

Overview of db design Requirement analysis Data to be stored Applications to be built Operations (most frequent) subject to performance requirement

Introduction to Data Management. Lecture #5 (E-R Relational, Cont.)

The Relational Model. Week 2

CONCEPTUAL DESIGN: ER TO RELATIONAL TO SQL

CSE 530A. ER Model to Relational Schema. Washington University Fall 2013

The Entity-Relationship Model

Introduction to Database Design

Translation of ER-diagram into Relational Schema. Dr. Sunnie S. Chung CIS430/530

Conceptual Design. The Entity-Relationship (ER) Model

Contents. Database. Information Policy. C03. Entity Relationship Model WKU-IP-C03 Database / Entity Relationship Model

Introduction to Database Design. Dr. Kanda Runapongsa Dept of Computer Engineering Khon Kaen University

EGCI 321: Database Systems. Dr. Tanasanee Phienthrakul

LAB 3 Notes. Codd proposed the relational model in 70 Main advantage of Relational Model : Simple representation (relationstables(row,

CSC 261/461 Database Systems Lecture 10

Databases Model the Real World. The Entity- Relationship Model. Conceptual Design. Steps in Database Design. ER Model Basics. ER Model Basics (Contd.

MIS Database Systems Entity-Relationship Model.

Lecture 2 SQL. Instructor: Sudeepa Roy. CompSci 516: Data Intensive Computing Systems

OVERVIEW OF DATABASE DEVELOPMENT

Database Applications (15-415)

The Entity-Relationship Model

CompSci 516 Database Systems. Lecture 2 SQL. Instructor: Sudeepa Roy

Review. The Relational Model. Glossary. Review. Data Models. Why Study the Relational Model? Why use a DBMS? OS provides RAM and disk

Database Applications (15-415)

CSE 530A. ER Model. Washington University Fall 2013

Database Design. ER Model. Overview. Introduction to Database Design. UVic C SC 370. Database design can be divided in six major steps:

The Entity-Relationship Model. Overview of Database Design. ER Model Basics. (Ramakrishnan&Gehrke, Chapter 2)

Database Design & Deployment

Databases Model the Real World. The Entity- Relationship Model. Conceptual Design. Steps in Database Design. ER Model Basics. ER Model Basics (Contd.

Introduction to Data Management. Lecture #6 E-Rà Relational Mapping (Cont.)

COMP Instructor: Dimitris Papadias WWW page:

Lecture 2 SQL. Announcements. Recap: Lecture 1. Today s topic. Semi-structured Data and XML. XML: an overview 8/30/17. Instructor: Sudeepa Roy

Database Applications (15-415)

The Entity-Relationship Model. Steps in Database Design

Overview. Introduction to Database Design. ER Model. Database Design

Administrivia. The Relational Model. Review. Review. Review. Some useful terms

The Entity-Relationship (ER) Model

Introduction to Databases

Conceptual Design. The Entity-Relationship (ER) Model

Introduction to Database Design

Database Management Systems. Chapter 2 Part 2

CIS 330: Web-driven Web Applications. Lecture 2: Introduction to ER Modeling

DBMS. Relational Model. Module Title?

The Entity-Relationship (ER) Model 2

ER to Relational Mapping. ER Model: Overview

CSC 261/461 Database Systems Lecture 8. Spring 2018

The Relational Model. Database Management Systems

Announcements If you are enrolled to the class, but have not received the from Piazza, please send me an . Recap: Lecture 1.

CS/INFO 330 Entity-Relationship Modeling. Announcements. Goals of This Lecture. Mirek Riedewald

VARDHAMAN COLLEGE OF ENGINEERING Shamshabad , Hyderabad B.Tech. CSE IV Semester (VCE - R11) T P C 3+1* -- 4 (A1511) DATABASE MANAGEMENT SYSTEMS

CSC 261/461 Database Systems Lecture 8. Fall 2017

ER Model Overview. The Entity-Relationship Model. Database Design Process. ER Model Basics

Translating an ER Diagram to a Relational Schema

Introduction to Data Management. Lecture #4 E-R Model, Still Going

CSIT5300: Advanced Database Systems

Keys, SQL, and Views CMPSCI 645

Integrity constraints, relationships. CS634 Lecture 2

Database Design CENG 351

Entity-Relationship Diagrams

Translation of ER-diagram into Relational Schema. Dr. Sunnie S. Chung CIS430/530

Introduction to Data Management. Lecture #3 (Conceptual DB Design) Instructor: Chen Li

e e Conceptual design begins with the collection of requirements and results needed from the database (ER Diag.)

SYLLABUS ADMIN DATABASE SYSTEMS I WEEK 2 THE ENTITY-RELATIONSHIP MODEL. Assignment #2 changed. A2Q1 moved to A3Q1

Transcription:

Data Modeling Yanlei Diao UMass Amherst Slides Courtesy of R. Ramakrishnan and J. Gehrke 1

Outline v Conceptual Design: ER Model v Relational Model v Logical Design: from ER to Relational 2

Conceptual Design v Start with application requirements v Use Entity-Relationship (ER) Model: Entities and relationships in the enterprise Integrity constraints (or business rules) that hold Pictorially represented by an ER diagram 3

ER Model Basics: Entities ssn name office v Entity: A real-world object. Described using a set of attributes. Employees v Entity Set: A collection of entities described by the same set of attributes. Domain of an attribute. Key of an entity set: minimum set of attributes that uniquely identify each entity in the set. 4

ER Model Basics: Relationships name since dname ssn office did budget Employees Works_In Departments v Relationship: Association between 2+ entities. E.g., Joe works in the accounting dept since 01/2008. v Relationship Set: Collection of similar relationships. 5

Ternary Relationships name since dname ssn office did budget Employees Works_In Departments address Locations capacity v A Works_In relationship involves: an employee a department a location 6

More on Relationships name ssn office Employees subordinate supervisor Reports_To v An entity set can participate in same relationship set, but in different roles. 7

Key Constraints v Works_In: an employee can work in many depts; a dept can have many employees many-to-many name ssn office Employees since Works_In did dname budget Departments v Manages: each dept is managed by at most one manager key constraint on Manages ( ) one-to-many name ssn office Employees since Manages dname did budget Departments 8

Participation Constraints name since dname ssn office did budget Employees Manages Departments Works_In since v Works_In: every employee works in at least one dept (or, every employee must work for some dept) Participation constraint on Works_in (denoted using a thick line) Participation of Employees in Works_In is total (vs. partial). v Key and participation constraints: exactly one 9

Weak Entities name ssn office cost pname age Employees Policy Dependents Does an entity set always have a key? v A weak entity can be identified uniquely only by considering the primary key of another (owner) entity. A weak entity must have an exactly one relationship with its owner. Notation: see the example. 10

ISA ( is a ) Hierarchies ssn name office Employees hourly_wages hours_worked ISA contractid Hourly_Emps Contract_Emps v It is sometimes natural to classify entities into subclasses. v Y ISA X: every Y entity is also considered to be an X entity. Y entity set inherits all attributes of X entity set. Y entity set has its own descriptive attributes. 11

Issues with ISA Hierarchies v Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? Allowed/disallowed v Covering constraints: Does every Employees entity have to be an Hourly_Emps or a Contract_Emps entity? Yes/no v Reasons for using ISA: Add descriptive attributes specific to a subclass. Identify entities that participate in a specific relationship. 12

Useful things to know about ER v When reading application requirements: Entities, attributes are often extracted from nouns Entities vs. attributes: should address be an attribute of Employees or an entity? Sub-structure? Participation in other relationships? Relationships are often from verbs v Software tools for ER modeling Word on both Mac and Windows Microsoft Visio for Windows OmniGraffle for Mac Draw.io (Google) 13

Outline v Conceptual Design: ER Model v Relational Model v Logical Design: from ER to Relational 14

Relational Model v A relational database is a set of relations. v A relation has: Schema : specifies (1) name of relation, (2) name and domain of each attribute. E.g. Students(sid:string, name:string, login:string, age:integer, gpa:real) Instance : a table with rows (tuples) and columns (attributes, fields). cardinality = #rows, degree / arity = #columns. v Relation is a set of tuples (in theory). All rows must be distinct, no duplicates. 15

Example Instance of Students Relation STUDENTS sid name login age gpa 53666 Jones jones@cs 18 3.4 53688 Smith smith@eecs 18 3.2 53650 Smith smith@math 19 3.8 v Cardinality = 3, degree = 5 v All rows are distinct. v Some columns of two rows can be the same. 16

Creating Relations in SQL v Create the Students relation Specify schema Domain constraint: type of each attribute later enforced by the DBMS upon tuple insertion or update. CREATE TABLE Students (sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, REAL); gpa CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2) ); 17

Destroying and Altering Relations DROP TABLE Students; v Destroy the relation Students. ALTER TABLE Students ADD COLUMN firstyear: integer; v Alter the Students relation by adding a new field. Every tuple in the current instance is extended with a null value in the new field. 18

Integrity Constraints v Integrity Constraints (IC s): condition that must be true for any instance of the database. Domain constraint Primary key constraint Foreign key constraint Specified when schema is defined. 19

Primary Key Constraints v Key of a relation: minimum set of attributes that uniquely identify each entity. 1. No two tuples can have same values in all key fields. 2. This is not true for any subset of the key. Part 2 false? A superkey. If more than 1 key for a relation, candidate keys. One of candidate keys is chosen to be the primary key. v E.g., Students(sid, name, login, age, gpa) v E.g., Enrolled (sid, cid, grade) 20

Primary and Candidate Keys in SQL 1. Specify candidate keys using UNIQUE. 2. Choose one candidate key as the primary key. For a given student and course, there is a single grade. and no two students in a course receive the same grade. CREATE TABLE Enrolled ( sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid)); CREATE TABLE Enrolled ( sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid), UNIQUE (cid, grade) ); 21

Foreign Keys v Foreign key: set of attributes used to refer to the primary key of another relation. v E.g., Enrolled(sid: string, cid: string, grade: string): sid is a foreign key referring to sid in Students. Enrolled sid cid grade 53666 Carnatic101 C 53666 Reggae203 B 53650 Topology112 A 53666 History105 B Students sid name login age gpa 53666 Jones jones@cs 18 3.4 53688 Smith smith@eecs 18 3.2 53650 Smith smith@math 19 3.8 22

Foreign Keys in SQL CREATE TABLE Enrolled ( sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCES Students(sid) ); 23

Referential Integrity v Referential integrity: Any foreign key value must have a matching primary key value in the referenced reln. E.g., every sid value in Enrolled must appear in Students. No dangling references. v In contrast, consider links in HTML. Does referential integrity hold? 24

Enforcing Referential Integrity v What if an Enrolled tuple with a non-existent student id is inserted to the DB? Reject it! v What if a Students (referenced) tuple is deleted? CASCADE: delete all Enrolled tuples that refer to it. NO ACTION: disallow if the Students tuple is referred to. SET DEFAULT: set the foreign key to a default sid. SET NULL: set the foreign key to a special value null, denoting unknown or inapplicable. 25

Enforcing Referential Integrity v Updates to sid in Students are treated similarly. CASCADE: propagate the update to Enrolled NO ACTION: disallow the update in Students SET DEFAULT: set the sid in Enrolled to a default value SET NULL: set the sid in Enrolled to null 26

Referential Integrity in SQL CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCES Students(sid) ON DELETE NO ACTION ON UPDATE CASCADE); 27

Comments on Integrity Constraints v IC s are based on real-world business logic. Can check violation in a database instance, but can never infer an IC by looking at an instance. An IC is a statement about all possible instances! E.g., name of Students can be unique in an instance. v IC s are specified when defining the schema (CREATE TABLE). v DBMS later enforces IC s. Stored data is faithful to the real-world meaning. Avoids data entry errors, too! 28

Outline v Conceptual Design: ER Model v Relational Model v Logical Design: from ER to Relational 29

Logical DB Design: ER to Relational v An entity set is translated to a table. ssn name Employees office CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), office INTEGER, PRIMARY KEY (ssn)); 30

Relationship Sets to Tables name since dname ssn office did budget Employees Works_In Departments v Each relationship set is also translated to a table with all descriptive attributes, primary key of each related entity set as a foreign key. v All foreign keys form a superkey of the relation. CREATE TABLE Works_In ( ssn CHAR(11), did INTEGER, since DATE, PRIMARY KEY (ssn, did), FOREIGN KEY (ssn) REFERENCES Employees(ssn), FOREIGN KEY (did) REFERENCES Departments(did)); 31

Review: Key Constraints v Key constraint: Each dept is managed by at most one manager. name since dname ssn office did budget Employees Manages Departments 32

Translating ER Diagrams w. Key Constraints v A separate table for Manages: Borrow primary key from the entity with the key constraint. CREATE TABLE Manages( ssn CHAR(11), did INTEGER, since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees (ssn), FOREIGN KEY (did) REFERENCES Departments(did)); v Merge Manages into Departments: Merge the relationship into the entity with the key constraint. CREATE TABLE Dept_Mgr( did INTEGER, dname CHAR(20), budget REAL, ssn CHAR(11), since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees); 33

Review: Participation Constraints ssn name lot since did dname budget Employees Manages Departments Works_In since v Participation constraint: Every employee works in at least one dept. Each Dept has at least one employee. v Participation + Key constraints: Every department must have one manager. 34

Key and Participation Constraints v If we have key + participation constraints (exactly one): CREATE TABLE Dept_Mgr( did INTEGER, dname CHAR(20), budget REAL, ssn CHAR(11) NOT NULL, since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees(ssn) ON DELETE NO ACTION ON UPDATE CASCADE); v For participation constraints only, need to resort to assertions (dynamic checks in SQL). More in SQL lecture 35

Review: Weak Entities name ssn lot cost pname age Employees Policy Dependents v A weak entity can be identified uniquely only by considering the primary key of the owner entity. Must have an exactly one relationship with its owner. 36

Translating Weak Entity Sets v Merge weak entities and identifying relationships in one table. What is the primary key? What if the owner entity is deleted? CREATE TABLE Depndt_Policy ( pname CHAR(20), age INTEGER, cost REAL, ssn CHAR(11), PRIMARY KEY (ssn, pname, age), FOREIGN KEY (ssn) REFERENCES Employees(ssn) ON DELETE CASCADE ON UPDATE CASCADE); 37

Review: ISA Hierarchies ssn name office Employees hourly_wages hours_worked ISA contractid Hourly_Emps Contract_Emps 38

Translating ISA Hierarchies to Relations 1. Create tables for both parent and child entities Employees: (ssn, name, lot) Hourly_Emps: (ssn, hourly_wages, hours_worked) ssn is both primary and foreign key! Must delete Hourly_Emp if referenced Emp is deleted. 2. Create tables only for child entities Hourly_Emps: ( ssn, name, lot, hourly_wages, hours_worked). Each employee must be in one of these two subclasses. 3. Create a table only for the parent entity Which one is better for: covering constraint (Y/N), overlap constraint (Y/N)? 39