Introduction to Data Management. Lecture #3 (E-R Design, Cont'd.)

Introduction to Data Management
Lecture #3 (E-R Design, Cont'd.)
Instructor: Mike Carey, mjcarey@ics.uci.edu
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke

Announcements
- Reminders:
  - Sign up on Piazza! (About 10% still haven't done this...)
  - First real quiz in discussions this week! (Study! :-))
- HW #1 is in flight
  - Our startup: TrueOnlyUSNews.com
  - Developing your E-R models...
- Today: More on Conceptual DB Design
  - Advanced E-R features
  - Intro to the Relational model
- Any lingering Q's from last time?

ISA ("is a") Hierarchies
- As in Java or other PLs, ER attributes are inherited (including the key attribute).
- If we declare A ISA B, every A entity is also considered to be a B entity.
[Figure: ISA hierarchy with subclasses Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid).]
- Covering constraints: Must every Employees entity be either an Hourly_Emps or a Contract_Emps entity? (Yes or no)
  - Ex: Hourly_Emps AND Contract_Emps COVER Employees (pick 1 of 2 vs. 1 of 3)
- Overlap constraints: Can some Employees entity be an Hourly_Emps as well as a Contract_Emps entity? (Allowed or disallowed)
  - Ex: Hourly_Emps OVERLAPS Contract_Emps (else pick 1 of the 3 types)
- Reasons for using ISA:
  - To add descriptive attributes specific to a subclass.
  - To identify subclasses that participate in a relationship.
- Design: specialization (top-down), generalization (bottom-up)

Aggregation
- Used when we have to model a relationship involving (entity sets and) a relationship set.
- Aggregation allows us to treat a relationship set as an entity set for purposes of participating in (other) relationships.
[Figure: Monitors (until) relates employees to the aggregated Sponsors relationship (since) between Projects (pid, started_on, pbudget) and Departments (did, dname, budget).]
- Aggregation vs. ternary relationship:
  - Monitors is a distinct relationship; it even has its own attribute here.
  - Each sponsorship can be monitored by zero or more employees (as above).
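Returning to the covering and overlap constraints on the ISA hierarchy above: as a rough sketch, they can be read as simple set conditions. The entity identifiers below (e1 to e4) are made-up sample data, not from the slides.

```python
# Hypothetical sample data: employee ids and their subclass memberships.
employees = {"e1", "e2", "e3", "e4"}
hourly_emps = {"e1", "e2"}
contract_emps = {"e2", "e3"}

# Subclass entities must also be superclass entities (A ISA B).
is_valid_isa = hourly_emps <= employees and contract_emps <= employees

# Covering: does every Employees entity belong to at least one subclass?
covers = (hourly_emps | contract_emps) == employees

# Overlap: does some entity belong to both subclasses?
overlapping = hourly_emps & contract_emps
```

Here e4 is in neither subclass, so the subclasses do not cover Employees, and e2 is in both, so this instance is legal only if overlap is allowed.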

Additional Advanced ER Features
- Multi-valued (vs. single-valued) attributes (e.g., phone)
- Derived (vs. base/stored) attributes (e.g., age, derived from bdate)
- Composite (vs. atomic) attributes (e.g., address, composed of snum, street, city, zip)
NOTE: Can model (two of) these using additional entity and relationship types.

Conceptual Design Using the ER Model
- Design choices:
  - Should a given concept be modeled as an entity or an attribute?
  - Should a given concept be modeled as an entity or a relationship?
  - Characterizing relationships: Binary or ternary? Aggregation?
- Constraints in the ER Model:
  - A lot of data semantics can (and should) be captured.
  - But not all constraints can be captured by ER diagrams. (Ex: Department heads from earlier!)

Entity vs. Attribute
- Should address be an attribute of Employees or an entity (connected to Employees by a relationship)?
- Depends on how we want to use address information, the data semantics, and also the model features:
  - If we have several addresses per employee, address must be an entity if we stick only to basic E-R concepts (as attributes cannot be set-valued w/o advanced modeling goodies).
  - If the structure (city, street, etc.) is important, e.g., we want to retrieve employees in a given city, address must be modeled as an entity (since attribute values are atomic w/o advanced modeling goodies).
  - If the address itself is logically separate (e.g., the property that's located there) and refer-able, it's rightly an entity in any case!

Entity vs. Attribute (Cont'd.)
- Works_In4 does not allow an employee to work in a department for two or more periods. (Q: Why...?)
[Figure: Works_In4 (from, to) between Employees and Departments (did, dname, budget).]
- Similar to the problem of wanting to record several addresses for an employee: we want to record several values of the descriptive attributes for each instance of this relationship.
- Could be accomplished by having a multivalued relationship attribute.
[Figure: revised Works_In4 diagram with Period (from, to) and Departments (did, dname, budget).]

Entity vs. Relationship
- First ER diagram OK if a manager gets a separate discretionary budget for each dept.
- What if a manager gets a discretionary budget that covers all managed depts?
  - Redundancy: dbudget stored for each dept managed by manager.
  - Misleading: Suggests dbudget is associated with department-mgr combination.
[Figures: two Manages2 diagrams over Departments (did, dname, budget). The first has since and dbudget on Manages2; the second introduces Managers via ISA, with since and dbudget as in the slide.]
- Also note ISA and the relationship...

Binary vs. Ternary Relationships
- If each policy is owned by just 1 employee, with each dependent tied to their covering policy, the first diagram is inaccurate.
- Q: What are the additional constraints in the 2nd diagram? (And what else was wrong with the 1st diagram? :-))
[Figure, bad design: Covers relating Dependents (pname, age) and Policies (policyid, cost).]
[Figure, better design: Owner and Beneficiary relationships with Policies (policyid, cost) and Dependents (pname, age).]

Binary vs. Ternary Relationships (Cont'd.)
- Previous example illustrated a case when two binary relationships were better than one ternary relationship.
- An example in the other direction: a ternary relation Contracts relates entity sets Parts, Departments and Suppliers, and has descriptive attribute qty. No combination of binary relationships is an adequate substitute:
  - S "can-supply" P, D "needs" P, and D "deals-with" S does not imply that D has agreed to buy P from S.
  - And also, how/where else would we record qty?

Binary vs. Ternary Relationships (Cont'd.)
- Our example in the other direction: a ternary relation Contracts relates entity sets Parts, Departments and Suppliers, and has descriptive attribute qty:
[Figure: Contracts (qty) relating Departments (dno, dname, phone), Suppliers (sno, sname), and Parts (pno, pname, cost).]
(Observe: Prescriptions was similar)
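The claim that no combination of binary relationships substitutes for Contracts can be illustrated with a small sketch. The supplier, part, and department ids below are invented sample data; the code splits a ternary relation into three binary ones and shows that recombining them yields a contract that was never agreed.

```python
from itertools import product

# Hypothetical ternary Contracts instance: (supplier, part, department).
contracts = {
    ("S1", "P1", "D2"),
    ("S1", "P2", "D1"),
    ("S2", "P1", "D1"),
}

# The three binary relationships implied by the contracts.
can_supply = {(s, p) for s, p, d in contracts}
needs = {(d, p) for s, p, d in contracts}
deals_with = {(d, s) for s, p, d in contracts}

suppliers = {s for s, _, _ in contracts}
parts = {p for _, p, _ in contracts}
departments = {d for _, _, d in contracts}

# Every (s, p, d) combination consistent with all three binary relationships.
rejoined = {
    (s, p, d)
    for s, p, d in product(suppliers, parts, departments)
    if (s, p) in can_supply and (d, p) in needs and (d, s) in deals_with
}

# Triples the binary relationships admit but that are not actual contracts.
spurious = rejoined - contracts
```

With this data, S1 can supply P1, D1 needs P1, and D1 deals with S1, yet D1 never agreed to buy P1 from S1, so that triple shows up as spurious.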

Database Design Process (Flow)
- Requirements Gathering (interviews)
- Conceptual Design (using ER)
- Platform Choice (which DBMS?)
- Logical Design (for target data model)
- Physical Design (for target DBMS, workload)
- Implement (and test, of course :-))
(Expect backtracking, iteration, and also incremental adjustments; we will actually be giving you a bit of practice with that last one in the next few HW assignments...! :-))

Summary of Conceptual Design
- Conceptual design follows requirements analysis
  - Yields a high-level description of data to be stored
- ER model popular for conceptual design
  - Constructs are expressive, close to the way people think about their applications.
- Basic constructs: entities, relationships, and attributes (of entities and relationships).
- Additionally: weak entities, ISA hierarchies, aggregation, and multi-valued, composite and/or derived attributes.
- Note: Many variations on the ER model (and many notations in use as well), and also UML...

Summary of ER (Cont'd.)
- Several kinds of integrity constraints can be expressed in the ER model: cardinality constraints, participation constraints, also overlap/covering constraints for ISA hierarchies. Some foreign key constraints are also implicit in the definition of a relationship set (more about key constraints will be coming soon).
  - Some constraints (notably, functional dependencies) cannot be expressed in the ER model.
  - Constraints play an important role in determining the best database design for an enterprise.

Summary of ER (Cont'd.)
- ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise. Common choices include:
  - Entity vs. attribute, entity vs. relationship, binary or n-ary relationship, whether or not to use an ISA hierarchy, and whether or not to use aggregation.
- Ensuring good database design: The resulting relational schema should be analyzed and refined further. For this, FD information and normalization techniques are especially useful (coming soon).

Relational Database: Definitions
- Relational database: a set of relations
- Relation: consists of 2 parts:
  - Instance: a table, with rows and columns. #Rows = cardinality, #fields = degree or arity.
  - Schema: specifies name of relation, plus name and type of each column.
    E.g. Students(sid: string, name: string, login: string, age: integer, gpa: real).
- Can think of a relation as a set of rows or tuples (i.e., all rows are distinct) in the pure relational model (vs. the reality of SQL :-))

Example Instance of Students Relation

  sid    name   login       age  gpa
  53666  Jones  jones@cs    18   3.4
  53688  Smith  smith@eecs  18   3.2
  53650  Smith  smith@math  19   3.8

- Cardinality = 3, degree = 5, all rows distinct
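As a rough illustration of these definitions, the sketch below loads the example Students instance into an in-memory SQLite database via Python's standard sqlite3 module and computes its cardinality and degree. Using SQLite here is an assumption made only to keep the example runnable; the slides do not name a DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    " sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)"
)
rows = [
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
]
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", rows)

cur = conn.execute("SELECT * FROM Students")
degree = len(cur.description)      # number of fields (columns)
instance = cur.fetchall()
cardinality = len(instance)        # number of rows (tuples)

# In the pure relational model a relation is a set, so all rows are distinct.
all_distinct = len(set(instance)) == cardinality
```

Note that SQL tables are not sets by default: nothing in this schema would stop the same row from being inserted twice, which is the "reality of SQL" caveat.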

Relational Query Languages
- A major strength of the relational model: supports simple, powerful querying of data.
- Queries can be written intuitively, and the DBMS is responsible for efficient evaluation.
  - The key: precise (and set-based) semantics for relational queries.
  - Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change.

The SQL Query Language (Preview)
- Developed by IBM (System R) in the 1970s
- Need for a standard, since it is used by many vendors (Oracle, IBM, Microsoft, ...)
- ANSI/ISO Standards:
  - SQL-86
  - SQL-89 (minor revision)
  - SQL-92 (major revision, very widely supported)
  - SQL-99 (major extensions, current standard)

The SQL Query Language (Preview)
- To find all 18 year old students, we can write:

    SELECT *
    FROM Students S
    WHERE S.age = 18

  sid    name   login       age  gpa
  53666  Jones  jones@cs    18   3.4
  53688  Smith  smith@eecs  18   3.2

- To find just names and logins, replace the first line:

    SELECT S.name, S.login

Querying Multiple Relations
- What does the following query compute?

    SELECT S.name, E.cid
    FROM Students S, Enrolled E
    WHERE S.sid = E.sid AND E.grade = 'A'

- Given the following instances of Students and Enrolled:

  Students
  sid    name   login       age  gpa
  53666  Jones  jones@cs    18   3.4
  53688  Smith  smith@eecs  18   3.2
  53650  Smith  smith@math  19   3.8

  Enrolled
  sid    cid          grade
  53831  Carnatic101  C
  53831  Reggae203    B
  53650  Topology112  A
  53666  History105   B

- We will get:

  S.name  E.cid
  Smith   Topology112
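The two queries above can be tried out with a short script. This is a sketch that assumes SQLite (through Python's standard sqlite3 module) as the DBMS and adds an ORDER BY to the first query purely to make the row order predictable; neither choice comes from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    " sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)"
)
conn.execute("CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2))")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@eecs", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [
        ("53831", "Carnatic101", "C"),
        ("53831", "Reggae203", "B"),
        ("53650", "Topology112", "A"),
        ("53666", "History105", "B"),
    ],
)

# All 18 year old students (ORDER BY added only for a deterministic order).
eighteen = conn.execute(
    "SELECT * FROM Students S WHERE S.age = 18 ORDER BY S.sid"
).fetchall()

# Names of students paired with the courses in which they received an A.
a_grades = conn.execute(
    "SELECT S.name, E.cid FROM Students S, Enrolled E "
    "WHERE S.sid = E.sid AND E.grade = 'A'"
).fetchall()
```

The join keeps only Students/Enrolled row pairs with matching sid values, so the Enrolled rows for sid 53831 (who is not in Students) contribute nothing.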

Creating Relations in SQL
- Create the Students relation. Observe that the type (domain) of each field is specified and enforced by the DBMS whenever tuples are added or modified.

    CREATE TABLE Students
      (sid CHAR(20),
       name CHAR(20),
       login CHAR(10),
       age INTEGER,
       gpa REAL)

- As another example, the Enrolled table holds information about courses that students take.

    CREATE TABLE Enrolled
      (sid CHAR(20),
       cid CHAR(20),
       grade CHAR(2))

Destroying and Altering Relations

    DROP TABLE Students

- Destroys the relation Students. The schema information and the tuples are deleted.

    ALTER TABLE Students
      ADD COLUMN firstyear integer

- The schema of Students is altered by adding a new field; every tuple in the current instance is extended with a null value in the new field.
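A minimal sketch of the ALTER TABLE and DROP TABLE behavior described above, again assuming SQLite via Python's sqlite3 module (other systems may differ in syntax details and error reporting):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    " sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)"
)
conn.execute(
    "INSERT INTO Students VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4)"
)

# Add a new field; existing tuples are extended with NULL in that field.
conn.execute("ALTER TABLE Students ADD COLUMN firstyear INTEGER")
altered_row = conn.execute("SELECT sid, firstyear FROM Students").fetchone()

# Destroy the relation: both schema information and tuples are gone.
conn.execute("DROP TABLE Students")
try:
    conn.execute("SELECT * FROM Students")
    table_exists = True
except sqlite3.OperationalError:
    # SQLite reports a missing table as an OperationalError.
    table_exists = False
```

Python's sqlite3 module maps SQL NULL to None, so the new firstyear field reads back as None for the pre-existing tuple.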

Adding and Deleting Tuples
- Can insert a single tuple using:

    INSERT INTO Students (sid, name, login, age, gpa)
    VALUES (53688, 'Smith', 'smith@ee', 18, 3.2)

- Can delete all tuples satisfying some condition (e.g., name = Smith):

    DELETE
    FROM Students S
    WHERE S.name = 'Smith'

- Powerful variants of these commands are available; more later!

Integrity Constraints (ICs)
- IC: condition that must be true for any instance of the database; e.g., domain constraints.
  - ICs are specified when schema is defined.
  - ICs are checked when relations are modified.
- A legal instance of a relation is one that satisfies all specified ICs.
  - DBMS should not allow illegal instances.
- If the DBMS checks ICs, stored data is more faithful to real-world meaning.
  - Avoids data entry errors (centrally), too!
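The sketch below exercises INSERT and DELETE and shows a DBMS rejecting an illegal instance. It assumes SQLite via Python's sqlite3 module. The CHECK constraint on gpa is a hypothetical IC added for illustration (it is not part of the slides' schema), and the DELETE omits the table alias S for portability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    " sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER,"
    " gpa REAL CHECK (gpa BETWEEN 0.0 AND 4.0))"  # hypothetical IC
)
insert_sql = (
    "INSERT INTO Students (sid, name, login, age, gpa) VALUES (?, ?, ?, ?, ?)"
)
conn.executemany(
    insert_sql,
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@ee", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)

# Delete all tuples satisfying a condition; rowcount reports how many.
deleted = conn.execute("DELETE FROM Students WHERE name = 'Smith'").rowcount
remaining = [row[0] for row in conn.execute("SELECT name FROM Students")]

# An insert that would make the instance illegal is refused by the DBMS.
try:
    conn.execute(insert_sql, ("53700", "Lee", "lee@cs", 20, 5.0))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Because the check lives in the schema, every application that modifies Students gets the same protection, which is the "centrally" point made above.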

Primary Key Constraints
- A set of fields is a key for a relation if:
  1. No two distinct tuples can have same values in all key fields, and
  2. This is not true for any subset of the key.
  - Part 2 false? In that case, this is a superkey.
  - If there's > 1 key for a relation, one of the keys is chosen (by DBA) to be the primary key. The others are referred to as candidate keys.
- E.g., sid is a key for Students. (What about name?) The set {sid, gpa} is a superkey.

Primary and Candidate Keys in SQL
- Possibly many candidate keys (specified using UNIQUE), with one chosen as the primary key.
- "For a given student + course, there is a single grade."

    CREATE TABLE Enrolled
      (sid CHAR(20),
       cid CHAR(20),
       grade CHAR(2),
       PRIMARY KEY (sid, cid))

  vs. "Students can take only one course, and receive a single grade for that course; further, no two students in a course may ever receive the same grade."

    CREATE TABLE Enrolled
      (sid CHAR(20),
       cid CHAR(20),
       grade CHAR(2),
       PRIMARY KEY (sid),
       UNIQUE (cid, grade))

- Used carelessly, an IC can prevent the storage of database instances that arise in practice!
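To see how the two Enrolled declarations differ in practice, the following sketch creates both and tries the same inserts against each. It assumes SQLite via Python's sqlite3 module; the table names Enrolled_A and Enrolled_B and the helper try_insert are hypothetical, introduced only so both versions can coexist in one database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Version 1: one grade per (student, course) pair.
conn.execute(
    "CREATE TABLE Enrolled_A (sid CHAR(20), cid CHAR(20), grade CHAR(2),"
    " PRIMARY KEY (sid, cid))"
)
# Version 2: one course per student, and no repeated grade within a course.
conn.execute(
    "CREATE TABLE Enrolled_B (sid CHAR(20), cid CHAR(20), grade CHAR(2),"
    " PRIMARY KEY (sid), UNIQUE (cid, grade))"
)


def try_insert(table, row):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


rows = [
    ("53666", "Carnatic101", "C"),
    ("53666", "Reggae203", "B"),    # same student, second course
    ("53650", "Carnatic101", "C"),  # other student, same course and grade
]
accepted_a = [try_insert("Enrolled_A", row) for row in rows]
accepted_b = [try_insert("Enrolled_B", row) for row in rows]
```

Version 1 accepts all three rows, while version 2 rejects the second (duplicate sid) and the third (duplicate cid and grade), which is the kind of realistic instance a carelessly chosen IC rules out.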

Foreign Keys, Referential Integrity
- Foreign key: Set of fields in one relation used to refer to a tuple in another relation. (Must refer to the primary key of the other relation.) Like a "logical pointer".
- E.g., sid is a foreign key referring to Students:
  - Enrolled(sid: string, cid: string, grade: string)
  - If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references.

Foreign Keys in SQL
- Ex: Only students listed in the Students relation should be allowed to enroll for courses.

    CREATE TABLE Enrolled
      (sid CHAR(20),
       cid CHAR(20),
       grade CHAR(2),
       PRIMARY KEY (sid, cid),
       FOREIGN KEY (sid) REFERENCES Students)

  Enrolled
  sid    cid          grade
  53666  Carnatic101  C
  53666  Reggae203    B
  53650  Topology112  A
  53666  History105   B

  Students
  sid    name   login       age  gpa
  53666  Jones  jones@cs    18   3.4
  53688  Smith  smith@eecs  18   3.2
  53650  Smith  smith@math  19   3.8
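A small sketch of referential integrity being enforced, assuming SQLite via Python's sqlite3 module. Two SQLite-specific assumptions are involved: foreign key checking must be switched on with a PRAGMA for each connection, and Students is declared with sid as its primary key so that REFERENCES Students has a key to point at.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute(
    "CREATE TABLE Students ("
    " sid CHAR(20) PRIMARY KEY, name CHAR(20), login CHAR(10),"
    " age INTEGER, gpa REAL)"
)
conn.execute(
    "CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2),"
    " PRIMARY KEY (sid, cid),"
    " FOREIGN KEY (sid) REFERENCES Students)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)


def allowed(sql, params=()):
    """Return True if the statement succeeds, False if an IC rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


insert_sql = "INSERT INTO Enrolled VALUES (?, ?, ?)"
known_student = allowed(insert_sql, ("53666", "Carnatic101", "C"))
unknown_student = allowed(insert_sql, ("53831", "Reggae203", "B"))
# Deleting a student who is still referenced would leave a dangling reference.
delete_referenced = allowed("DELETE FROM Students WHERE sid = '53666'")
```

The enrollment for sid 53831 is refused because no such student exists, and the delete is refused because SQLite's default action for a referenced row is to block the change; other actions (such as cascading the delete) are possible but are not shown here.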